Gather & Index
Latest in Gather & Index
- Controlling what is indexed
- Update failures
- Crawl WordPress sites
- Delimited text, XML and JSON data sources
- Designing and building a web search
- Configure Funnelback to index additional file types
- SharePoint
- Social media API keys
- Ignore canonical links in web pages
- Padre-fl: clear or set arbitrary document flags
- Configure filecopy collection log level
- Debug a crawl for missing documents
- WebDAV sites
- FTP sites
- Confluence
- Splitting XML files
- Read a file from a filter
- Post-gather filtering
- Crawling paginated XML or JSON
- Include binary documents in the search index
- Installing database drivers
- Workflow commands on Windows
- Manually building result collapsing
- Character encoding: custom workflow scripts
- Download configuration files via workflow
- Downloading configuration files via workflow
- Using the Funnelback index to generate gscope and kill configuration files
- Configuring instant updates
- Indexing and searching for hashtags and usertags
- Validating and concatenating external metadata
- Meta collections - where to make configuration changes
- How to configure a collection to crawl a WebDAV service
- JSON custom gatherer
- Meta collections - Incompatible indexes
- Troubleshooting XML
- Alternative approaches for indexing database-like content
- Character encoding: web crawler, WARC file
- Character encoding: validate content source
- Character encoding: filters
- Character encoding: Index file
- Character encoding: Indexer
- Debug Groovy filter
- Inspect the network traffic between you and your connection target
- Create a jsoup filter
- TRIMPush collection recommended server settings
- Enable DEBUG logging for the TRIMPush collection
- Yammer
- Custom collections - adding custom jar files
- Custom collections - updating admin interface status messages
- Custom gatherers - handling an update stop
- Using tnsnames.ora with Oracle database collections
- Funnelback robots support
- Geocoding Funnelback results
- Clear collection update locks
- Map TRIM IDs to TRIM numbers
- Index SQLite database
- Oracle RightNow
- Debug document level security (DLS)
- SugarCRM
- ManifoldCF