FTP sites
Background
Funnelback includes basic support for the crawling of FTP sites using the web crawler.
Note:
- Crawling of FTP sites does not support document level security (DLS).
Method
Create a web collection for the FTP site index
- Configure the start URL to be the FTP site's root page
- Configure include/exclude patterns as for a standard web collection
- Enable the FTP protocol by adding ftp to the crawler_protocols setting in collection.cfg.
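As a minimal sketch, the resulting collection.cfg line might look like the one below. It assumes the setting takes a comma-separated list and that the collection should keep http and https enabled alongside ftp. The key is written as it is named above; confirm the exact key and value format against the documentation for your Funnelback version.

```
# Assumption: comma-separated protocol list, with http and https kept alongside ftp
crawler_protocols=http,https,ftp
```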
Configure authentication
Set the FTP username and password configuration options in collection.cfg:
ftp_user=<FTP USERNAME>
ftp_passwd=<FTP PASSWORD>
Configure filetypes and download sizes
- The basic filetypes supported by web collections will be gathered. Additional filetypes can be added. See: Configure Funnelback to index additional file types
- Set download and parser sizes using the crawler.max_download_size and crawler.max_parse_size settings.
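For illustration, the two size settings could be added to collection.cfg as sketched below. The values are placeholders rather than recommendations, and they are assumed to be expressed in megabytes; check the units and defaults for your Funnelback version before relying on them.

```
# Illustrative values only, assumed to be in megabytes
crawler.max_download_size=50
crawler.max_parse_size=50
```

Raising the download size without also raising the parse size may mean that larger files are fetched but not fully parsed, so the two are usually adjusted together.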
Crawl the site.