Padre binaries: build_autoc (Build auto-completion)
build_autoc
The build_autoc command can be used to generate query completion .autoc files for a collection.
It can be called to construct auto-completion from arbitrary sources. It is commonly used in a post-swap or post-index workflow to produce structured auto-completion from CSV data generated from the index.
Purpose
To build a query completion file (.autoc) from a list of input files.
Usage
/opt/funnelback/bin/build_autoc stem input_file ... [-collection name -profile name] [-partials] [-label_organics] [-debug]
-profile <name>
: generate scoped .autoc file for the specified profile. A previous run of build_autoc must have been called with -index.
-collection <name>
: generate scoped .autoc file for the specified collection. Both -profile and -collection need to be specified when generating scoped suggestions.
-partials
: allow multi-word organic suggestions (from index.suggest) to be triggered from trailing word sequences as well as from the full suggestion. For example, 'big fat cat' is triggered by 'fat cat' and 'cat' as well as by the full string.
-label_organics
: present a category label for all the organic completions. Deprecated.
-sample <val>
: sample the postings of suggestion terms in order to handle large collections. <val> ranges from 0 to 300. Only 1/val of the postings are used, which speeds up processing at the cost of basing the suggestions on a sample.
Note: /opt/funnelback/bin/build_autoc can build a single .autoc file from multiple input files of the same or different types. Files with a very simple format can be combined with hand-crafted files containing complex actions. Completion weights for a .suggest file are determined automatically, whereas in a CSV file they can be specified manually. Completion weights from Best Bets default to 100.
Example - blah.csv will be sorted and indexed into index.autoc. Input_file(s) must end in .csv, .suggest, or .cfg.
/opt/funnelback/bin/build_autoc index blah.csv
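Because input files must end in .csv, .suggest, or .cfg, it can be useful to check the file names before calling build_autoc. The following is a minimal sketch of such a check. The function name valid_autoc_inputs is hypothetical and is not part of Funnelback.

```shell
# Hypothetical helper (not shipped with Funnelback): succeeds only if at
# least one file is given and every file name ends in an extension that
# build_autoc accepts as input (.csv, .suggest or .cfg).
valid_autoc_inputs() {
    if [ "$#" -eq 0 ]; then
        echo "no input files given" >&2
        return 1
    fi
    for f in "$@"; do
        case "$f" in
            *.csv|*.suggest|*.cfg)
                ;;  # accepted extension, keep checking
            *)
                echo "unsupported input file: $f" >&2
                return 1
                ;;
        esac
    done
    return 0
}
```

For example, `valid_autoc_inputs blah.csv && /opt/funnelback/bin/build_autoc index blah.csv` would run the build only when the check passes.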
Example - generate an auto-completion dataset based on the collection's words:
$SEARCH_HOME/bin/build_autoc $SEARCH_HOME/data/$COLLECTION_NAME/$CURRENT_VIEW/idx/index $SEARCH_HOME/data/$COLLECTION_NAME/$CURRENT_VIEW/idx/index.suggest
Example - generate an auto-completion dataset based on two CSV files:
$SEARCH_HOME/bin/build_autoc $SEARCH_HOME/data/$COLLECTION_NAME/$CURRENT_VIEW/idx/index $SEARCH_HOME/conf/$COLLECTION_NAME/staff/auto-completion.csv $SEARCH_HOME/conf/$COLLECTION_NAME/courses/auto-completion.csv
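Since input files of different types can be combined in a single run, the two examples above could in principle be merged into one call that takes both index.suggest and a CSV file. The sketch below only prints such a command so that it can be reviewed before it is run. The default values for SEARCH_HOME, COLLECTION_NAME and CURRENT_VIEW, and the function name autoc_mixed_command, are assumptions for illustration.

```shell
# Assumed defaults; adjust to your installation. The collection name is hypothetical.
SEARCH_HOME="${SEARCH_HOME:-/opt/funnelback}"
COLLECTION_NAME="${COLLECTION_NAME:-example-collection}"
CURRENT_VIEW="${CURRENT_VIEW:-live}"

# Print (rather than run) a build_autoc command that mixes a .suggest
# input with a CSV input. Paths follow the examples in this document.
autoc_mixed_command() {
    idx="$SEARCH_HOME/data/$COLLECTION_NAME/$CURRENT_VIEW/idx"
    echo "$SEARCH_HOME/bin/build_autoc" \
         "$idx/index" \
         "$idx/index.suggest" \
         "$SEARCH_HOME/conf/$COLLECTION_NAME/staff/auto-completion.csv"
}

autoc_mixed_command
```

Once the printed command looks right for your collection, it can be run directly.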
Gotchas
When building auto-completion for a meta collection, use only the index.suggest file from the meta collection and not those from the component collections; otherwise you will get duplicate suggestions.
Make sure the .autoc file that is built is copied to the meta collection's live/idx folder and that this file is pushed to any remote query processor servers.
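The copy step can be sketched as a small shell function that refuses to copy a missing or empty file. The function name install_autoc is hypothetical, and the sketch covers only the local copy; how the file is pushed to remote query processor servers depends on your deployment and is not shown.

```shell
# Hypothetical helper (not shipped with Funnelback): copy a built .autoc
# file into a collection's idx directory.
#   $1 = path to the built .autoc file
#   $2 = target directory, e.g. the meta collection's live/idx folder
install_autoc() {
    src="$1"
    dest_dir="$2"
    # Refuse to publish a file that does not exist or is empty.
    if [ ! -s "$src" ]; then
        echo "missing or empty file: $src" >&2
        return 1
    fi
    if [ ! -d "$dest_dir" ]; then
        echo "no such directory: $dest_dir" >&2
        return 1
    fi
    # -p preserves the modification time and permissions of the source file.
    cp -p "$src" "$dest_dir/"
}
```

A call might look like `install_autoc /path/to/index.autoc "$SEARCH_HOME/data/$COLLECTION_NAME/live/idx"`, where the source path is a placeholder and COLLECTION_NAME is the meta collection.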