Best practices for meta collections

Share configuration files

It is recommended to share metamap.cfg and gscopes.cfg across all component collections within a meta collection.

Applying the same metamap and gscopes configuration to all component collections helps to avoid unforeseen overlap in definitions (e.g. assigning author metadata to a class in one collection and subject metadata to the same class in another collection). This matters because, when the collections are combined within a meta collection, the values in each metadata class (or gscope number) are merged across all components. Sharing the configuration also simplifies the setup of faceted navigation.

To share configuration files, symlinks can be used:

ln -s $SEARCH_HOME/conf/<component-collection>/metamap.cfg $SEARCH_HOME/conf/<meta-collection>/metamap.cfg
ln -s $SEARCH_HOME/conf/<component-collection>/gscopes.cfg $SEARCH_HOME/conf/<meta-collection>/gscopes.cfg

A drawback of symlinks is that if the collection containing the original file gets deleted, the symlinks will become invalid. An alternative is to use hard links instead.
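To see how the two link types behave when the original file disappears, the following sketch recreates the situation in a throwaway directory rather than a real Funnelback installation. The directory layout and the file names metamap-symlink.cfg and metamap-hardlink.cfg are illustrative assumptions only.

```shell
#!/bin/sh
# Demonstration in a scratch directory; nothing here touches $SEARCH_HOME.
demo=$(mktemp -d)
mkdir -p "$demo/conf/component" "$demo/conf/meta"
echo "example mapping" > "$demo/conf/component/metamap.cfg"

# Symbolic link: stores only the path to the original file.
ln -s "$demo/conf/component/metamap.cfg" "$demo/conf/meta/metamap-symlink.cfg"
# Hard link: a second directory entry for the same file contents.
ln "$demo/conf/component/metamap.cfg" "$demo/conf/meta/metamap-hardlink.cfg"

# Simulate deleting the component collection that holds the original file.
rm -r "$demo/conf/component"

# [ -e ] follows symlinks, so it fails for a link whose target is gone.
if [ -e "$demo/conf/meta/metamap-symlink.cfg" ]; then
  symlink_state="valid"
else
  symlink_state="broken"
fi
hardlink_content=$(cat "$demo/conf/meta/metamap-hardlink.cfg")

echo "symlink: $symlink_state"
echo "hard link content: $hardlink_content"

# Clean up the scratch directory.
rm -rf "$demo"
```

Hard links only work within a single filesystem, and a tool that saves by writing a new file and renaming it over the old one will silently separate the linked copies, so it is worth checking how your configuration files get edited before relying on them.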

Prevent access to component collections

For collections that are not supposed to be queried directly, prevent access with the settings below. The access_restriction value can also list any other internal range that needs access, for example to permit monitoring.

access_restriction=127.0.0.1
access_alternate=<meta-collection-id>

Disable analytics updates on meta components

Ensure analytics.scheduled_database_update=false is set on component collections that are not accessed directly. This prevents Funnelback from wasting time and resources building an analytics database that will never be used.
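One way to verify this across several components is a small audit script. The sketch below is an illustration, not a Funnelback tool: it assumes each collection's settings live in $SEARCH_HOME/conf/<collection>/collection.cfg, falls back to /opt/funnelback when SEARCH_HOME is unset (an assumed install path), and uses the hypothetical collection ids component-a and component-b.

```shell
#!/bin/sh
# has_setting FILE KEY VALUE
# Succeeds only if FILE exists and contains the exact line KEY=VALUE.
# Lines with surrounding whitespace or trailing comments will not match.
has_setting() {
  [ -f "$1" ] && grep -qxF -- "$2=$3" "$1"
}

# Hypothetical component collection ids; replace with your own.
components="component-a component-b"

for c in $components; do
  cfg="${SEARCH_HOME:-/opt/funnelback}/conf/$c/collection.cfg"
  if has_setting "$cfg" analytics.scheduled_database_update false; then
    echo "$c: analytics updates disabled"
  else
    echo "$c: analytics.scheduled_database_update=false not found"
  fi
done
```

The script only reports; it deliberately does not modify any configuration file.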

Avoid duplicating template files for different components

If multiple components of a meta collection can be queried independently and have a similar look-and-feel, avoid duplicating template files between those components. Duplicated template files make the code hard to read and maintain, because every change has to be replicated manually in each component that carries a copy of the form file, which is time-consuming and error-prone.

To avoid repetition, build the common form parts in a macro library and re-use the macros across all the components. Alternatively, use FreeMarker's <#include> to include a common form file into your collection or profile.

See also: sharing templates across profiles

Use a dedicated profile for system generated queries

Meta collection setups frequently involve system generated queries (either extra searches or Ajax searches). These queries often require a specific set of query processor options. It may also be desirable to keep them out of the main query log so that they do not pollute analytics.

It's recommended to use a dedicated profile for such queries. A dedicated profile has the following benefits:

  • A padre_opts.cfg file can be set up to apply specific query processor options, rather than having to inject them dynamically with a hook script (simpler, less maintenance)
  • The query processor options can include the -nolog=true option to prevent queries to the profile from being logged
  • If the -nolog=true option cannot be used, system generated queries will at least be confined to a single profile

Update workflow for meta collections


While meta collections have a workflow, they are not updated like other collections. They get updated when their configuration is saved in the Admin UI, or when a component collection updates ("meta dependencies" workflow). Because of this there are some caveats to their workflow.

Faceted navigation

The faceted navigation configuration is stored in the configuration folder for the collection: $SEARCH_HOME/conf/<collection>/faceted_navigation.cfg. For this configuration to take effect, either:

  • It needs to be copied to $SEARCH_HOME/data/<collection>/live/idx/faceted_navigation.xml via a post_index workflow command
  • Or faceted_navigation.config.location=conf needs to be set

The former ensures that the configuration file is copied whenever the collection updates (on save, or when the meta dependencies phase runs). The latter makes the Public UI layer use the faceted navigation configuration file from the configuration folder of the collection, rather than the one in the live index. In both cases, faceted navigation changes cannot be previewed, because they take effect immediately.
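The first option amounts to a single file copy. As a minimal sketch, the helper below performs that copy for one collection using the two paths given above. The function name copy_facet_config and its arguments are hypothetical, and how the script gets registered as a post_index workflow command is left to your collection configuration.

```shell
#!/bin/sh
# copy_facet_config SEARCH_HOME COLLECTION
# Copies the collection's faceted navigation configuration from its
# configuration folder into the live index folder.
copy_facet_config() {
  src="$1/conf/$2/faceted_navigation.cfg"
  dest="$1/data/$2/live/idx/faceted_navigation.xml"

  # Fail early with a clear message if there is nothing to copy.
  if [ ! -f "$src" ]; then
    echo "missing $src" >&2
    return 1
  fi

  # cp returns non-zero if the live index folder does not exist.
  cp "$src" "$dest"
}

# Example call (the collection id is a placeholder):
#   copy_facet_config "$SEARCH_HOME" "<collection>"
```

Keeping the copy in a function with an explicit existence check makes a failed workflow step easier to diagnose than a bare cp command.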

Meta dependencies

Consider disabling the meta dependencies phase on all but one component collection in a meta collection. The meta dependencies phase is responsible for generating spelling and query completion for the meta collection. This can take a long time for meta collections with large indexes, and it can also lock the indexes of component collections while it runs (which can cause problems with updates on other components).

It is probably acceptable for the meta collection's spelling and query completion to be updated once a day rather than on every component collection update, so linking the phase to the update of the most important component can improve overall update efficiency. The collection.cfg setting to disable the meta dependencies phase is:

meta_dependencies=false
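To check that only one component still runs the phase, the components without this setting can be counted. The sketch below is illustrative: the function name count_meta_dependency_runners is hypothetical, it assumes each component keeps its settings in <conf-dir>/<collection>/collection.cfg, and it assumes a component runs the phase whenever the exact line meta_dependencies=false is absent.

```shell
#!/bin/sh
# count_meta_dependency_runners CONF_DIR COMPONENT...
# Prints how many of the listed components do NOT contain the exact line
# meta_dependencies=false in their collection.cfg. A component whose
# collection.cfg is missing or unreadable is counted as well.
count_meta_dependency_runners() {
  conf_dir=$1
  shift
  runners=0
  for c in "$@"; do
    if ! grep -qxF 'meta_dependencies=false' "$conf_dir/$c/collection.cfg" 2>/dev/null; then
      runners=$((runners + 1))
    fi
  done
  echo "$runners"
}

# Example call (collection ids are placeholders):
#   count_meta_dependency_runners "$SEARCH_HOME/conf" component-a component-b
```

A result of 1 matches the recommendation above; anything higher means several components will each trigger the phase.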
