Index tools for URL / document number conversion

What are these for?

padre includes a number of tools that can be used to look up and convert between URLs and document numbers within Funnelback indexes.

The URL is the document's URL as stored within the Funnelback index. This is the URL that was used to access the document after any redirects were processed, so it may differ from the URL originally requested.

The document number identifies the document within the Funnelback index. It depends on the order in which documents are stored, so it will change from update to update.

Get a document number from a URL (get_docnum_from_url)

  • Prints the document number (docnum) for a given URL to standard output.
  • Prints "notfound" if the URL is not found.

Usage

$SEARCH_HOME/bin/get_docnum_from_url <index_stem> <url>

Parameters:

  • index_stem: the common prefix (including path) of the index files
  • url: the URL to look up the document number for
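As a rough illustration of handling the "notfound" output in a script, the sketch below wraps the tool in a small shell function. The function name lookup_docnum and the example index stem in the comment are assumptions made for illustration; they are not part of Funnelback. The sketch checks only the printed output, because the tool's exit status is not described here.

```shell
# lookup_docnum is a hypothetical helper, not a Funnelback command.
# It prints the docnum for a URL, or returns 1 if the tool prints "notfound".
lookup_docnum() {
    index_stem=$1
    url=$2
    # Capture whatever the tool writes to standard output.
    docnum=$("$SEARCH_HOME/bin/get_docnum_from_url" "$index_stem" "$url")
    if [ "$docnum" = "notfound" ]; then
        echo "No document found in the index for: $url" >&2
        return 1
    fi
    printf '%s\n' "$docnum"
}

# Example call (the index stem below is an assumed location; adjust to your install):
#   lookup_docnum "$SEARCH_HOME/data/<collection>/live/idx/index" "http://example.com/"
```

Because the stored URL reflects any redirects, a lookup may need the final URL rather than the one originally requested.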

Get a URL from a document number (get_url_from_docnum)

Usage

$SEARCH_HOME/bin/get_url_from_docnum <index_stem> <doc_num>

Parameters:

  • index_stem: the common prefix (including path) of the index files
  • doc_num: the document number to look up the URL for
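To inspect several documents at once, the tool can be called in a loop. The following is a minimal sketch, assuming a hypothetical helper named list_urls that takes a first and last document number chosen by the caller; the valid range of document numbers for a given index is not covered here.

```shell
# list_urls is a hypothetical helper, not a Funnelback command.
# It prints "docnum<TAB>url" for each document number from first to last.
list_urls() {
    index_stem=$1
    docnum=$2
    last=$3
    while [ "$docnum" -le "$last" ]; do
        # One tool invocation per document number.
        url=$("$SEARCH_HOME/bin/get_url_from_docnum" "$index_stem" "$docnum")
        printf '%s\t%s\n' "$docnum" "$url"
        docnum=$((docnum + 1))
    done
}

# Example call (the index stem below is an assumed location; adjust to your install):
#   list_urls "$SEARCH_HOME/data/<collection>/live/idx/index" 0 9
```

Since document numbers change from update to update, output like this is only meaningful for the index it was produced from.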

Get a URL from a component/document pair (get_url_from_component_document_pair)

  • Works on meta collections only; it does not work on non-meta collections.

Usage

$SEARCH_HOME/bin/get_url_from_component_document_pair <index_stem> <component_number> <doc_number>

Parameters:

  • index_stem: the common prefix (including path) of the meta collection index files
  • component_number: the component number (from the index.sdinfo file) of the appropriate component index
  • doc_number: the document number to look up the URL for
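
When several component/document pairs need resolving, they can be fed to the tool one at a time. This sketch assumes a hypothetical helper named resolve_pairs that reads pairs written as component:document (a format invented for this example) from standard input.

```shell
# resolve_pairs is a hypothetical helper, not a Funnelback command.
# It reads "component:document" lines from stdin and prints "pair<TAB>url".
resolve_pairs() {
    index_stem=$1
    while IFS=: read -r component doc; do
        # Skip blank lines.
        [ -n "$component" ] || continue
        # Redirect the tool's stdin so it cannot consume the remaining pairs.
        url=$("$SEARCH_HOME/bin/get_url_from_component_document_pair" \
            "$index_stem" "$component" "$doc" </dev/null)
        printf '%s:%s\t%s\n' "$component" "$doc" "$url"
    done
}

# Example call (the index stem below is an assumed location for a meta collection):
#   printf '0:5\n1:7\n' | resolve_pairs "$SEARCH_HOME/data/<meta_collection>/live/idx/index"
```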