Upweight or downweight metadata fields


Background

This article discusses how to upweight or downweight query terms that appear within specific metadata fields.

Process

Funnelback's ranking algorithm includes settings that control metadata weightings. These are configured with the sco and wmeta ranking options.

Scoring mode 2

Setting the -sco=2 ranking option allows specification of the metadata fields that will be considered as part of the ranking algorithm.

By default, link text, clicked queries and titles are included (-sco=2[k,K,t]). The list of metadata fields to use with sco=2 is defined within square brackets when setting the value.

E.g. -sco=2[k,K,t,customField1,customField2] tells Funnelback to apply scoring to the default fields as well as customField1 and customField2.

Metadata weighting

Once scoring mode 2 is enabled, a separate weighting can be assigned to each defined field using a corresponding wmeta ranking option.

The weighting for each field is a value between 0.0 and 1.0. A weighting of 0.5 is the default and a value >0.5 will apply an upweight. A value <0.5 will apply a downweight.

E.g. -wmeta.t=0.6 applies a slight upweight to the t metadata field, while -wmeta.customField1=0.2 applies a strong downweight to customField1.
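The weighting rules above can be sketched as a small Python helper that validates a weight and formats the matching option. This is only an illustration: wmeta_option and describe_weight are hypothetical names, not part of Funnelback, and the option format is taken from the examples in this article.

```python
def wmeta_option(field: str, weight: float) -> str:
    """Format a -wmeta ranking option for one metadata field.

    Hypothetical helper: it only builds the option string as shown in
    this article and rejects weights outside the 0.0 to 1.0 range.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError(f"weight for {field!r} must be between 0.0 and 1.0")
    return f"-wmeta.{field}={weight}"


def describe_weight(weight: float) -> str:
    """Classify a weight relative to the 0.5 default."""
    if weight > 0.5:
        return "upweight"
    if weight < 0.5:
        return "downweight"
    return "default"


print(wmeta_option("t", 0.6), describe_weight(0.6))
print(wmeta_option("customField1", 0.2), describe_weight(0.2))
```

Checking the range before the option is written into a configuration file catches typos such as 6 instead of 0.6 early.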

Example

Assume that the following metadata is mapped for a collection (in the metamap.cfg):

description,1,dc.description
author,0,dc.author
section,0,site.section
datePublished,0,dc.date.published
dateModified,0,date.modified
articleText,1,article.content
articleTitle,1,article.title
articleKeywords,1,article.subjects
articleAbstract,1,articleAbstract 

The following ranking options (set as part of the query_processor_options within collection.cfg) could be used to upweight the text within the articleTitle and articleAbstract metadata classes and downweight articleText:

-sco=2[articleText,articleTitle,articleAbstract] -wmeta.articleText=0.3 -wmeta.articleAbstract=0.75 -wmeta.articleTitle=1.0

This tells Funnelback to apply special metadata weightings to the articleText, articleTitle and articleAbstract fields (the -sco=2 parameter), then to apply a moderate downweight to articleText, a moderate upweight to articleAbstract and the maximum upweight to articleTitle.
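When several fields are weighted, it can help to generate the option string from a single mapping so that the field list in -sco=2 and the wmeta options stay in sync. The following is a minimal sketch under that assumption; build_ranking_options is a hypothetical helper, not a Funnelback tool, and it only assembles text in the format used in this article.

```python
from typing import Mapping


def build_ranking_options(weights: Mapping[str, float]) -> str:
    """Build a -sco=2[...] option plus one -wmeta option per field.

    Hypothetical helper: every field named in the mapping is listed in
    the sco field list and receives its own weighting.
    """
    for field, weight in weights.items():
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"weight for {field!r} must be between 0.0 and 1.0")
    # Fields appear in the order they were given in the mapping.
    sco = "-sco=2[" + ",".join(weights) + "]"
    wmeta = [f"-wmeta.{field}={weight}" for field, weight in weights.items()]
    return " ".join([sco, *wmeta])


options = build_ranking_options(
    {"articleText": 0.3, "articleAbstract": 0.75, "articleTitle": 1.0}
)
print(options)
```

The resulting string could then be pasted into the query processor options for the collection. Note that this sketch lists only the fields in the mapping, so the default fields (k, K and t) would need to be added to the mapping explicitly if they should still be scored.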
