Padre binaries: padre-di (Padre document information)
padre-di
The purpose of padre-di is to display information relating to documents within an index.
You can use it to:
- Spot strange documents in the document table
- Print metadata information associated with a document
Usage
Check mode
Usage: $ padre-di /opt/funnelback/.../live/idx/index -check
No. docs: 868
0: [884, 127, 0, 62] ...lear.1.2.html
1: [871, 127, 0, 62] ...lear.2.2.html
2: [837, 127, 0, 62] ...lear.2.1.html
3: [717, 127, 0, 62] ...lear.4.3.html
4: [620, 127, 0, 62] ...lear.3.3.html
5: [555, 127, 0, 62] ...lear.5.2.html
6: [819, 127, 0, 62] ...lear.3.7.html
7: [890, 127, 0, 62] ...lear.3.4.html
8: [817, 127, 0, 62] ...lear.3.6.html
9: [981, 100, 0, 62] ...lear.5.3.html
Checking complete.
Content lengths: 446 - 1333 (not meaningful)
Onsite indegree: 63 - 127
Offsite indegree: 0 - 0
URL length (chars): 44 - 87
Notes:
- The four bracketed values are [content length, onsite indegree, offsite indegree, URL length (chars)]
- The last three values only go up to 127, due to an optimisation
- Documents 0-9 are always printed
- Documents with no content, no incoming links, or a URL length of zero are also printed
- Unlike padre-cw, this check cannot 'fail'
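To post-process check mode output, the per-document lines can be parsed into named fields. The following is a minimal sketch that assumes the line shape shown in the sample output above. The names `CheckRow`, `parse_check_line` and `looks_strange` are hypothetical and not part of padre-di, and "no incoming links" is interpreted here as both indegree values being zero.

```python
import re
from typing import NamedTuple, Optional

# Assumed line shape, taken from the sample output:
#   "<docnum>: [<content length>, <onsite indegree>, <offsite indegree>, <URL length>] <truncated URL>"
CHECK_LINE = re.compile(r"^(\d+): \[(\d+), (\d+), (\d+), (\d+)\] (.*)$")


class CheckRow(NamedTuple):
    docnum: int
    content_length: int
    onsite_indegree: int
    offsite_indegree: int
    url_length: int
    url_tail: str


def parse_check_line(line: str) -> Optional[CheckRow]:
    """Return a CheckRow for a per-document line, or None for any other line."""
    m = CHECK_LINE.match(line.strip())
    if m is None:
        return None
    docnum, content, onsite, offsite, url_len, tail = m.groups()
    return CheckRow(int(docnum), int(content), int(onsite),
                    int(offsite), int(url_len), tail)


def looks_strange(row: CheckRow) -> bool:
    """True for no content, no incoming links, or a URL length of zero.

    Assumption: "no incoming links" means onsite and offsite indegree are both 0.
    """
    no_links = row.onsite_indegree == 0 and row.offsite_indegree == 0
    return row.content_length == 0 or no_links or row.url_length == 0
```

Header and summary lines such as "No. docs: 868" or "Checking complete." do not match the pattern and are returned as None, so the whole output can be fed through `parse_check_line` line by line.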
Metadata inspection mode (URLs)
Displays the indexed metadata by URL for each document in the index.
Usage: $ padre-di /opt/funnelback/.../live/idx/index -meta
No. docs: 868
[86] test-data.funnelback.com/Shakespeare/
t: The Complete Works of William Shakespeare |Comedy|History|Tragedy|Poetry
[585] test-data.funnelback.com/Shakespeare/1henryiv/1henryiv.1.1.html
d: 1597-01-01|1590-01-01
t: SCENE I. London. The palace. |SCENE I. London. The palace.
[574] test-data.funnelback.com/Shakespeare/1henryiv/1henryiv.1.2.html
d: 1597-01-01|1590-01-01
t: SCENE II. London. An apartment of the Prince's. |SCENE II. London. An apartment of the Prince's.
[579] test-data.funnelback.com/Shakespeare/1henryiv/1henryiv.1.3.html
d: 1597-01-01|1590-01-01
....
Notes:
- Sorted lexicographically by the document URL
- Numeric metadata is printed in its string form (this only matters where the string and numeric forms differ)
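The -meta output can be read into a simple structure for further inspection. The sketch below assumes the layout of the sample above: a `[number] URL` line followed by `field: value` lines. It also assumes that `|` separates multiple values of one field. `parse_meta` is a hypothetical helper, not part of padre-di.

```python
import re
from typing import Any, Dict

# Assumed shapes, based on the sample -meta output:
#   "[585] test-data.funnelback.com/..."   starts a document
#   "d: 1597-01-01|1590-01-01"             a metadata field of that document
DOC_LINE = re.compile(r"^\[(\d+)\] (\S+)$")
FIELD_LINE = re.compile(r"^(\w+): (.*)$")


def parse_meta(text: str) -> Dict[str, Dict[str, Any]]:
    """Map each URL to its bracketed number and its metadata fields."""
    docs: Dict[str, Dict[str, Any]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.rstrip()
        doc = DOC_LINE.match(line)
        if doc:
            current = {"docnum": int(doc.group(1)), "fields": {}}
            docs[doc.group(2)] = current
            continue
        field = FIELD_LINE.match(line)
        if field and current is not None:
            # Assumption: '|' separates multiple values of one field.
            values = [v.strip() for v in field.group(2).split("|")]
            current["fields"][field.group(1)] = values
    return docs
```

Lines that match neither shape, such as the "No. docs:" header or the "...." elision, are skipped.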
Metadata inspection mode (document numbers)
Displays the indexed metadata by document number for each document in the index.
Usage: $ padre-di /opt/funnelback/.../live/idx/index -metad
No. docs: 868
[86] test-data.funnelback.com/Shakespeare/
[86] t: The Complete Works of William Shakespeare |Comedy|History|Tragedy|Poetry
[585] test-data.funnelback.com/Shakespeare/1henryiv/1henryiv.1.1.html
[585] d: 1597-01-01|1590-01-01
[585] t: SCENE I. London. The palace. |SCENE I. London. The palace.
[574] test-data.funnelback.com/Shakespeare/1henryiv/1henryiv.1.2.html
[574] d: 1597-01-01|1590-01-01
[574] t: SCENE II. London. An apartment of the Prince's. |SCENE II. London. An apartment of the Prince's.
[579] test-data.funnelback.com/Shakespeare/1henryiv/1henryiv.1.3.html
[579] d: 1597-01-01|1590-01-01
....
Usage (with a pattern): $ padre-di /opt/funnelback/.../live/idx/index -metad iv.1.1
No. docs: 868
[585] test-data.funnelback.com/Shakespeare/1henryiv/1henryiv.1.1.html
[585] d: 1597-01-01|1590-01-01
[585] t: SCENE I. London. The palace. |SCENE I. London. The palace.
[69] test-data.funnelback.com/Shakespeare/2henryiv/2henryiv.1.1.html
[69] d: 1598-01-01
[69] t: SCENE I. The same. |SCENE I. The same.
Field #docs #chars
d: 2 31
t: 2 96
Notes:
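- -metad is the same as -meta, but adds the docid to every line
- You can supply a pattern after -meta or -metad to restrict the output to matching documents
- The pattern is an exact substring match; for example, iv.1.1 matches both 1henryiv.1.1.html and 2henryiv.1.1.html
- Every -meta mode prints a per-field summary (Field, #docs, #chars) at the end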
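In the sample above, the summary figures are consistent with #docs being the number of listed documents that carry a field, and #chars being the total length of that field's printed values (for d: 21 + 10 = 31). This reading is inferred from the sample rather than taken from a documented definition. The sketch below recomputes the summary from -metad lines under that assumption; `summarise` is a hypothetical helper.

```python
import re
from collections import defaultdict
from typing import Dict, Tuple

# Assumed shape of a -metad field line: "[<docid>] <field>: <value>"
FIELD_LINE = re.compile(r"^\[(\d+)\] (\w+): (.*)$")


def summarise(metad_output: str) -> Dict[str, Tuple[int, int]]:
    """Return {field: (#docs, #chars)} recomputed from -metad field lines."""
    docs = defaultdict(set)   # field -> set of docids carrying it
    chars = defaultdict(int)  # field -> total length of printed values
    for raw in metad_output.splitlines():
        m = FIELD_LINE.match(raw.rstrip())
        if m is None:
            continue  # URL lines, headers and the summary itself are skipped
        docid, field, value = m.groups()
        docs[field].add(docid)
        chars[field] += len(value)
    return {field: (len(docs[field]), chars[field]) for field in docs}
```

Applied to the six document lines of the `iv.1.1` example, this yields 2 documents and 31 characters for d, and 2 documents and 96 characters for t, matching the printed summary.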