Padre binaries: padre-cw (Padre check words)

padre-cw

Useful for:

  • Comparing indexes
  • Showing which docids match a term
  • Checking indexes

Usage

Print version information

Usage: ./padre-cw -v

Compare two indexes

Usage: ./padre-cw stem1 stem2 [-io]

The -io option ignores differences in offsets into the .idx file.

$ ./padre-cw coll1/live/idx/index/index coll2/live/idx/index/index
Default compression scheme for this version is ELIAS
Data Structures written by FUNNELBACK_PADRE_10.1.1.54
doctable_fieldwidths: 47 5 3 5 5 4 2 5 5 3 1 1 8 0
./padre-cw: FUNNELBACK_PADRE_11.0.0.29
Index name: coll1/live/idx/index/index
Sorted: No
Position info: Yes
Case info: No
Vbyte Compression: Yes
Dictionary(coll1/live/idx/index/index): 29214 words, 255417 chars
Index (coll1/live/idx/index/index): 2340374 u_ints
./padre-cw: Comparing coll1/live/idx/index/index with coll2/live/idx/index/index
Dictionary(coll2/live/idx/index/index): 29214 words, 255417 chars
Index (coll2/live/idx/index/index): 2340374 u_ints
Comparing 29214 dictionary entries
A total of 0 differences encountered.
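
When a comparison is run from a script, the result can be read from the summary line shown above. The following is a minimal Python sketch, not part of padre-cw itself. The helper names compare_argv and count_differences are hypothetical, and the sketch assumes the summary line always has the wording seen in the sample output.

```python
import re

# Summary line printed at the end of a comparison, as in the sample output.
# Accepting the singular "difference" is an assumption; only the plural
# form appears in the sample.
SUMMARY = re.compile(r"A total of (\d+) differences? encountered")


def compare_argv(stem1, stem2, ignore_offsets=False, binary="./padre-cw"):
    """Build the argument list for a comparison (hypothetical helper)."""
    argv = [binary, stem1, stem2]
    if ignore_offsets:
        argv.append("-io")  # ignore differences in offsets into .idx
    return argv


def count_differences(output):
    """Return the difference count reported in comparison output."""
    match = SUMMARY.search(output)
    if match is None:
        raise ValueError("no comparison summary line found")
    return int(match.group(1))


sample = """Comparing 29214 dictionary entries
A total of 0 differences encountered.
"""
print(compare_argv("coll1/live/idx/index", "coll2/live/idx/index", True))
print(count_differences(sample))
```

To use this against a real run, the list from compare_argv can be passed to subprocess.run with capture_output=True and text=True. Whether the summary is written to stdout or stderr is not stated here, so it is safest to search both.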

Show postings

Usage: ./padre-cw stem1 -show term

Shows the postings for term. Where applicable, it also shows the term before it and the term after it.

$ ./padre-cw /opt/funnelback/...../index -show lear
Default compression scheme for this version is ELIAS
Data Structures written by FUNNELBACK_PADRE_11.0.0.2
doctable_fieldwidths: 47 5 3 5 5 4 2 5 5 3 1 1 8 0
./padre-cw: FUNNELBACK_PADRE_11.0.0.29
Index name: /opt/funnelback/data/Se2-ContentOptimiser/live/idx/index
Sorted: No
Position info: Yes
Case info: No
Vbyte Compression: Yes
Dictionary(/opt/funnelback/data/Se2-ContentOptimiser/live/idx/index): 29214 words, 255417 chars
Index (/opt/funnelback/data/Se2-ContentOptimiser/live/idx/index): 2340374 u_ints
Showing lear entries in /opt/funnelback/data/Se2-ContentOptimiser/live/idx/index.idx
Dictionary entry 14702, freq = 985, idx_off = 1080364
had the following entries (docnum, wordpos):
0 ( 0, 12) ( 0, 43) ( 0, 1652) ( 1, 10) ( 1, 41)
5 ( 1, 1523) ( 2, 9) ( 2, 40) ( 2, 1218) ( 3, 11)
....
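
To get the list of docids that match a term, the (docnum, wordpos) pairs in the output above can be collected and reduced to distinct document numbers. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, the number at the start of each row is assumed to be a running entry count and is ignored, and the pair format is inferred only from the sample output.

```python
import re

# A posting is printed as "( docnum, wordpos)". The separator may be a comma
# or plain whitespace, so a pair printed without its comma is still accepted.
POSTING = re.compile(r"\(\s*(\d+)(?:\s*,\s*|\s+)(\d+)\s*\)")


def parse_postings(output):
    """Return every (docnum, wordpos) pair found in -show output."""
    return [(int(doc), int(pos)) for doc, pos in POSTING.findall(output)]


def matching_docnums(output):
    """Return the distinct document numbers, in ascending order."""
    return sorted({doc for doc, _ in parse_postings(output)})


sample = """had the following entries (docnum, wordpos):
0 ( 0, 12) ( 0, 43) ( 0, 1652) ( 1, 10) ( 1, 41)
5 ( 1, 1523) ( 2, 9) ( 2, 40) ( 2, 1218) ( 3, 11)
"""
print(parse_postings(sample)[:3])
print(matching_docnums(sample))
```

The header text "(docnum, wordpos)" is not matched because the pattern requires digits inside the parentheses.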

Check index files

Usage: ./padre-cw stem -check

Checks the index files for the given index stem.

$ ./padre-cw /opt/..../index -check
Default compression scheme for this version is ELIAS
Data Structures written by FUNNELBACK_PADRE_10.1.1.54
doctable_fieldwidths: 47 5 3 5 5 4 2 5 5 3 1 1 8 0
./padre-cw: FUNNELBACK_PADRE_11.0.0.29
Index name: /opt/funnelback/data/test-shakespeare/live/idx/index
Sorted: No
Position info: Yes
Case info: No
Vbyte Compression: Yes
Dictionary(/opt/..../index): 29214 words, 255417 chars
Index (/opt/..../index): 2340374 u_ints
Doc table (/opt/..../index): 868 docs, 55664 chars in urls.
/opt/..../index.bnams: 66965 chars
./padre-cw: Checking doc table
Bad dates encountered: 0
Doctable test PASSED.
./padre-cw: Checking 29214 entries in /opt/..../index (868 docs)
0 0 h(str = 0, freq = 53, offset = 0, df = 25)
1 0 v(str = 4, freq = 14, offset = 32, df = 13)
2 01 d(str = 8, freq = 1690, offset = 40, df = 794)
....
0 errors encountered in scan of dct and idx. Maxdoc = 867(0)
Dictionary/index test PASSED.
*** ALL TESTS PASSED ***
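
For automated checks, the result lines in the output above can be tested directly. The following is a minimal Python sketch. The function names are hypothetical, and it assumes the banner and the "test PASSED." lines are always worded as in the sample. How a failing check is reported is not shown here, so the sketch only detects success.

```python
BANNER = "*** ALL TESTS PASSED ***"
PASS_SUFFIX = " test PASSED."


def check_passed(output):
    """Return True if the overall success banner appears on its own line."""
    return any(line.strip() == BANNER for line in output.splitlines())


def passed_tests(output):
    """Return the names of the individual tests reported as passed."""
    names = []
    for line in output.splitlines():
        line = line.strip()
        if line.endswith(PASS_SUFFIX):
            names.append(line[: -len(PASS_SUFFIX)])
    return names


sample = """Bad dates encountered: 0
Doctable test PASSED.
0 errors encountered in scan of dct and idx. Maxdoc = 867(0)
Dictionary/index test PASSED.
*** ALL TESTS PASSED ***
"""
print(check_passed(sample))
print(passed_tests(sample))
```

Treating a missing banner as a failure is the conservative choice: truncated or unexpected output is then reported as not passed.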