I would like to know how the thesaurus dictionaries are built. What is the relation between the .dat file and the index file (.idx)? For example, the relevant entry from the th_en_CA_v2.dat file looks like this:
ploy|2
(noun)|gambit|remark (generic term)|comment (generic term)
(noun)|gambit|stratagem|maneuver (generic term)|manoeuvre (generic term)|tactical maneuver (generic term)|tactical manoeuvre (generic term)
The relevant entry from the th_en_CA_v2.idx file:
ploy|12626348
What is that number (12626348) next to the word ploy?
It's the byte offset of the entry for ploy in the .dat file.
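
To make that concrete, here is a rough Python sketch of how such an index could be built and then used to jump straight to an entry. It is only an illustration, not the actual tool that generates the .idx file. It assumes the .dat layout is an encoding name on the first line, followed by entries of the form "word|N" with N meaning lines after each, which matches the ploy entry quoted above. SAMPLE_DAT, build_index and read_entry are made-up names, and the sample data is a tiny stand-in, so its offsets are much smaller than 12626348.

```python
import io

# Tiny made-up stand-in for a .dat file (assumed layout: an encoding line,
# then "word|N" followed by N meaning lines). Not the real th_en_CA_v2.dat.
SAMPLE_DAT = (
    b"UTF-8\n"
    b"gambit|1\n"
    b"(noun)|ploy|stratagem\n"
    b"ploy|2\n"
    b"(noun)|gambit|remark (generic term)|comment (generic term)\n"
    b"(noun)|gambit|stratagem|maneuver (generic term)\n"
)


def build_index(dat):
    """Scan a binary .dat stream and map each headword to its byte offset."""
    dat.seek(0)
    encoding = dat.readline().decode("ascii").strip()
    index = {}
    while True:
        offset = dat.tell()          # position where this entry starts
        header = dat.readline()
        if not header.strip():       # end of file (or trailing blank line)
            break
        word, count = header.decode(encoding).rstrip("\r\n").rsplit("|", 1)
        index[word] = offset
        for _ in range(int(count)):  # skip this entry's meaning lines
            dat.readline()
    return encoding, index


def read_entry(dat, offset, encoding):
    """Seek to a byte offset and read one entry: the headword and its meanings."""
    dat.seek(offset)
    header = dat.readline().decode(encoding).rstrip("\r\n")
    word, count = header.rsplit("|", 1)
    meanings = [
        dat.readline().decode(encoding).rstrip("\r\n").split("|")
        for _ in range(int(count))
    ]
    return word, meanings


dat = io.BytesIO(SAMPLE_DAT)
encoding, index = build_index(dat)
word, meanings = read_entry(dat, index["ploy"], encoding)
print(index["ploy"], word, len(meanings))
```

With a real dictionary you would open the .dat file in binary mode ("rb") so that seek() counts bytes, look the word up in the .idx file to get its offset, and then read the entry the same way read_entry does here.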