elasticsearch lucene skip-lists

What is the Lucene skip list for?


I'm studying Lucene/Elasticsearch internals, especially the storage structures. When Lucene looks up a term to find its docIDs, I found that it goes through TermIndex -> TermDictionary -> Frequency (.doc) (version 7.2). The .doc file holds each term's posting list, which contains sorted docIDs. Besides the frequency data, the .doc file also contains a skip list.

My question is: what is the skip list for? It seems that all Lucene needs to look up is the docIDs of a term. There must be some reason for the skip list, but I don't know what it is.


Solution

  • According to a lecture by Doug Cutting (the creator of Lucene): pisa-dougcutter

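    Posting lists are searched by docID when the results of a query's individual conditions are merged. The skip list lets Lucene jump ahead to a target docID in a posting list without reading every entry before it.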
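
To illustrate the idea, here is a minimal in-memory sketch of intersecting two sorted posting lists with a single-level skip list. It is not Lucene's implementation (Lucene stores multi-level skip data on disk); the names `PostingCursor`, `advance`, `intersect` and the skip interval of 4 are assumptions made up for this example.

```python
class PostingCursor:
    """Cursor over a sorted posting list with a single-level skip list.

    Hypothetical sketch: one skip entry is kept for the last docID of
    every block of `skip_interval` entries.
    """

    def __init__(self, doc_ids, skip_interval=4):
        self.doc_ids = doc_ids
        # Each skip entry is the index of the last docID in its block.
        self.skips = list(range(skip_interval - 1, len(doc_ids), skip_interval))
        self.pos = 0        # current position in doc_ids
        self.next_skip = 0  # next skip entry not yet consumed

    def advance(self, target):
        """Move to the first docID >= target; return it, or None if exhausted."""
        # Jump over whole blocks whose last docID is still below the target.
        while (self.next_skip < len(self.skips)
               and self.doc_ids[self.skips[self.next_skip]] < target):
            self.pos = max(self.pos, self.skips[self.next_skip] + 1)
            self.next_skip += 1
        # Finish with a short linear scan inside the remaining block.
        while self.pos < len(self.doc_ids) and self.doc_ids[self.pos] < target:
            self.pos += 1
        return self.doc_ids[self.pos] if self.pos < len(self.doc_ids) else None


def intersect(a_ids, b_ids):
    """Return docIDs present in both sorted lists (an AND of two conditions)."""
    a, b = PostingCursor(a_ids), PostingCursor(b_ids)
    result = []
    doc_a, doc_b = a.advance(0), b.advance(0)
    while doc_a is not None and doc_b is not None:
        if doc_a == doc_b:
            result.append(doc_a)
            doc_a = a.advance(doc_a + 1)
            doc_b = b.advance(doc_b + 1)
        elif doc_a < doc_b:
            # The other list is ahead: skip forward to its docID.
            doc_a = a.advance(doc_b)
        else:
            doc_b = b.advance(doc_a)
    return result


common_term = list(range(0, 1000, 3))  # a frequent term: 334 postings
rare_term = [9, 300, 301, 999]         # a rare term: 4 postings
print(intersect(common_term, rare_term))
```

When one term is rare and the other is common, most of the long list is crossed block by block through the skip entries rather than docID by docID, which is the case the skip list is meant to speed up.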