indexing, lucene, information-retrieval

Why not use IndexWriter to build an index in Lucene?


I couldn't find any Lucene forums, so this was the only relevant place to ask this question. Hope for the best.

These are the steps of indexing in Lucene given in our syllabus:

Figure1

Figure2

I understand the second step clearly, but I don't understand the first and third steps. They are not explained clearly in the figure, in my opinion.

Can you clear my confusion?

Plus, the sources that I refer don't even mention it like this; they explain it differently. I'm not sure where this was copied from.

What are we doing in the first step versus the third step, as written in the figure text?

Why was indexwriter created first and not used later? From what I've gathered, you can also use IndexWriter to add, remove, or update entries in the index, so we could just use it for that purpose. What are they doing in that figure?

This material was originally written by an anonymous author, so I can't ask anyone.


Solution

  • "the sources that I refer don't even mention it like this"

    There can be more than one way to do things in Lucene. For example, the official documentation includes a basic demo which uses an IndexWriterConfig instead. See line 129 of the indexer demo source code.


    "Why was indexwriter created first and not used later?"

    It looks as if something is left unexplained, or is explained elsewhere: the final step, which can be either of the following:

    1. Something like indexWriter.addDocument(doc); to add a document to a newly created index. See line 271 of the above-mentioned demo.

    2. Something like indexWriter.updateDocument([term goes here], doc); to update an existing document, based on an identifier for the specific doc (the "term"). See line 277 of the above-mentioned demo.

    Either way, now you see the document you just created being added to the index using the index writer you previously created.
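
To make the three steps concrete, here is a minimal sketch of how they might fit together. It is not the code from your syllabus or from the demo. The class name IndexingSketch, the method buildIndex, and the field names "id" and "contents" are hypothetical, and it assumes a recent Lucene release (8.x or 9.x) on the classpath.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Hypothetical class name, used only for this illustration.
public class IndexingSketch {

    // Builds a tiny index in indexDir and returns the number of live documents.
    static int buildIndex(Path indexDir) throws IOException {
        try (Directory dir = FSDirectory.open(indexDir);
             StandardAnalyzer analyzer = new StandardAnalyzer()) {

            // Step 1: create the IndexWriter (here configured via IndexWriterConfig).
            IndexWriterConfig config = new IndexWriterConfig(analyzer);
            config.setOpenMode(IndexWriterConfig.OpenMode.CREATE); // start from an empty index

            try (IndexWriter writer = new IndexWriter(dir, config)) {
                // Step 2: create a Document and give it fields.
                // "id" and "contents" are assumed field names.
                Document doc = new Document();
                doc.add(new StringField("id", "doc-1", Field.Store.YES));
                doc.add(new TextField("contents", "hello lucene", Field.Store.NO));

                // Step 3: hand the document to the writer created in step 1.
                writer.addDocument(doc);

                // Alternative to addDocument: replace whatever matches the term.
                Document revised = new Document();
                revised.add(new StringField("id", "doc-1", Field.Store.YES));
                revised.add(new TextField("contents", "hello again, lucene", Field.Store.NO));
                writer.updateDocument(new Term("id", "doc-1"), revised);
            } // closing the writer commits the pending changes

            // Read the index back to see how many live documents it holds.
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                return reader.numDocs();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path indexDir = Files.createTempDirectory("lucene-sketch");
        System.out.println("Live documents: " + buildIndex(indexDir));
    }
}
```

The point of the sketch is the ordering: the writer is created first because it is the object that later receives the document. Because updateDocument deletes any document matching the term before adding the new one, the index ends up holding a single live document rather than two.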


    Give it a try. If you get stuck, you can ask a specific question - but the chances are it may have already been asked and answered here.