I have been tasked with implementing search functionality on a website containing roughly 300 dynamic pages. Using Zend Search Lucene (a lifesaver), I have accomplished that. But now there is an issue with moving everything into production. The website is hosted on a shared server with a maximum execution time of 30 seconds, which is about a quarter of the time my indexing script needs to run.
The indexing script is split into the following steps: (1) create all the documents, (2) add these documents to the index, and finally (3) commit. As I understand it, once you commit, the index is overwritten with the new documents.
So, my question is the following: is it possible to commit to the index without overwriting everything? For example, I would like to run 4 scripts separately, one after the other. Each script would perform the same steps, but only for a specific subset of the documents. This would keep each script within the 30-second execution limit. At the end, the index would contain all the documents.
If this is not possible, what would be an alternative solution?
Yes, you can update individual documents in a Lucene index; an update is really a delete followed by a re-add. You will need a unique, permanent ID for each document so that the old copy can be found and deleted. After an update, you will need to open a new IndexReader to pick up the updated documents.
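To make the batching idea concrete, here is a minimal, language-neutral sketch in Python. It is not the Zend Search Lucene API: `ToyIndex`, `batches`, and `index_batch` are hypothetical names, and a plain in-memory list stands in for the on-disk index. It only illustrates the pattern of splitting the pages into chunks and doing a delete-then-add keyed on a unique ID, so that running the batches one after another (or re-running one) leaves exactly one copy of each document.

```python
from typing import Iterator, List, Tuple

Page = Tuple[str, str]  # (unique permanent ID, page text)


class ToyIndex:
    """In-memory stand-in for a search index (hypothetical, not a real library)."""

    def __init__(self) -> None:
        self.docs: List[Page] = []

    def delete(self, doc_id: str) -> None:
        # Remove every stored copy carrying this ID (there may be none).
        self.docs = [doc for doc in self.docs if doc[0] != doc_id]

    def add(self, doc_id: str, text: str) -> None:
        self.docs.append((doc_id, text))

    def update(self, doc_id: str, text: str) -> None:
        # An "update" is a delete followed by a re-add.
        self.delete(doc_id)
        self.add(doc_id, text)


def batches(pages: List[Page], size: int) -> Iterator[List[Page]]:
    """Yield consecutive chunks of at most `size` pages."""
    for start in range(0, len(pages), size):
        yield pages[start:start + size]


def index_batch(index: ToyIndex, pages: List[Page]) -> None:
    """Index one chunk; documents from earlier chunks are left untouched."""
    for doc_id, text in pages:
        index.update(doc_id, text)


if __name__ == "__main__":
    pages = [(f"page-{n}", f"content of page {n}") for n in range(300)]
    index = ToyIndex()
    for chunk in batches(pages, 75):   # 4 runs of 75 pages each
        index_batch(index, chunk)
    index_batch(index, pages[:75])     # re-running a batch adds no duplicates
    print(len(index.docs))
```

In a real deployment, each chunk would be its own script run or HTTP request against the same on-disk index, opened rather than recreated each time. With Zend Search Lucene, that presumably means calling `Zend_Search_Lucene::open()` in the later runs instead of `Zend_Search_Lucene::create()`, since creating a new index is what replaces the existing one.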