
Slow Lucene.Net search performance


I'm facing slow search performance with Lucene.Net (used through NHibernate.Search, but that shouldn't matter).

Luke toolbox overview:

The index directory is about 200 MB.

Query (using org.apache.lucene.analysis.SimpleAnalyzer)

Title:lapsa~0.5 Abstract:lapsa~0.5 Content:lapsa~0.5 Location:lapsa~0.5 Author:lapsa~0.5

takes ~60,000 ms on average.


I suspect I'm missing something important. Any ideas what's wrong? This can't be normal.


I tried to check and fix the index with Luke. I had to tick "Don't open IndexReader (when opening corrupted index)", otherwise the Check Index tool would not show up.

Results of checking:

BAD: missingSegments

Diagnostic output:

ERROR: could not read any segments file in directory
java.io.FileNotFoundException: no segments* file found in org.apache.lucene.store.FSDirectory@D:\Temp\Index: files:
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:655)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:538)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:306)
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:340)
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:319)
    at org.getopt.luke.Luke$6.run(Unknown Source)

Tried to press Fix Index. Got this:

ERROR during Fix Index: java.lang.NullPointerException
    at org.apache.lucene.index.CheckIndex.fixIndex(CheckIndex.java:781)
    at org.getopt.luke.Luke$7.run(Unknown Source)


Solution

  • Sounds to me like you've got a corrupted index. Are there any files in your D:\Temp\Index folder? I assume there must be or searching wouldn't work at all... What version of Lucene.Net are you using? Earlier versions used to corrupt the index for me at the drop of a hat, but the later versions seem to be much better in that respect.

    If you can't figure it out, you might just have to rebuild the index from scratch.
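
To answer the "are there any files in that folder?" question quickly, something like the following might help. It is only a rough sketch in plain Python, not anything from Lucene.Net or Luke: the function names `list_index_files` and `find_segments_files` are made up for illustration, and the check just looks for file names starting with `segments`, which is the `segments*` pattern the error message complains about. It does not validate the index in any way. The nothing-after-`files:` part of the Luke output suggests the directory looked empty to it, so it is worth confirming that you are pointing both Luke and your application at the same path.

```python
from pathlib import Path


def list_index_files(index_dir):
    """Return the sorted names of regular files in index_dir ([] if it is not a directory)."""
    path = Path(index_dir)
    if not path.is_dir():
        return []
    return sorted(p.name for p in path.iterdir() if p.is_file())


def find_segments_files(index_dir):
    """Return the file names matching the 'segments*' pattern from the error message."""
    return [name for name in list_index_files(index_dir) if name.startswith("segments")]


if __name__ == "__main__":
    import sys

    # The default path is the one from the question; pass another path as the first argument.
    index_dir = sys.argv[1] if len(sys.argv) > 1 else r"D:\Temp\Index"
    files = list_index_files(index_dir)
    segments = find_segments_files(index_dir)
    print(f"{len(files)} file(s) found in {index_dir}")
    print("segments* files:", ", ".join(segments) if segments else "none")
```

If this reports no `segments*` files while searches still return results, the application is probably reading a different directory than the one opened in Luke.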