
Lucene - Highlighter throwing exception when using * on search


I'm using Lucene 4.6.1 and Highlighter 4.6.0. Indexing works properly, so I'll only show my search code:

    // ... code to get all the fields' names/values, numDocs, etc.
    // ...
    // Create Query and search 

    try {
        TopScoreDocCollector collector = TopScoreDocCollector.create(numDocs, true);
        Query q = MultiFieldQueryParser.parse(Version.LUCENE_40, searchTerms, fields, analyzer);
        searcher.search(q, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;
        Highlighter highlighter = new Highlighter(new QueryScorer(q));
        highlighter.setTextFragmenter(new SimpleFragmenter(40));
        int maxNumFragmentsRequired = 2;

        System.out.println("Found " + hits.length + " hits.");
        for(int i=0;i<hits.length;++i) {
            int docId = hits[i].doc;
            Document d = searcher.doc(docId);
            for(int j=0; j<fields.length; j++) {
                if(d.get(fields[j]) != null) {
                    String fieldText = d.get(fields[j]).trim();
                    TokenStream tokenStream = analyzer.tokenStream(fields[j], new StringReader(fieldText));

                    // Create String without the highlighted term
                    String unhighlighted = (i + 1) + ". "+fields[j]+ " "+ d.get(fields[j]).trim() + "<br>";

                    // Create the highlighted term
                    String highlighted = highlighter.getBestFragments(tokenStream, fieldText, maxNumFragmentsRequired, "...");

                    // If the highlighted term really exists
                    if(!highlighted.equals("")) 
                        unhighlighted = (i + 1) + ". "+fields[j]+ " "+ highlighted + "<br>";

                    response += unhighlighted;
                }
            }
        }

    } catch (Exception e) {
        System.out.println("Error searching " + searchTerm + " : " + e.getMessage());
    }

    System.out.println(response);
}

For example, my index contains many documents named "Process 001", "Process 002", "Process 003" and so on. If I search for Process, I get all the Process documents back (this works perfectly). The problem appears when I search with a wildcard such as proc* or pr*. This is the error:

    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/lucene/queries/CommonTermsQuery
        at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:149)
        at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:99)
        at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(WeightedSpanTermExtractor.java:474)
        at org.apache.lucene.search.highlight.QueryScorer.initExtractor(QueryScorer.java:217)
        at org.apache.lucene.search.highlight.QueryScorer.init(QueryScorer.java:186)
        at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:197)
        at org.apache.lucene.search.highlight.Highlighter.getBestFragments(Highlighter.java:156)
        at org.apache.lucene.search.highlight.Highlighter.getBestFragments(Highlighter.java:460)
        at freedom.lucene.service.LuceneTestApplication.search(LuceneTestApplication.java:406)
        at freedom.lucene.service.LuceneTestApplication.main(LuceneTestApplication.java:75)
    Caused by: java.lang.ClassNotFoundException: org.apache.lucene.queries.CommonTermsQuery
        at java.net.URLClassLoader$1.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        ... 10 more

The exception occurs on this line:

    String highlighted = highlighter.getBestFragments(tokenStream, fieldText, maxNumFragmentsRequired, "...");

If I remove the Highlighter code, searches using * work properly.


Solution

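  • Add lucene-queries-4.6.1.jar to your classpath.

    The stack trace shows the highlighter's WeightedSpanTermExtractor referencing org.apache.lucene.queries.CommonTermsQuery. That class ships in the lucene-queries jar, not in the lucene-core jar, so without it the class cannot be loaded and the highlighter fails with NoClassDefFoundError.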
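
    To confirm that the jar is actually being picked up at runtime, you can probe for the class before running the search. This is only a minimal sketch: ClasspathCheck and isClassPresent are hypothetical names made up for illustration, not part of Lucene. The two class names being probed are taken from the stack trace above.

```java
public class ClasspathCheck {

    // Hypothetical helper: returns true if the named class can be located
    // by the class loader that loaded this class.
    public static boolean isClassPresent(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised by a missing dependency
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "org.apache.lucene.search.highlight.Highlighter", // from the lucene-highlighter jar
            "org.apache.lucene.queries.CommonTermsQuery"      // from the lucene-queries jar
        };
        for (String name : required) {
            System.out.println(name + " -> " + (isClassPresent(name) ? "found" : "MISSING"));
        }
    }
}
```

    Note also that NoClassDefFoundError is an Error, not an Exception, so the catch (Exception e) block in the question does not catch it. That is why the stack trace reaches main instead of printing the "Error searching" message.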