lucene pylucene

Lucene QueryParser interprets 'AND OR' as a command?


I am calling Lucene using the following code (PyLucene, to be precise):

analyzer = StandardAnalyzer(Version.LUCENE_30)
queryparser = QueryParser(Version.LUCENE_30, "text", analyzer)
query = queryparser.parse(queryparser.escape(querytext))

But consider the following value of querytext:

querytext = "THE FOOD WAS HONESTLY NOT WORTH THE PRICE. MUCH TOO PRICY WOULD NOT GO BACK AND OR RECOMMEND IT"

In that case, the "AND OR" trips up the query parser, even though I am using queryparser.escape. How do I avoid the following error message?

    Java stacktrace:
org.apache.lucene.queryParser.ParseException: Cannot parse 'THE FOOD WAS HONESTLY NOT WORTH THE PRICE. MUCH TOO PRICY WOULD NOT GO BACK AND OR RECOMMEND IT': Encountered " <OR> "OR "" at line 1, column 80.
Was expecting one of:
    <NOT> ...
    "+" ...
    "-" ...
    "(" ...
    "*" ...
    <QUOTED> ...
    <TERM> ...
    <PREFIXTERM> ...
    <WILDTERM> ...
    "[" ...
    "{" ...
    <NUMBER> ...
    <TERM> ...
    "*" ...

 at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:187)
     ....
 at org.apache.lucene.queryParser.QueryParser.generateParseException(QueryParser.java:1759)
 at org.apache.lucene.queryParser.QueryParser.jj_consume_token(QueryParser.java:1641)
 at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1268)
 at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1207)
 at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1167)
 at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:182)

Solution

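  • It's not just OR, it's the sequence AND OR. queryparser.escape only backslash-escapes the special syntax characters (such as +, -, (, ), : and *); it leaves the uppercase keywords AND, OR and NOT untouched, so the parser still sees two boolean operators in a row and fails on the second one.

    I use the following workaround, which lowercases the OR so that it is parsed as an ordinary term: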

    query = queryparser.parse(queryparser.escape(querytext.replace("AND OR", "AND or")))
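The same idea can be generalized so that it does not depend on the exact phrase "AND OR". What follows is only a minimal sketch: `neutralize_operators` is a hypothetical helper name, not part of Lucene or PyLucene, and it assumes that the query text is free-form user text in which AND, OR and NOT are never meant as operators. It relies on the query parser recognizing these keywords only in uppercase.

```python
import re

# Hypothetical helper, not part of Lucene or PyLucene.
# Matches the uppercase boolean keywords only when they stand alone as words,
# so words such as "ANDROID" or "ORDER" are left untouched.
_OPERATOR_RE = re.compile(r"\b(AND|OR|NOT)\b")

def neutralize_operators(text):
    """Lowercase standalone AND/OR/NOT so they are treated as plain terms."""
    return _OPERATOR_RE.sub(lambda m: m.group(1).lower(), text)

# Intended use with the parser from the question (assumed, not run here):
# query = queryparser.parse(queryparser.escape(neutralize_operators(querytext)))
```

Note that this also lowercases NOT, which the original one-line workaround leaves as an operator. With the original query text, "NOT WORTH" would otherwise be parsed as a prohibition on the term "worth", which is probably not what is wanted when the whole sentence is meant as plain text.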