coldfusion lucene cfsearch

cfsearch - Error executing query : org.apache.lucene.queryParser.ParseException: Cannot parse : Lexical error


I've got a basic cfsearch that works fine, but certain search strings occasionally break it, for example one with an unmatched double quote. Such a string results in an error like:

Error executing query : org.apache.lucene.queryParser.ParseException: Cannot parse '"my search string': Lexical error at line 1, column 32. Encountered: <EOF> after : "\"my search string"

I was thinking of stripping out those characters, but a search term with a matched pair of double quotes, i.e. "my search string", is valid and should keep working. Is there a preferred way to prepare a string for cfsearch?

So, in the example of:

"my search string

it would strip out the first ". But if the search term was:

"my search string"

it would be left alone. Any ideas? Are there any other characters that can cause an error? For example, a hacker tried this:

XyOk,'.](.]]]'

Which caused an error.


Solution

  • Use the VerityClean UDF from CFLib to sanitize the Verity/Lucene search parameter. (NOTE: Add :, ^ and * to the pipe-delimited reBadChars variable so they will be stripped for Lucene.)

    http://www.cflib.org/udf/verityClean
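
For illustration, the behaviour asked for in the question (drop an unmatched quote, keep a balanced pair, strip other parser syntax) can be sketched in plain Java, the language Lucene itself is written in. This is not the VerityClean UDF and not a Lucene API: the class name SearchTermCleaner, the method clean, and the exact character list are assumptions made for the example, and the same logic would have to be ported to CFML or loaded as a Java class before passing the result to cfsearch.

```java
// Hypothetical helper: not part of ColdFusion, Lucene, or CFLib.
final class SearchTermCleaner {

    // Single characters the Lucene query parser treats as syntax.
    // The double quote is deliberately left out and handled separately
    // so that a balanced "phrase query" survives. (Assumed list; adjust
    // it to the Lucene version in use.)
    private static final String SPECIAL = "+-!(){}[]^~*?:\\&|";

    private SearchTermCleaner() {
    }

    static String clean(String raw) {
        if (raw == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(raw.length());
        int quotes = 0;
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '"') {
                quotes++;
            }
            // Replace syntax characters with a space rather than deleting
            // them, so "foo-bar" becomes two words instead of "foobar".
            sb.append(SPECIAL.indexOf(c) >= 0 ? ' ' : c);
        }
        String cleaned = sb.toString();
        // An odd number of quotes means at least one is unmatched;
        // the simplest safe choice is to drop all of them.
        if (quotes % 2 != 0) {
            cleaned = cleaned.replace('"', ' ');
        }
        // Collapse the whitespace left behind by the replacements.
        return cleaned.trim().replaceAll("\\s+", " ");
    }
}
```

With this sketch, "my search string (opening quote only) loses its quote, while "my search string" is returned unchanged. The result can be empty once everything is stripped, so the caller should check for that before running the search.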