solr lucene chef-infra knife

Which special characters need escaping in a Solr query?


Update: I think this question is about Solr syntax in general, not Chef in particular. I ran into it while working with Chef, but I presume anyone working with Solr will hit the same issue.


I'm working on an application that communicates with the Chef server's search API to find particular nodes.

Based on the Chef documentation at http://docs.opscode.com/essentials_search.html#special-characters, it seems that a number of special characters need to be escaped.

Note: I'm only concerned with exact-match patterns, not wildcards. I realize that some of these characters are special precisely because they act as wildcards.

Here's the list at the time of this writing, as copied from the URL above:

+  -  &&  ||  !  ( )  { }  [ ]  ^  "  ~  *  ?  :  \

When I try various knife search commands with these characters, however, I see inconsistent behaviour.

For the following examples, I set up a node tagged with +&|!(){}[]^"~*?:\ (written as a JSON string: "+&|!(){}[]^\"~*?:\\").

These commands were run from a Linux box, in a bash shell:

$ knife search node 'tags:+&|!(){}[]^"~*?:\'
ERROR: knife search failed: invalid search query: 'tags:+&|!(){}[]^"~*?:\'

That behaved as expected, since nothing was escaped. Now, I escape everything with a single \ as the docs suggest:

$ knife search node 'tags:\+\&\|\!\(\)\{\}\[\]\^\"\~\*\?\:\\'
ERROR: knife search failed: invalid search query: 'tags:\+\&\|\!\(\)\{\}\[\]\^\"\~\*\?\:\\'

Strange.

Can anyone shed some light on this, and maybe suggest a query that's capable of matching that tag?

It's obviously unlikely that anyone will ever have an attribute containing all those special characters, but I'd like to understand better how the special characters should be escaped.

Thanks!


Solution

  • You need to follow the Lucene/Solr query parser rules for escaping special characters: http://lucene.apache.org/core/6_5_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Escaping_Special_Characters
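
For applications that build the query string themselves, the escaping rule from the linked Lucene page can be sketched as a small helper. This is only an illustration: `escape_lucene_term` and `build_knife_argv` are hypothetical names, the character set is the list quoted in the question plus `/` (which the Lucene 6.x page also lists), and none of it has been checked against a Chef server, whose search API may accept a narrower syntax than Lucene itself.

```python
import re

# Characters the Lucene classic query parser treats as special.
# Assumption: the list quoted in the question, plus "/".
# "&" and "|" are escaped individually, so "&&" becomes "\&\&".
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|])')


def escape_lucene_term(term: str) -> str:
    """Prefix every special character in `term` with a backslash."""
    return _SPECIAL.sub(r'\\\1', term)


def build_knife_argv(field: str, value: str) -> list:
    """Build an argument list for `knife search node` (hypothetical helper).

    Passing a list to subprocess.run() bypasses the shell entirely,
    so no extra layer of shell quoting is needed on top of the
    Lucene escaping.
    """
    return ["knife", "search", "node", field + ":" + escape_lucene_term(value)]


if __name__ == "__main__":
    print(escape_lucene_term("web+db"))
    print(build_knife_argv("tags", "role:web"))
```

Keeping the Lucene escaping separate from shell quoting makes it easier to see which layer rejects a query. In the bash examples above, the single quotes already pass every backslash through unchanged, so the "invalid search query" error comes from the server-side parser rather than from the shell.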