search solr lucene n-gram

Can I protect short words from an n-gram filter in Solr?


I have seen this question about searching for short words in Solr. I am wondering if there is another possible solution to a similar problem. I am using the EdgeNGramFilter with a minGramSize of 3. I want to protect a specific set of shorter words (two-letter acronyms, mainly) from being ignored, but I'd like to keep that minGramSize of 3 for everything else. EdgeNGramFilter doesn't support a protected words list. Is there any filter or setting that makes this possible within a single field type, or will I need to write one?

Or, am I thinking about this the wrong way?


Solution

  • Thought hard about this one, but the answer in the other question you mention seems to be the only way. A protected-words option would be a useful feature for the EdgeNGramFilter, though.

    For now, you can add a copy field and put a KeepWordFilterFactory on it that keeps only the acronyms you need. Or, if your list of acronyms is not known a priori, use a LengthFilter on the copy field instead, so that it keeps only the short tokens.
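
    A rough sketch of what that copy-field setup might look like in the schema. The field and type names (title, title_short, text_edge, text_acronyms), the acronyms.txt file, and maxGramSize="15" are placeholders I made up for illustration; only minGramSize="3" comes from the question. Adjust the tokenizer and other filters to match your existing field type.

```xml
<!-- Main field type: edge n-grams of length 3 and up at index time.
     Names and maxGramSize are illustrative assumptions. -->
<fieldType name="text_edge" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Companion field type: no n-grams, keeps only the listed acronyms.
     acronyms.txt (hypothetical) holds one word per line. -->
<fieldType name="text_acronyms" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeepWordFilterFactory" words="acronyms.txt" ignoreCase="true"/>
    <!-- If the acronyms are not known in advance, replace the filter above with:
         <filter class="solr.LengthFilterFactory" min="1" max="2"/> -->
  </analyzer>
</fieldType>

<field name="title" type="text_edge" indexed="true" stored="true"/>
<field name="title_short" type="text_acronyms" indexed="true" stored="false"/>

<!-- Copy the raw input so each field runs its own analysis chain. -->
<copyField source="title" dest="title_short"/>
```

    You would then search both fields together, for example with the edismax parser and something like qf=title title_short, so that two-letter acronyms can match through title_short while everything else matches through the n-grammed field. Treat this as a starting point rather than a tested configuration.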