
Accent-insensitive search in Grails


How can I make full-text search with the Grails Searchable plugin accent-insensitive?


Solution

  • I solved this problem with the help of Peter Ledbrook's post, although some extra effort was needed:

    The latest Searchable plugin uses Lucene 2.4.1, which does not contain ASCIIFoldingFilter (available since 2.9.0), and ISOLatin1AccentFilter doesn't support many languages, so I created a custom filter for stripping accents:

    
    
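        import java.text.Normalizer
        import org.apache.lucene.analysis.Token
        import org.apache.lucene.analysis.TokenFilter
        import org.apache.lucene.analysis.TokenStream

        // Token filter that removes diacritics from every token (Lucene 2.4 Token API).
        class StripAccentsFilter extends TokenFilter {

            StripAccentsFilter(TokenStream input) {
                super(input)
            }

            public final Token next(Token reusableToken) {
                assert reusableToken != null

                Token nextToken = input.next(reusableToken)
                if (nextToken == null) {
                    return null   // end of the token stream
                }
                // Read the text with term() rather than termBuffer(): the internal
                // buffer can be longer than the term itself (see termLength()).
                // NFD splits each accented character into a base letter plus
                // combining marks, which the regex then removes.
                String stripped = Normalizer.normalize(nextToken.term(), Normalizer.Form.NFD)
                        .replaceAll("\\p{InCombiningDiacriticalMarks}+", "")
                nextToken.setTermBuffer(stripped)
                return nextToken
            }
        }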
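
    The normalization step inside the filter can be tried on its own, without Lucene or Compass. The following is a minimal sketch in plain Java; the class name AccentStripper and the method stripAccents are made up for illustration and are not part of the plugin, Compass, or Lucene:

    ```java
    import java.text.Normalizer;

    public class AccentStripper {

        // Hypothetical helper: the same two steps the filter applies to each token's text.
        public static String stripAccents(String text) {
            // NFD splits each accented character into a base letter plus combining marks.
            String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
            // Remove the combining marks (Unicode block U+0300 to U+036F), keep the base letters.
            return decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
        }

        public static void main(String[] args) {
            // "crème brûlée", written with escapes so the source file encoding does not matter.
            System.out.println(stripAccents("cr\u00e8me br\u00fbl\u00e9e")); // prints "creme brulee"
            // "Łódź": ó and ź are stripped, but Ł has no canonical decomposition and stays as it is.
            System.out.println(stripAccents("\u0141\u00f3d\u017a"));
        }
    }
    ```

    Letters such as Ł, ø or đ are separate code points rather than a base letter plus a combining mark, so this approach leaves them unchanged. If they matter for your data, they would need an explicit mapping on top of the normalization.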
    
    

    The filter also needs a corresponding filter provider:

    
    
        import org.apache.lucene.analysis.TokenStream
        import org.compass.core.config.CompassSettings
        import org.compass.core.lucene.engine.analyzer.LuceneAnalyzerTokenFilterProvider
    
        // Lets Compass wrap an analyzer's token stream in StripAccentsFilter.
        class StripAccentsFilterProvider implements LuceneAnalyzerTokenFilterProvider {

            public void configure(CompassSettings paramCompassSettings) {
                // No settings are needed for this filter.
            }

            public TokenStream createTokenFilter(TokenStream paramTokenStream) {
                return new StripAccentsFilter(paramTokenStream)
            }

        }
    
    

    Now all you need to do is register this filter provider in the configuration of the Searchable plugin (grails-app/conf/Searchable.groovy), for both the default (indexing) analyzer and the search analyzer:

        compassSettings = [
            'compass.engine.analyzer.default.filters': 'stripAccents',
            'compass.engine.analyzer.search.filters': 'stripAccents',
            'compass.engine.analyzerfilter.stripAccents.type': 'StripAccentsFilterProvider'
        ]