python elasticsearch elasticsearch-py

Capital letters in field causing query to fail in elasticsearch


I have a query which searches for transcripts containing some value. It should match every document that belongs to a specific tenant. The query works when the tenant ID in its term filter has no capital letters, but fails when the ID contains any.

Should I encode my tenant ID to a format which uses only lowercase letters? Or is there a way to make the term filter (or some similar filter) case sensitive? Or do I need to enable some option on the tenant_id field to preserve its case?

{
    'query': {
        'bool': {
            'must': [{'match': {'transcript': 'hello, world!'}}],
            'filter': [
                {'term': {'tenant_id': 'XRqtv5O91WEEt'}}
            ]
        }
    }
}

Solution

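  • Try match instead of term. A term query does not analyze its value, so it is compared verbatim against the indexed tokens. If tenant_id is a text field with the default analyzer, those tokens were lowercased at index time and a value with capital letters can never match. A match query analyzes the value the same way the field was analyzed: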

    {
        'query': {
            'bool': {
                'must': [{'match': {'transcript': 'hello, world!'}}],
                'filter': [
                    {'match': {'tenant_id': 'XRqtv5O91WEEt'}}
                      ^^^^^
                ]
            }
        }
    }
    

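    Alternatively, you can keep term but query the keyword subfield instead (if such a field exists in your mapping; the default dynamic mapping for strings creates one). The keyword subfield stores the value unanalyzed, so its case is preserved: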

    {
        'query': {
            'bool': {
                'must': [{'match': {'transcript': 'hello, world!'}}],
                'filter': [
                    {'term': {'tenant_id.keyword': 'XRqtv5O91WEEt'}}
                                        ^^^^^^^^
                ]
            }
        }
    }
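
To see why both suggestions work, here is a toy model in plain Python. It is not Elasticsearch code: `standard_analyze`, `keyword_analyze`, `build_index`, `term_query` and `match_query` are hypothetical helpers that only imitate the relevant behaviour, under the assumption that tenant_id is a text field using the default analyzer (which lowercases tokens at index time).

```python
import re


def standard_analyze(text):
    """Toy stand-in for the default analyzer: split into word tokens, lowercase them."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def keyword_analyze(text):
    """Toy stand-in for a keyword field: the whole value is one unmodified token."""
    return [text]


def build_index(docs, analyze):
    """Map each indexed token to the set of document ids that contain it."""
    index = {}
    for doc_id, value in docs.items():
        for token in analyze(value):
            index.setdefault(token, set()).add(doc_id)
    return index


def term_query(index, value):
    """term: look the value up exactly as given, with no analysis."""
    return index.get(value, set())


def match_query(index, value, analyze):
    """match: analyze the value first, then look up each resulting token."""
    hits = set()
    for token in analyze(value):
        hits |= index.get(token, set())
    return hits


# Hypothetical tenant ids: one with capital letters, one without.
docs = {1: "XRqtv5O91WEEt", 2: "abc123"}
text_index = build_index(docs, standard_analyze)      # imitates tenant_id
keyword_index = build_index(docs, keyword_analyze)    # imitates tenant_id.keyword

term_on_text = term_query(text_index, "XRqtv5O91WEEt")
match_on_text = match_query(text_index, "XRqtv5O91WEEt", standard_analyze)
term_on_keyword = term_query(keyword_index, "XRqtv5O91WEEt")

print(term_on_text)      # set(): the index only holds the lowercased token
print(match_on_text)     # {1}: the query value was lowercased too
print(term_on_keyword)   # {1}: the keyword index kept the original case
```

An all-lowercase id such as `abc123` is found by `term_query(text_index, "abc123")`, which mirrors the behaviour described in the question: the term filter only appears to work when the value happens to equal its own lowercased form.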