Tags: python, elasticsearch, kibana, elasticsearch-analyzers

Python Elasticsearch: Errors when trying to apply an analyzer to Index documents


I'm trying to apply an analyzer to my index, but whatever I do I get some sort of error. I've been researching this all day and can't get it to work. If I run the code as shown below, I get this error:

elasticsearch.exceptions.RequestError: RequestError(400, 'illegal_argument_exception', 'analyzer [{settings={analysis={analyzer={filter=[lowercase], type=custom, tokenizer=keyword}}}}] has not been configured in mappings')

If I add a "mappings" key inside the body= argument, wrapping the "properties" part, I get this error instead:

elasticsearch.exceptions.RequestError: RequestError(400, 'mapper_parsing_exception', 'Root mapping definition has unsupported parameters: [mappings : {properties={Name={analyzer={settings={analysis={analyzer={filter=[lowercase], type=custom, tokenizer=keyword}}}}

(The message continues in the same way through every field name in the body.)

def text_normalization():
    normalization_analyzer = {
        "settings": {
            "analysis": {
                "analyzer": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase"]
                }
            }
        }
    }

    elasticsearch.indices.put_mapping(
        index=index_name,
        body={
            "properties": {
                "Year of Birth": {
                    "type": "integer",
                },
                "Name": {
                    "type": "text",
                    "analyzer": normalization_analyzer
                },
                "Status": {
                    "type": "text",
                    "analyzer": normalization_analyzer
                },
                "Country": {
                    "type": "text",
                    "analyzer": normalization_analyzer
                },
                "Blood Type": {
                    "type": "text",
                    "analyzer": normalization_analyzer
                }
            }
        }
    )

    match_documents = elasticsearch.search(index=index_name, body={"query": {"match_all": {}}})
    print(match_documents)

Any help would be appreciated.


Solution

  • Your analyzer is simply missing a name. Specify it like this:

    normalization_analyzer = {
        "settings": {
            "analysis": {
                "analyzer": {
                    "normalization_analyzer": {                # <--- add this
                        "type": "custom",
                        "tokenizer": "keyword",
                        "filter": ["lowercase"]
                    }
                }
            }
        }
    }
    

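    You then need to install this analyzer on the index before any mapping can refer to it. Note that analysis settings can only be changed while the index is closed, so close the index, update the settings, and reopen it, using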

    elasticsearch.indices.put_settings(...)
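
As a rough sketch of that installation step, the helper below closes the index, sends the analysis settings, and reopens the index. The function names `build_analysis_settings` and `install_analyzer` are made up for illustration, and the code assumes a 7.x-style `elasticsearch-py` client that accepts `body=` (as the code in the question does); newer client versions may prefer different keyword arguments.

```python
def build_analysis_settings(analyzer_name="normalization_analyzer"):
    """Return analysis settings that define one named custom analyzer."""
    return {
        "analysis": {
            "analyzer": {
                # The key is the analyzer's name; mappings refer to it by this string.
                analyzer_name: {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase"],
                }
            }
        }
    }


def install_analyzer(es, index_name, analyzer_name="normalization_analyzer"):
    """Install the analyzer on an existing index (hypothetical helper).

    `es` is assumed to be an elasticsearch.Elasticsearch client instance.
    """
    # Analysis settings are static, so the index has to be closed first.
    es.indices.close(index=index_name)
    try:
        es.indices.put_settings(
            index=index_name,
            body=build_analysis_settings(analyzer_name),
        )
    finally:
        # Reopen the index even if the settings update failed.
        es.indices.open(index=index_name)
```

With the client from the question this would be called as `install_analyzer(elasticsearch, index_name)` before the `put_mapping` call.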
    

    Also, in the mapping you need to reference the analyzer by name. In other words, pass the analyzer name as a string instead of the settings dictionary:

        body={
            "properties": {
                "Year of Birth": {
                    "type": "integer",
                },
                "Name": {
                    "type": "text",
                    "analyzer": "normalization_analyzer"
                },
                "Status": {
                    "type": "text",
                    "analyzer": "normalization_analyzer"
                },
                "Country": {
                    "type": "text",
                    "analyzer": "normalization_analyzer"
                },
                "Blood Type": {
                    "type": "text",
                    "analyzer": "normalization_analyzer"
                }
            }
        }
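
Putting the mapping part together, a minimal sketch might look like the following. The helpers `build_mapping` and `apply_mapping` are hypothetical names, the field list is taken from the question, and the client is again assumed to be a 7.x-style `elasticsearch-py` client that accepts `body=`. The analyzer named here must already be installed in the index settings. One caveat to keep in mind: if these fields already exist in the index with a different analyzer, Elasticsearch will likely reject the change, and creating a new index (and reindexing) would be needed instead.

```python
TEXT_FIELDS = ["Name", "Status", "Country", "Blood Type"]


def build_mapping(analyzer_name="normalization_analyzer"):
    """Build the put_mapping body, referencing the analyzer by name only."""
    properties = {"Year of Birth": {"type": "integer"}}
    for field in TEXT_FIELDS:
        # The analyzer is a plain string, not the settings dictionary.
        properties[field] = {"type": "text", "analyzer": analyzer_name}
    return {"properties": properties}


def apply_mapping(es, index_name, analyzer_name="normalization_analyzer"):
    """Send the mapping to the index (hypothetical helper).

    `es` is assumed to be an elasticsearch.Elasticsearch client instance.
    """
    es.indices.put_mapping(index=index_name, body=build_mapping(analyzer_name))
```

Note that the body starts directly at "properties"; wrapping it in a "mappings" key is what triggers the "Root mapping definition has unsupported parameters" error from the question.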