
Index email with ElasticSearch - mapping problem


I am using Elasticsearch v7. I want to index email addresses using the uax_url_email tokenizer, and then search by the full email address.

I tried to use this mapping:

PUT /test
{
  "settings": {
    "analysis": {
      "filter": {
        "email": {
          "type": "pattern_capture",
          "preserve_original": 1,
          "patterns": [
            "([^@]+)",
            "(\\p{L}+)",
            "(\\d+)",
            "@(.+)",
            "([^-@]+)"
          ]
        }
      },
      "analyzer": {
        "email": {
          "tokenizer": "uax_url_email",
          "filter": [
            "email",
            "lowercase",
            "unique"
          ]
        }
      }
    }
  },
  "mappings": {
    "emails": {
      "properties": {
        "email": {
          "type": "string",
          "analyzer": "email"
        }
      }
    }
  }
}

but I get this error:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Failed to parse value [1] as only [true] or [false] are allowed."
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Failed to parse value [1] as only [true] or [false] are allowed."
  },
  "status": 400
}

What is wrong with it? What should this mapping look like?


Solution

  • Your request is malformed: you are passing 1 to the preserve_original parameter, which accepts only true or false, as the exception says.

    Apart from this, there are two more issues: the string data type no longer exists in v7 (use text or keyword instead), and the emails type name under mappings is not accepted by default in v7, so properties must sit directly under mappings.

    The corrected mapping, which I tested locally, looks like this:

    {
        "settings": {
            "analysis": {
                "filter": {
                    "email": {
                        "type": "pattern_capture",
                        "preserve_original": true,
                        "patterns": [
                            "([^@]+)",
                            "(\\p{L}+)",
                            "(\\d+)",
                            "@(.+)",
                            "([^-@]+)"
                        ]
                    }
                },
                "analyzer": {
                    "email": {
                        "tokenizer": "uax_url_email",
                        "filter": [
                            "email",
                            "lowercase",
                            "unique"
                        ]
                    }
                }
            }
        },
        "mappings": {
            "properties": {
                "email": {
                    "type": "text",
                    "analyzer": "email"
                }
            }
        }
    }
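
    To see which tokens the email analyzer actually produces, one option is the _analyze API. The requests below are a minimal sketch, assuming the index was created as test (as in the question); the address john.doe@example.com is a made-up sample, not taken from the original post.

    ```console
    # Show the tokens the custom "email" analyzer emits for a sample address
    POST /test/_analyze
    {
      "analyzer": "email",
      "text": "john.doe@example.com"
    }

    # Search by full address; "operator": "and" requires every query token to match
    GET /test/_search
    {
      "query": {
        "match": {
          "email": {
            "query": "john.doe@example.com",
            "operator": "and"
          }
        }
      }
    }
    ```

    Because the capture patterns also emit partial tokens such as the domain, a plain match query with the default or operator may return documents that share only part of the address. Setting the operator to and is one way to narrow the results to the full address.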