Tags: elasticsearch, schemaless, elasticsearch-mapping

Store mixed data types in Elasticsearch


I am using Logstash to manage my application logs. I want to store some context data along with the log entries. This context data doesn't have to be indexed, but it can have a different structure or data type depending on the application context. For example, the context can be in any of the following formats:

String

{
    error: "This is a sample error message"
}

Array

{
    error: [
        "This is an error message", 
        "This is another message", 
        "This is the final message"
    ]
}

Or it could be an object

{
    error: {
        user_name: "Username cannot be empty",
        user_email: "Email address is already in use",
        user_password: "Passwords do not match"
    }
}

Is it possible to have such a field in Elasticsearch? The field does not have to be indexed; it just needs to be stored.


Solution

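  • I don't think it's possible to do exactly what you're asking, because a field that is mapped as a string cannot also hold an object. You can get the first two examples for free, though, since any field can hold a list of values of its mapped type: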

    curl -XDELETE "http://localhost:9200/test_index"
    
    curl -XPUT "http://localhost:9200/test_index" -d'
    {
        "mappings": {
            "doc": {
                "properties": {
                    "error": {
                        "type": "string",
                        "index": "not_analyzed"
                    }
                }
            }
        }
    }'
    
    curl -XPUT "http://localhost:9200/test_index/doc/1" -d'
    {
        "error": "This is a sample error message"
    }'
    
    curl -XPUT "http://localhost:9200/test_index/doc/2" -d'
    {
        "error": [
            "This is an error message", 
            "This is another message", 
            "This is the final message"
        ]
    }'
    
    curl -XPOST "http://localhost:9200/test_index/_search"
    ...
    {
       "took": 2,
       "timed_out": false,
       "_shards": {
          "total": 5,
          "successful": 5,
          "failed": 0
       },
       "hits": {
          "total": 2,
          "max_score": 1,
          "hits": [
             {
                "_index": "test_index",
                "_type": "doc",
                "_id": "1",
                "_score": 1,
                "_source": {
                   "error": "This is a sample error message"
                }
             },
             {
                "_index": "test_index",
                "_type": "doc",
                "_id": "2",
                "_score": 1,
                "_source": {
                   "error": [
                      "This is an error message",
                      "This is another message",
                      "This is the final message"
                   ]
                }
             }
          ]
       }
    }
    

    Alternatively, you could let the mapping follow your third example (below it is created dynamically from the first document indexed) and then use only the sub-fields needed for each document, which presumably complicates your application code:

    curl -XDELETE "http://localhost:9200/test_index"
    
    curl -XPUT "http://localhost:9200/test_index"
    
    curl -XPUT "http://localhost:9200/test_index/doc/3" -d'
    {
        "error": {
            "user_name": "Username cannot be empty",
            "user_email": "Email address is already in use",
            "user_password": "Passwords do not match"
        }
    }'
    
    curl -XGET "http://localhost:9200/test_index/_mapping"
    ...
    {
       "test_index": {
          "mappings": {
             "doc": {
                "properties": {
                   "error": {
                      "properties": {
                         "user_email": {
                            "type": "string"
                         },
                         "user_name": {
                            "type": "string"
                         },
                         "user_password": {
                            "type": "string"
                         }
                      }
                   }
                }
             }
          }
       }
    }
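
    To illustrate the reshaping this implies, here is a minimal sketch in which string and list contexts are wrapped in an object before indexing, so that `error` is always an object. The `messages` sub-field and the `index_doc` helper are hypothetical names chosen for this example, and the node URL is assumed to be the same local one used above.

    ```shell
    # Assumption: same local node and index as in the examples above.
    ES_URL="http://localhost:9200"

    # "messages" is a hypothetical sub-field name, not part of the original data.
    # A string context is wrapped in an object.
    DOC_STRING='{"error": {"messages": "This is a sample error message"}}'

    # A list context uses the same sub-field, since any field can hold a list.
    DOC_ARRAY='{"error": {"messages": ["This is an error message", "This is another message"]}}'

    # An object context is indexed unchanged.
    DOC_OBJECT='{"error": {"user_name": "Username cannot be empty"}}'

    # Hypothetical helper: index a document body ($2) under the given id ($1).
    index_doc() {
        curl -XPUT "$ES_URL/test_index/doc/$1" -d "$2"
    }

    # Usage against a running node:
    #   index_doc 1 "$DOC_STRING"
    #   index_doc 2 "$DOC_ARRAY"
    #   index_doc 3 "$DOC_OBJECT"
    ```

    The cost is the one mentioned above: the application has to wrap and unwrap the non-object cases itself.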
    

    So basically the direct answer to your question is "No", unless I'm missing something (which is quite possible).

    Here is the code I used:

    http://sense.qbox.io/gist/18476aa6c2ad2fa554b472d09934559c884bec33
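
    One further option that may be worth testing, and which is not part of the code linked above: Elasticsearch has an `enabled` mapping parameter for object fields. With `"enabled": false` the contents of the field are not parsed or indexed and are only kept in `_source`, which matches the "stored but not indexed" requirement. Whether a disabled field also accepts plain strings and lists, and not only objects, may depend on the Elasticsearch version, so treat the following as an unverified sketch. The helper names are hypothetical, and the old-style typed mapping is assumed to match the version used in the examples above.

    ```shell
    # Assumption: same local node and index name as in the examples above.
    ES_URL="http://localhost:9200"

    # Sketch of a mapping where "error" is an object field with parsing disabled:
    # its contents stay in _source but are not indexed or searchable.
    MAPPING='{
        "mappings": {
            "doc": {
                "properties": {
                    "error": {
                        "type": "object",
                        "enabled": false
                    }
                }
            }
        }
    }'

    # Hypothetical helper: create the index with the mapping above.
    create_index() {
        curl -XPUT "$ES_URL/test_index" -d "$MAPPING"
    }

    # Hypothetical helper: index a document body ($2) under the given id ($1).
    index_doc() {
        curl -XPUT "$ES_URL/test_index/doc/$1" -d "$2"
    }

    # Usage against a running node (check each response on your version):
    #   create_index
    #   index_doc 1 '{"error": "This is a sample error message"}'
    #   index_doc 2 '{"error": ["This is an error message", "This is another message"]}'
    #   index_doc 3 '{"error": {"user_name": "Username cannot be empty"}}'
    ```

    If all three documents are accepted, the field can be read back from `_source` in whatever shape it was sent, at the cost of not being searchable at all.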