Given below is my Elasticsearch document on which I want to fire an aggregation query.
{
"id": 1,
"attributes": [
{
"fieldId": 1,
"value": "Male"
},
{
"fieldId": 2,
"value": "12/11/2015"
}
]
}
{
"id": 2,
"attributes": [
{
"fieldId": 1,
"value": "Male"
},
{
"fieldId": 2,
"value": "11/11/2015"
}
]
}
The result has to be as follows.
[
{
"key": "Male",
"doc_count": 1
}
]
[
{
"key": "12/11/2015",
"doc_count": 1
},
{
"key": "11/11/2015",
"doc_count": 1
}
]
Is there a way that this can be achieved in Elasticsearch?
That's possible. See this example:
We have to map attributes as nested
type to be able to aggregate properly.
PUT /test
{
"mappings": {
"sample": {
"properties": {
"id": {
"type": "integer"
},
"attributes": {
"type": "nested",
"properties": {
"fieldId": {
"type": "integer"
},
"value": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
Let's add your given test data:
PUT /test/sample/1
{"id":1,"attributes":[{"fieldId":1,"value":"Male"},{"fieldId":2,"value":"12/11/2015"}]}
PUT /test/sample/2
{"id":2,"attributes":[{"fieldId":1,"value":"Male"},{"fieldId":2,"value":"11/11/2015"}]}
And finally let's run this query:
GET /test/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"Nest": {
"nested": {
"path": "attributes"
},
"aggs": {
"fieldIds": {
"terms": {
"field": "attributes.fieldId",
"size": 0
},
"aggs": {
"values": {
"terms": {
"field": "attributes.value",
"size": 0
}
}
}
}
}
}
}
}
What will it do?
nested
aggregation first in order to get into nested
objects and aggregate them properly.terms
aggregation for each fieldId
, in your case we'll get two of them: 1
and 2
.terms
aggregation again for each of buckets above in order to get coresponding values.So that's the output.
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0,
"hits": []
},
"aggregations": {
"Nest": {
"doc_count": 4,
"fieldIds": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 1,
"doc_count": 2,
"values": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Male",
"doc_count": 2
}
]
}
},
{
"key": 2,
"doc_count": 2,
"values": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "11/11/2015",
"doc_count": 1
},
{
"key": "12/11/2015",
"doc_count": 1
}
]
}
}
]
}
}
}
}
It's not precisely what you've requested. But that's closest what you can get in Elasticsearch.