
Elasticsearch Aggregation by Day of Week and Hour of Day


I have documents of the form:

[{"msg": "hello", "date": "some-date"}, {"msg": "hi!", "date": "some-date"}, ...]

I want the count of documents by day of the week: for example, x messages were sent on Monday, y on Tuesday, and so on.

I have used a date_histogram aggregation, but it buckets the documents by calendar day. It does return the day name, but "Wed, 22" and "Wed, 29", for example, come back as separate buckets.

This is somewhat related to "Elasticsearch - group by day of week and hour", but that question has no answer, so I am reposting it. The suggestion there is to run a terms aggregation on key_as_string, but I need to sum doc_count across the buckets rather than just count the terms. I also don't know how to use key_as_string in a nested aggregation.

This is what I have tried:

"aggs": {
    "posts_over_days": {
        "date_histogram": {
            "field": "created_time",
            "interval": "day",
            "format": "E"
        }
    }
}

Solution

  • The same kind of problem has been solved in this thread.

    Adapting that solution to your problem, we need a script that converts the date into the day of week and hour of day:

    Date date = new Date(doc['created_time'].value);
    java.text.SimpleDateFormat format = new java.text.SimpleDateFormat('EEE, HH');
    format.format(date)
    

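    Then use it as the script of a terms aggregation, so that each bucket's doc_count is the number of documents for that day of week and hour: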

    {
        "aggs": {
            "perWeekDay": {
                "terms": {
                    "script": "Date date = new Date(doc['created_time'].value); java.text.SimpleDateFormat format = new java.text.SimpleDateFormat('EEE, HH'); format.format(date)"
                }
            }
        }
    }
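Where running a script on the cluster is not an option, the same grouping can be done on the client. The sketch below assumes the date_histogram from the question is run with an hourly interval and that its buckets (each with an epoch-millisecond `key` and a `doc_count`) have already been fetched. It sums `doc_count` per day-of-week and hour label, which is the "add doc_count for every object" step the question asks about. The names `sum_by_weekday_hour`, `_ms` and `sample_buckets` are hypothetical, the sample counts are made up for illustration, and the labels are computed in UTC, which may differ from the time zone the server-side script would use.

```python
from collections import Counter
from datetime import datetime, timezone

# Fixed English labels, so the output does not depend on the system locale.
DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")


def sum_by_weekday_hour(buckets):
    """Sum doc_count per 'Day, HH' label across date_histogram buckets."""
    totals = Counter()
    for bucket in buckets:
        # date_histogram bucket keys are epoch milliseconds.
        moment = datetime.fromtimestamp(bucket["key"] / 1000, tz=timezone.utc)
        label = f"{DAYS[moment.weekday()]}, {moment.hour:02d}"
        totals[label] += bucket["doc_count"]
    return dict(totals)


def _ms(year, month, day, hour):
    """Helper for the sample data: epoch milliseconds for a UTC hour."""
    moment = datetime(year, month, day, hour, tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)


# Illustrative buckets: two Wednesdays at 14:00 and one Thursday at 09:00.
sample_buckets = [
    {"key": _ms(2015, 4, 22, 14), "doc_count": 3},
    {"key": _ms(2015, 4, 29, 14), "doc_count": 5},
    {"key": _ms(2015, 4, 23, 9), "doc_count": 2},
]

print(sum_by_weekday_hour(sample_buckets))
# → {'Wed, 14': 8, 'Thu, 09': 2}
```

The two Wednesday buckets collapse into a single "Wed, 14" entry, which is the behaviour the plain date_histogram does not give. The trade-off is that every hourly bucket travels over the network, so for long time ranges the scripted terms aggregation is likely the lighter option.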