ruby-on-railselasticsearchtire

Find open Shops through Timetable with Elasticsearch/Tire


I have model Shop each has relation with Timetable which could contain something like:

shop_id: 1, day: 5, open_hour: 7,  open_minutes: 0,  close_hour: 13, close_minute: 30
shop_id: 1, day: 5, open_hour: 14, open_minutes: 30, close_hour: 18, close_minute: 00

of course Timetable could have more elegant format, but question is next: how with elasticsearch(tire) could I find Shop which is open?

all Idea will be apreciated! Thanks!


Found solution:

  1. create separate index for each day (sunday, monday, ..)

  2. for each day build full array of minutes from Timetable:

    ((open_hour * 60 + open_minute)..(close_hour * 60 + close_minute)).to_a
    
  3. add filter to search:

    filter :term, current_day_name => (current_hour * 60 + current_minutes)
    

this solution works as well, but it looks cumbersome, because if Shop works 8-h hours per day I have created array with size: 8 * 60 = 480 (which is converted to string as indexed field), so thats why this question is still open, and maybe someone will find better solution


Tire part for @Andrei Stefan answer:

indexes :open_hours, type: :nested do
  indexes :open, type: 'integer'
  indexes :close, type: 'integer'
end

open_hours_query = Tire::Search::Query.new do
  filtered do
    query { all }
    filter :range, "open_hours.open"   => { lte: current_time }
    filter :range, "open_hours.close"  => { gte: current_time }
  end
end

filter :nested, { path: 'open_hours', query: open_hours_query.to_hash }

Solution

  • I would consider doing it like the following:

    1. The opening and closing hours are integer values of an array of nested objects in Elasticsearch:

    Example: shop opening at 07:00 and closing at 13:30 and then opening at 14:30 and closing at 18:00 in day 1 would be translated to this in ES:

    "shop_name": "Shop 1",
    "open_hours": [
      { "open": 420, "close": 810 },
      { "open": 870, "close": 1080 }
    ]
    
    1. Each day in the week (1 -> 7) represents a value (to be added to the number of minutes):
    Day 1 = addition 0
    Day 2 = addition 2000
    Day 3 = addition 4000
    ...
    Day 7 = addition 10000
    

    So, for each day there is an increment of 2000 because each day contains at most 1440 minutes (24 hours * 60 minutes) and to be able to differentiate one day from a single number these numbers don't have to intersect.

    So, the example above with the shop opening at 07:00 would be translated for Day 4 for example to this:

    "shop_name": "Shop 1",
    "open_hours": [
      { "open": 6420, "close": 6810 },
      { "open": 6870, "close": 7080 }
    ]
    
    1. When querying these documents, that point of the day you want to search needs to obey the same rules as above. For example, if you want to see if in Day 4 at 13:45 the "Shop 1" is opened you would search for a (6000 + 13*60 + 45 = 6825) minute.

    2. The mapping for everything above in Elasticsearch would be this:

    {
      "mappings": {
        "shop" : {
          "properties": {
            "shop_name" : { "type" : "string" },
            "open_hours" : {
              "type" : "nested",
              "properties": {
                "open" : { "type" : "integer" },
                "close": { "type" : "integer" }
              }
            }
          }
        }
      }
    }
    
    1. Test data:
    POST /shops/shop/_bulk
    {"index":{}}
    {"shop_name":"Shop 1","open_hours":[{"open":420,"close":810},{"open":870,"close":1080}]}
    {"index":{}}
    {"shop_name":"Shop 2","open_hours":[{"open":0,"close":500},{"open":1000,"close":1440}]}
    {"index":{}}
    {"shop_name":"Shop 3","open_hours":[{"open":0,"close":10},{"open":70,"close":450},{"open":900,"close":1050}]}
    {"index":{}}
    {"shop_name":"Shop 4","open_hours":[{"open":2000,"close":2480}]}
    {"index":{}}
    {"shop_name":"Shop 5","open_hours":[{"open":2220,"close":2480},{"open":2580,"close":3000},{"open":3100,"close":3440}]}
    {"index":{}}
    {"shop_name":"Shop 6","open_hours":[{"open":6000,"close":6010},{"open":6700,"close":6900}]}
    
    1. Querying for shops opened in Day 2 at minute #2400 of the day (06:40):
    {
      "query": {
        "bool": {
          "must": [
            {
              "nested": {
                "path": "open_hours",
                "query": {
                  "bool": {
                    "must": [
                      {
                        "filtered": {
                          "filter": {
                            "range": {
                              "open_hours.open": {
                                "lte": 2400
                              }}}}},
                      {
                        "filtered": {
                          "filter": {
                            "range": {
                              "open_hours.close": {
                                "gte": 2400
                              }}}}}
                    ]
                  }}}}
                ]
    }}}
    

    Would output Shop 4 and Shop 5:

             "shop_name": "Shop 4",
               "open_hours": [
                  {
                     "open": 2000,
                     "close": 2480
                  }
               ]
    
             "shop_name": "Shop 5",
               "open_hours": [
                  {
                     "open": 2220,
                     "close": 2480
                  },
                  {
                     "open": 2580,
                     "close": 3000
                  },
                  {
                     "open": 3100,
                     "close": 3440
                  }
               ]
    

    LATER EDIT: since Elasticsearch has come a looong way since I added this reply and many things changed since then, a filtered filter (in the context of the bool must I used) can be replaced by a bool filter or even a simple must. Also, the string doesn't exist in 6.x anymore, so you can use text if you somehow need to search by shop name using analyzers, or keyword ("shop_name" : { "type" : "text" },):

    {
      "query": {
        "bool": {
          "must": [
            {
              "nested": {
                "path": "open_hours",
                "query": {
                  "bool": {
                    "filter": [
                      {
                        "range": {
                          "open_hours.open": {
                            "lte": 2400
                          }
                        }
                      },
                      {
                        "range": {
                          "open_hours.close": {
                            "gte": 2400
                          }
                        }
                      }
                    ]
                  }
                }
              }
            }
          ]
        }
      }
    }