algorithm, language-agnostic, interval-tree

Interval tree with added dimension of subset matching?


This is an algorithmic question about a somewhat complex problem. The foundation is this:

A scheduling system based on available slots and reserved slots. Slots have certain criteria; let's call them tags. A reservation is matched to an available slot by those tags, if the available slot's tag set is a superset of the reserved slot's tag set.

As a concrete example, take this scenario:

11:00  12:00  13:00
+--------+
| A, B   |
+--------+
       +--------+
       | C, D   |
       +--------+

Between 11:00 and 12:30, reservations for the tags A and B can be made; from 12:00 to 13:30, C and D are available; and there's an overlap from about 12:00 to 12:30.

11:00  12:00  13:00
+--------+
| A, B   |
+--------+
       +--------+
       | C, D   |
       +--------+
  xxxxxx
  x A  x
  xxxxxx

Here a reservation for A has been made, so no other reservations for A or B can be made between 11:15-ish and 12:00-ish.

That's the idea in a nutshell. There are no specific limitations for the available slots:

The only rule that needs to be obeyed in the system is:

To clarify: when there are two available slots at the same time with, say, tag A, then two reservations for A can be made at that time, but no more.

I have that working with a modified implementation of an interval tree; as a quick overview:

When that process is finished, what's left are the remaining slices of available slots, and I can query whether a new reservation can be made for a particular time and add it.

Data structures look something like this:

{
  type: 'available', 
  begin: 1497857244, 
  end: 1497858244, 
  tags: [{ foo: 'bar' }, { baz: 42 }]
}
{
  type: 'reserved', 
  begin: 1497857345, 
  end: 1497857210, 
  tags: [{ foo: 'bar' }]
}

Tags are themselves key-value objects; a list of them is a "tag set". Those could be serialised if it helps; so far I'm using a Python set type, which makes comparison easy enough. Slot begin/end times are UNIX timestamps within the tree. I'm not particularly married to these specific data structures and can refactor them if it's useful.
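
As an illustration of the superset rule on these structures, here is a minimal sketch. The helper names `tag_set` and `matches` are hypothetical, and it assumes every tag value is hashable so that each key-value pair can be stored in a `frozenset`.

```python
def tag_set(tags):
    """Flatten a list of key-value tag dicts into a hashable frozenset of (key, value) pairs.

    Assumption: all tag values are hashable (strings, numbers, tuples).
    """
    return frozenset(item for tag in tags for item in tag.items())


def matches(available, reservation):
    """True if the available slot's tag set is a superset of the reservation's tag set."""
    return tag_set(available['tags']) >= tag_set(reservation['tags'])


available = {'type': 'available', 'begin': 1497857244, 'end': 1497858244,
             'tags': [{'foo': 'bar'}, {'baz': 42}]}
wanted = {'type': 'reserved', 'tags': [{'foo': 'bar'}]}
unrelated = {'type': 'reserved', 'tags': [{'qux': 1}]}
```

Here `matches(available, wanted)` holds because `{foo: bar}` is contained in the available slot's tags, while `matches(available, unrelated)` does not. Time overlap is deliberately left out of this sketch.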


The problem I'm facing is that this doesn't work bug-free; every once in a while a reservation sneaks into the system that conflicts with other reservations, and I haven't yet been able to figure out exactly how that happens. It's also not very clever when tags overlap in complex ways, where an optimal distribution would need to be calculated so that all reservations fit into the available slots as well as possible; in fact, how reservations are currently matched to available slots in overlapping scenarios is non-deterministic.

What I want to know is: interval trees are mostly great for this purpose, but my current system to add tag set matching as an additional dimension to this is clunky and bolted-on; is there a data structure or algorithm that can handle this in an elegant way?

Actions that must be supported:

  1. Querying the system for available slots that match certain tag sets (taking into account reservations that may reduce availability but are not themselves part of said tag set; e.g. in the example above querying for an availability for B).
  2. Ensuring no reservations can be added to the system which don't have a matching available slot.
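
For a single instant in time, the second requirement can be phrased as a bipartite matching problem: every reservation needs its own available slot whose tag set is a superset of the reservation's. The following is only a sketch under that assumption, not the implementation described above; the function name `assign` is hypothetical, tags are simplified to plain strings in frozensets, and the time dimension is ignored entirely.

```python
def assign(reservations, available):
    """Try to give every reservation a distinct available slot with a superset tag set.

    reservations, available: lists of frozensets of tags, all valid at the same instant.
    Returns {reservation index: available index}, or None if no full assignment exists.
    Uses the augmenting-path approach to bipartite matching.
    """
    owner = {}  # available index -> reservation index currently holding it

    def place(r, seen):
        for a, tags in enumerate(available):
            if a in seen or not tags >= reservations[r]:
                continue
            seen.add(a)
            # Take a free slot, or evict the current holder if it can move elsewhere.
            if a not in owner or place(owner[a], seen):
                owner[a] = r
                return True
        return False

    for r in range(len(reservations)):
        if not place(r, set()):
            return None
    return {r: a for a, r in owner.items()}


slots = [frozenset('AB'), frozenset('A')]
fits = assign([frozenset('A'), frozenset('B')], slots)
too_many = assign([frozenset('A'), frozenset('B'), frozenset('A')], slots)
```

In this example the reservation for A is first placed in the A, B slot and then moved to the A-only slot so that the reservation for B still fits, which removes the dependence on insertion order. The third reservation cannot be placed, so `too_many` is `None`.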

Solution

  • Your problem can be solved using constraint programming. In Python this can be implemented using the python-constraint library.

    First, we need a way to check whether two slots are consistent with each other. This is a function that returns True unless the two slots share a tag and their timeframes overlap. In Python it can be implemented as follows:

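    def checkNoOverlap(slot1, slot2):
        # Slots that share no tag never conflict.
        shareTags = any(tag in slot2['tags'] for tag in slot1['tags'])
        if not shareTags:
            return True
        # Two closed intervals are disjoint exactly when one ends before the
        # other begins. Unlike testing only whether slot1's endpoints fall
        # inside slot2, this also catches slot1 fully containing slot2.
        return slot1['end'] < slot2['begin'] or slot2['end'] < slot1['begin']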
    

    I was not sure whether you wanted the tags to be completely the same (like {foo: bar} equals {foo: bar}) or only the keys (like {foo: bar} equals {foo: qux}), but you can change that in the function above.
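
To make the two interpretations concrete, here is a small sketch of the tag comparison on its own. The function name `share_tag` and the `keys_only` flag are hypothetical and not part of the code above.

```python
def share_tag(slot1, slot2, keys_only=False):
    """Return True if the two slots have at least one tag in common.

    keys_only=False: tags must be identical, so {foo: bar} only equals {foo: bar}.
    keys_only=True:  only the keys are compared, so {foo: bar} equals {foo: qux}.
    """
    if keys_only:
        keys1 = {key for tag in slot1['tags'] for key in tag}
        keys2 = {key for tag in slot2['tags'] for key in tag}
        return not keys1.isdisjoint(keys2)
    return any(tag in slot2['tags'] for tag in slot1['tags'])


a = {'tags': [{'foo': 'bar'}]}
b = {'tags': [{'foo': 'qux'}]}
```

With these two slots, `share_tag(a, b)` is false but `share_tag(a, b, keys_only=True)` is true.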

    Consistency check

    We can use the python-constraint module for the two kinds of functionality you requested.

    The second functionality is the easiest. To implement it, we can use the function isConsistent, which takes a list of slots in the provided data structure as input. The function feeds all the slots to python-constraint, checks whether the list is consistent (no two slots that share a tag overlap in time), and returns the result.

    from constraint import Problem

    def isConsistent(slots):
        # initialize the python-constraint context
        problem = Problem()
        # add every slot to the context as a variable with a singleton domain
        for i in range(len(slots)):
            problem.addVariable(i, [slots[i]])
        # add a constraint for each ordered pair of distinct slots
        for i in range(len(slots)):
            for j in range(len(slots)):
                # we don't want slots to be checked against themselves
                if i == j:
                    continue
                # this constraint uses the checkNoOverlap function
                problem.addConstraint(checkNoOverlap, (i, j))
        # getSolutions returns all possible combinations of domain elements;
        # because all domains are singletons, the result has length 1
        # (consistent) or 0 (inconsistent)
        return len(problem.getSolutions()) > 0
    

    This function can be called whenever a user wants to add a reservation slot. The new slot is added to the list of existing slots and the consistency of the combined list is checked. If it is consistent, the new slot can be reserved; otherwise it overlaps an existing slot and should be rejected.
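
Because every variable has exactly one possible value, the check amounts to testing each pair of slots once. As a rough illustration of that idea without the library, here is a sketch in plain Python. The names `conflicts` and `is_consistent_plain` are hypothetical, and intervals are treated as closed, so slots that merely touch count as overlapping.

```python
from itertools import combinations


def conflicts(slot1, slot2):
    """True if the slots share a tag and their closed time intervals overlap."""
    shares_tag = any(tag in slot2['tags'] for tag in slot1['tags'])
    overlaps = slot1['begin'] <= slot2['end'] and slot2['begin'] <= slot1['end']
    return shares_tag and overlaps


def is_consistent_plain(slots):
    """True if no pair of slots conflicts."""
    return not any(conflicts(a, b) for a, b in combinations(slots, 2))


existing = [
    {'type': 'reserved', 'begin': 100, 'end': 200, 'tags': [{'foo': 'bar'}]},
    {'type': 'reserved', 'begin': 300, 'end': 400, 'tags': [{'foo': 'bar'}]},
]
clash = {'type': 'reserved', 'begin': 150, 'end': 350, 'tags': [{'foo': 'bar'}]}
other_tag = {'type': 'reserved', 'begin': 150, 'end': 350, 'tags': [{'baz': 42}]}
```

Adding `clash` to `existing` makes the list inconsistent, so that reservation would be rejected, whereas `other_tag` covers the same time span but shares no tag and would be accepted.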

    Finding available slots

    This problem is a bit trickier. We can use the same functionality as above with a few significant changes. Instead of adding one new slot to the existing slots, we now add every possible slot as a candidate. We can then check the consistency of all those possible slots with the reserved slots and ask the constraint system for the combinations that are consistent.

    Because the number of possible slots would be infinite if we didn't put any restrictions on it, we first need to declare some parameters for the program:

    MIN = 149780000  # available time slots can never start earlier than this time
    MAX = 149790000  # available time slots can never extend past this time
    GRANULARITY = 1 * 60  # candidate begin/end times are spaced one minute (60 s) apart
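
To get a feel for how these parameters bound the search, the sketch below counts the candidate slots using the same begin/end ranges as the generatePossibleSlots helper given further down. The function name `count_candidates` is hypothetical. The count grows quadratically with the number of time steps, so a finer granularity or a wider window gets expensive quickly.

```python
def count_candidates(start, stop, step):
    """Count (begin, end) pairs with begin in range(start, stop - 1, step)
    and end in range(begin + 1, stop, step)."""
    total = 0
    for begin in range(start, stop - 1, step):
        total += len(range(begin + 1, stop, step))
    return total


candidates = count_candidates(149780000, 149790000, 60)
```

With the values above, a window of 10,000 seconds at one-minute steps yields 14,028 candidates.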
    

    We can now continue to the main function. It looks a lot like the consistency check, but instead of the new slot from the user, we now add a variable to discover all available slots.

    def availableSlots(tags, slots):
        # same as in isConsistent: one singleton variable per existing slot
        problem = Problem()
        for i in range(len(slots)):
            problem.addVariable(i, [slots[i]])
        # add one extra variable for the slot we are looking for, with a
        # domain of all possible slots
        problem.addVariable(len(slots), generatePossibleSlots(MIN, MAX, GRANULARITY, tags))
        for i in range(len(slots) + 1):
            for j in range(len(slots) + 1):
                if i == j:
                    continue
                problem.addConstraint(checkNoOverlap, (i, j))
        # extract the available time slots from the solutions for clean output
        return filterAvailableSlots(problem.getSolutions())
    

    I use some helper functions to keep the code cleaner. They are included here.

    def filterAvailableSlots(possibleCombinations):
        # each solution maps variable -> slot; keep only the 'available' slots
        result = []
        for slots in possibleCombinations:
            for slot in slots.values():
                if slot['type'] == 'available':
                    result.append(slot)

        return result

    def generatePossibleSlots(start, stop, granularity, tags):
        # enumerate every (begin, end) pair on the time grid with begin < end
        possibilities = []
        for i in range(start, stop - 1, granularity):
            for j in range(i + 1, stop, granularity):
                possibleSlot = {
                    'type': 'available',
                    'begin': i,
                    'end': j,
                    'tags': tags
                }
                possibilities.append(possibleSlot)
        return tuple(possibilities)
    

    You can now call the function availableSlots with the tags for which you want the available slots and the list of already reserved slots. Note that this function really returns all the consistent possible slots, so no effort is made to find the longest one or to optimize in any other way.

    Hope this helps! (I got it to work as you described in PyCharm.)