pythonschedulingor-toolsconstraint-programmingcp-sat

Google OR tools - train scheduling problem


The problem I am trying to solve is a bit like the employee scheduling one here:

https://github.com/google/or-tools/blob/master/examples/python/shift_scheduling_sat.py

However, there are a few things that I am stuck on and have no idea how to incorporate in to the code. I will explain the issue below.

Problem

I have a fleet of 47 trains that I want to assign to 49 routes each day. The trains should be assigned with the following constraints:

  1. Every train must be used at least once during the day (no train must be idle for the whole day)

  2. Every train must be assigned to at least one route (and max two routes) and every route MUST be covered

  3. The trains final mileage, once assigned to a route, must not exceed 24,800 (i.e. the previous day's cumulative mileage + assigned route mileage <= 24,800). This is probably best understood by looking at the total_km_day_end column in the 3rd table below

  4. Where a train is assigned to two routes in a day, the times of these routes must not overlap

A further constraint I would like to have, but am not precious about is this (let's say it's a soft constraint):

  1. trains with high mileage for the previous day should be assigned to short routes and trains with low mileage for the previous day should be assigned to long routes

I have a data frame for the trains that looks like this. I can pick a date at random and see the cumulative mileage up to the end of the previous day (i.e. 18/9/2018) for each of the 47 trains:

Date      |  Day      |   Train   |  Cum_mileage_prev_day 
----------| --------- | --------- |----------------------  
19/9/18   |  WED      |   T32     |  24,300          
19/9/18   |  WED      |   T11     |  24,200
19/9/18   |  WED      |   T38     |  24,200       
 .          .               .         .            
 .          .               .         .            
19/9/18   |  WED      |   T28     |  600  
19/9/18   |  WED      |   T15     |  200   
19/9/18   |  WED      |   T24     |  100  

And a data frame for the routes that looks like this. Note that a route above 100 km is defined as being long, below this it's short. Of the 49 routes, there are only 6 routes that are short (10 km) - note that only 5 of the short routes are shown below:

Route  |  Start    |   End    |  Total_km   | route_type
------ | --------- | ---------|-------------|-------------   
R11    |  5:00     | 00:00    |  700        | Long    
R32    |  6:00     | 00:50    |  600        | Long   
R16    |  5:20     | 23:40    |  600        | Long   
 .          .           .         .            .
 .          .           .         .            .
R41    |  11:15    | 12:30    |   10        | Short 
R42    |  11:45    | 13:00    |   10        | Short
R43    |  12:15    | 13:30    |   10        | Short 
R44    |  12:45    | 14:00    |   10        | Short
R45    |  13:20    | 14:35    |   10        | Short 

What I want to end up with is something like this where the trains have been assigned 1 or 2 routes and the total mileage is shown for the end of the day (assuming the assigned routes are completed by the train):

Date   |  Day  |   Train|  Cum_mil_prev_day | first_assign | second_assign | total_km_day_end
-------| ------| -------|-------------------|--------------|---------------|----------------
19/9/18|  WED  |   T32  |  24,300           | R41          | R44           | 24,320 
19/9/18|  WED  |   T11  |  24,200           | R42          | R45           | 24,220
19/9/18|  WED  |   T38  |  24,200           | R43          |               | 24,210
 .          .               .         .                  .              .
 .          .               .         .                  .              .
19/9/18|  WED  |   T28  |  600              | R11          |               | 1300
19/9/18|  WED  |   T15  |  200              | R32          |               | 800
19/9/18|  WED  |   T24  |  100              | R16          |               | 700

EDIT/UPDATE (2/8/19):

(NOTE: the code below shows a pared down version of the problem with 6 trains assigned to 8 routes. I have also included constraint 5 in the code.)

Thanks so much to Stradivari and Laurent for their help with this one.

from itertools import combinations
from ortools.sat.python import cp_model


def test_overlap(t1_st, t1_end, t2_st, t2_end):

    def convert_to_minutes(t_str):
        hours, minutes = t_str.split(':')
        return 60*int(hours)+int(minutes)

    t1_st = convert_to_minutes(t1_st)
    t1_end = convert_to_minutes(t1_end)
    t2_st = convert_to_minutes(t2_st)
    t2_end = convert_to_minutes(t2_end)

    # Check for wrapping time differences
    if t1_end < t1_st:
        if t2_end < t2_st:
        # Both wrap, therefore they overlap at midnight
            return True
        # t2 doesn't wrap. Therefore t1 has to start after t2 and end before
        return t1_st < t2_end or t2_st < t1_end

    if t2_end < t2_st:
        # only t2 wraps. Same as before, just reversed
        return t2_st < t1_end or t1_st < t2_end

    # They don't wrap and the start of one comes after the end of the other,
    # therefore they don't overlap
    if t1_st >= t2_end or t2_st >= t1_end:
        return False
    # In all other cases, they have to overlap
    return True



def main():
    model = cp_model.CpModel()
    solver = cp_model.CpSolver()

    # data
    route_km = {
        'R11': 700,
        'R32': 600,
        'R16': 600,
        'R41': 10,
        'R42': 10,
        'R43': 10,
        'R44': 10,
        'R45': 10}


    train_cum_km = {
        'T32': 24_300,
        'T11': 24_200,
        'T38': 24_200,
        'T28': 600,
        'T15': 200,
        'T24': 100}


    route_times = {
        'R11': ('05:00', '00:00'),
        'R32': ('06:00', '00:50'),
        'R16': ('05:20', '23:40'),
        'R41': ('11:15', '12:30'),
        'R42': ('11:45', '13:00'),
        'R43': ('12:15', '13:30'),
        'R44': ('12:45', '14:00'),
        'R45': ('13:20', '14:35')}



    trains = list(train_cum_km.keys())
    routes = list(route_km.keys())
    num_trains = len(trains)
    num_routes = len(routes)

    assignments = {(t, r): model.NewBoolVar('assignment_%s%s' % (t, r))
               for t in trains for r in routes}


    # constraint 1: each train must be used
    for r in routes:
        model.Add(sum(assignments[(t, r)] for t in trains) == 1)

    # constraint 2: each train must do at least one (max two) routes
    for t in trains:
        model.Add(sum(assignments[(t, r)] for r in routes) >= 1)
        model.Add(sum(assignments[(t, r)] for r in routes) <= 2)


    # constraint 3: ensure the end of day cum km is less than 24_800
    # create a new variable which must be in the range (0,24_800)
    day_end_km = {
        t: model.NewIntVar(0, 24_800, 'train_%s_day_end_km' % t)
        for t in trains
    }

    for t in trains:
        # this will be constrained because day_end_km[t] is in domain [0, 24_800]
        tmp = sum(assignments[t, r]*route_km[r] for r in routes) + train_cum_km[t]   
        model.Add(day_end_km[t] == tmp)

    # constraint 4: where 2 routes are assigned to a train, these must not overlap
    for (r1, r2) in combinations(routes, 2):
            if test_overlap(route_times[r1][0], route_times[r1][1], route_times[r2][0], route_times[r2][1]):
                 for train in trains:
                    model.AddBoolOr([assignments[train, r1].Not(), assignments[train, r2].Not()])


    # constraint 5: trains with high cum km should be assigned short routes 
    # and trains with low mileage to long routes

    score = {
              (t,r): route_km[r] + train_cum_km[t]
              for t in trains for r in routes
             }

    for r in routes:
        model.Minimize(sum(assignments[t,r]*score[t,r] for t in trains))


    status = solver.Solve(model)
    assert status in [cp_model.FEASIBLE, cp_model.OPTIMAL]

    for t in trains:
        t_routes = [r for r in routes if solver.Value(assignments[t, r])]
        print(f'Train {t} does route {t_routes} '
              f'with end of day cumulative kilometreage of '
              f'{solver.Value(day_end_km[t])}')


if __name__ == '__main__':
    main()

Output:

Train T32 does route ['R42', 'R45'] with end of day cumulative kilometreage of 24320
Train T11 does route ['R41', 'R44'] with end of day cumulative kilometreage of 24220
Train T38 does route ['R43'] with end of day cumulative kilometreage of 24210
Train T28 does route ['R16'] with end of day cumulative kilometreage of 1200
Train T15 does route ['R32'] with end of day cumulative kilometreage of 800
Train T24 does route ['R11'] with end of day cumulative kilometreage of 800

Solution

  • Off the top of my head, not sure if it is the best way:

    assignments = {
        (route, train): model.NewBoolVar('')
        for route in routes
        for train in all_trains
    }
    

    Every train must be assigned to at least one route (and max two routes)

    for train in all_trains:
        model.Add(sum(assignments[route, train] for route in routes) >= 1)
        model.Add(sum(assignments[route, train] for route in routes) <= 2)
    

    The trains final mileage, once assigned to a route, must not exceed 24,800

    Create a dictionary with the mileage of each route: route_km = {'R11': 700, 'R16': 600} and the cumulative mileage of each train cum_mileage = {0: 24_320, 3: 24_220}

    for train in all_trains:
        model.Add(cum_mileage[train]+sum(
            assignments[route, train]*route_km[route]
            for route in routes
        ) <= 24_800)
    

    Where a train is assigned to two routes in a day, the times of these routes must not overlap

    Create a function that returns True if two routes overlap

    Efficient date range overlap calculation in python?

    And then:

    from itertools import combinations
    for (r1, r2) in combinations(routes, 2):
        if not conflicts(r1, r2):
            continue
        for train in all_trains:
            model.AddBoolOr([assignments[r1, train].Not(), assignments[r2, train].Not()])