pythonschedulerostering

Maximizing a combination of a series of values


This is a complicated one, but I suspect there's some principle I can apply to make it simple - I just don't know what it is.

I need to parcel out presentation slots to a class full of students for the semester. There are multiple possible dates, and multiple presentation types. I conducted a survey where students could rank their interest in the different topics. What I'd like to do is get the best (or at least a good) distribution of presentation slots to students.

So, what I have:

What I'd like to get:

I should note that I realize this looks like a homework problem, but it's real life :-). I was thinking that I might make a Student class for each student that contains the dates for each presentation type, but I wasn't sure what the best way to populate it would be. Actually, I'm not even sure where to start.


Solution

  • TL;DR: I think you're giving your students too much choice :D

    But I had a shot at this problem anyway. Pretty fun exercise actually, although some of the constraints were a little vague. Most of all, I had to guess what the actual students' preference distribution would look like. I went with uniformly distributed, independent variables, although that's probably not very realistic. Still I think it should work just as well on real data as it does on my randomly generated data.

    I considered brute forcing it, but a rough analysis gave me an estimate of over 10^65 possible configurations. That's kind of a lot. And since we don't have a trillion trillion years to consider all of them, we'll need a heuristic approach.

    Because of the size of the problem, I tried to avoid doing any backtracking. But this meant that you could get stuck; there might not be a solution where everyone only gets dates they gave 4's and 5's.

    I ended up implementing a double-edged Iterative Deepening-like search, where both the best case we're still holding out hope for (i.e., assign students to a date they gave a 5) and the worst case scenario we're willing to accept (some student might have to live with a 3) are gradually lowered until a solution is found. If we get stuck, reset, lower expectations, and try again. Tasks A and B are assigned first, and C is done only after A and B are complete, because the constraints on C are far less stringent.

    I also used a weighting factor to model the trade off between maximizing students happiness with satisfying the types-of-presentations-per-day limits.

    Currently it seems to find a solution for pretty much every random generated set of preferences. I included an evaluation metric; the ratio between the sum of the preference values of all assigned student/date combos, and the sum of all student ideal/top 3 preference values. For example, if student X had two fives, one four and the rest threes on his list, and is assigned to one of his fives and two threes, he gets 5+3+3=11 but could ideally have gotten 5+5+4=14; he is 11/14 = 78.6% satisfied.

    After some testing, it seems that my implementation tends to produce an average student satisfaction of around 95%, at lot better than I expected :) But again, that is with fake data. Real preferences are probably more clumped, and harder to satisfy.

    Below is the core of the algorihtm. The full script is ~250 lines and a bit too long for here I think. Check it out at Github.

    ...
    
    # Assign a date for a given task to each student, 
    # preferring a date that they like and is still free.
    def fill(task, lowest_acceptable, spread_weight=0.1, tasks_to_spread="ABC"):
            random_order = range(nStudents) # randomize student order, so everyone
            random.shuffle(random_order)    # has an equal chance to get their first pick
            for i in random_order:
                student = students[i]
                if student.dates[task]: # student is already assigned for this task?
                    continue
    
                # get available dates ordered by preference and how fully booked they are
                preferred = get_favorite_day(student, lowest_acceptable,  
                                             spread_weight, tasks_to_spread)
                for date_nr in preferred:
                    date = dates[date_nr]
                    if date.is_available(task, student.count, lowest_acceptable == 1):
                        date.set_student(task, student.count)
                        student.dates[task] = date
                        break
    
    # attempt to "fill()" the schedule while gradually lowering expectations
    start_at = 5
    while start_at > 1:
        lowest_acceptable = start_at
        while lowest_acceptable > 0:
            fill("A", lowest_acceptable, spread_weight, "AAB")
            fill("B", lowest_acceptable, spread_weight, "ABB")
            if lowest_acceptable == 1:
                fill("C", lowest_acceptable, spread_weight_C, "C")
            lowest_acceptable -= 1
    

    And here is an example result as printed by the script:

                                          Date                                      
    ================================================================================
    Student |  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |  10 |  11 |  12 |
    ================================================================================
       1    |     |  A  |  B  |     |  C  |     |     |     |     |     |     |     |
       2    |     |     |     |     |  A  |     |     |     |     |  B  |  C  |     |
       3    |     |     |     |     |  B  |     |     |  C  |     |  A  |     |     |
       4    |     |     |     |  A  |     |  C  |     |     |     |     |     |  B  |
       5    |     |     |  C  |     |     |     |  A  |  B  |     |     |     |     |
       6    |     |  C  |     |     |     |     |     |     |  A  |  B  |     |     |
       7    |     |     |  C  |     |     |     |     |  B  |     |     |     |  A  |
       8    |     |     |  A  |     |  C  |     |  B  |     |     |     |     |     |
       9    |  C  |     |     |     |     |     |     |     |  A  |     |     |  B  |
       10   |  A  |  B  |     |     |     |     |     |     |  C  |     |     |     |
       11   |  B  |     |     |  A  |     |  C  |     |     |     |     |     |     |
       12   |     |     |     |     |     |  A  |  C  |     |     |     |  B  |     |
       13   |  A  |     |     |  B  |     |     |     |     |     |     |     |  C  |
       14   |     |     |     |     |  B  |     |     |     |  C  |     |  A  |     |
       15   |     |     |  A  |  C  |     |  B  |     |     |     |     |     |     |
       16   |     |     |     |     |     |  A  |     |     |     |  C  |  B  |     |
       17   |     |  A  |     |  C  |     |     |  B  |     |     |     |     |     |
       18   |     |     |     |     |     |     |  C  |  A  |  B  |     |     |     |
    ================================================================================
    Total student satisfaction: 250/261 = 95.00%