algorithmgreedygraph-coloring

Greedy algorithm: Interval coloring


In interval scheduling, the algorithm is to pick the earliest finish time. But in interval colouring the former does not work. Is there an example or explanation on why picking earliest finish time won't work for interval colouring?

The interval colouring problem is: 
given
 a 
set 
of 
intervals,
 we 
want 
to 
colour all
 intervals
 so 
that 
intervals
 given
 the
 same
 colour
 do 
not 
intersect
 and 
the
goal
 is 
to 
try 
to
 minimize 
the 
number 
of 
colours 
used. This can be thought of as the interval partitioning problem (if it makes more sense)

The interval scheduling problem that i'm referring to is: If you go to a theme park and there are many shows, the start and finish time of each show is an interval, and you are the resource. You want to attend as many shows as possible.


Solution

  • If you only need a counter example of greedy algorithm on coloring, @btilly provides one already.

    I am trying to give reasons to make it more intuitive.

    First, for scheduling problem, you can indeed prove greedy algorithm works. The idea is like this:

    I CANNOT get better result if I'm NOT choosing the show having the earliest finish time, let's see.

    If there is two intervals A, B, with A has earlier finish time, then B is either

    1. Start time later than A's finish time, then no conflict at all, why not both?
    2. Start time earlier than A's finish time, there is conflict, I can only choose A OR B, however, A ends earlier, it gives a higher chance to pick more shows afterwards, no?

    For coloring problem, however, it is totally another category of problem.

    You are forced to pick ALL intervals, while the answer of the problem is THE MAXIMUM # of CONFLICTED INTERVALS OF ALL TIME.

    Try to think like this: For all time, there are MAXIMUM 5 exams happening at the same time, you have AT LEAST to use 5 classrooms (colors), right?

    So we cannot find this with choosing earliest finish time, the time did not tell you anything.

    It may help you to decide whether you PICK or not PICK this interval (like in scheduling problem), but cannot tell you the MINIMUM # of resources you need. They are just different category of problem.

    EDITED:

    After re-reading OP's question, here's more details as far as I know about the coloring problem.

    Define depth be the maximum # of conflicts at all time. Logically we know depth is the lower bound, but we have to proof it is the upper bound as well (by contradiction).

    Proof

    The proof needs to SORT INTERVALS BY START TIME IN ASCENDING ORDER or FINISH TIME IN DESCENDING ORDER, as shown follow:

    Assume the depth of the interval set is d, and the answer is greater than d. Let x be the first interval we process that is using resources d+1, as the processing order is sort by start time ascending, it means there is at least d intervals that start before x and has conflict with x, then the depth of the set is at least d+1, contradiction. So d = depth is also the upper bound of the answer, it is the optimal answer of interval coloring.

    Note that if you sort by start time descending, or finish time ascending, then you cannot use the same reasoning.

    Concepts / Goal

    Now we know depth is the answer, we have to find it. Concept-wise, it DOES NOT matter if you find by using start time or finish time, ascending or descending, all options can give you the depth of the interval set.

    Implementation Consideration

    However, for implementation, if you have to find it in O(n lg n), you will have to make use of GREEDY METHOD + some DATA STRUCTURES, which probably need the intervals have some kind of ordering. But that's another story, it's for implementation, concept-wise, it does not matter, you only want to find the depth of the interval set.

    TL;DR

    For interval scheduling problem, the greedy method indeed itself is already the optimal strategy; while for interval coloring problem, greedy method only help to proof depth is the answer, and can be used in the implementation to find the depth (but not in the way as shown in @btilly's counter example)