c++ algorithm data-structures sliding-window

Why does the sliding window algorithm not work for this problem statement?


This question is about an algorithm I'm using to solve a problem. The problem statement is (or view it here):

"Given arrival and departure times of all trains that reach a railway station. Find the minimum number of platforms required for the railway station so that no train is kept waiting. Consider that all the trains arrive on the same day and leave on the same day. Arrival and departure time can never be the same for a train but we can have arrival time of one train equal to departure time of the other. At any given instance of time, same platform can not be used for both departure of a train and arrival of another train. In such cases, we need different platforms."

Example:

Input: n = 6, arr[] = {0900, 0940, 0950, 1100, 1500, 1800}, 
            dep[] = {0910, 1200, 1120, 1130, 1900, 2000}
Output: 3
Explanation: There are three trains during the time 0940 to 1200. So we need minimum 3 platforms.

I'm trying to use the following algorithm: I create an array holding each train's arrival and departure time (so for the above test case, something like: [[0900, 0910], [0940, 1200], ...]). Then I sort it by arrival time, and if two trains have the same arrival time, they're sorted by departure time (ascending order for both).

I'm basically trying to find the maximum number of intervals that intersect. Consider the first interval: it has an arrival time of "a" and a departure time of "b". For any interval that follows, if its arrival is before "b", then I need an extra platform for that train (since it intersects with the first interval). I keep doing this and finding more intervals until I reach an interval whose arrival time is greater than "b". At that point the first interval is no longer relevant, since that train will have left, so I increment my first pointer and the process repeats. At any point of the iteration, all the intervals between my "i" and "j" pointers should intersect. Finally, I return the maximum number of intervals that intersected as the answer.

This algorithm works for the initial few test cases, but for some reason it fails for certain other testcases.

The code below is what I've written for this algorithm. You can try to submit it on the above link to get the incorrect testcase (it's quite large, so I don't think I should paste it here. But if you want, I can paste the invalid testcase in the comments).

C++ Code

class Solution{
    public:
    //Function to find the minimum number of platforms required at the
    //railway station such that no train waits.
    static bool comp(pair<int,int>&p1, pair<int,int>&p2) {
        if (p1.first == p2.first) return p1.second < p2.second;
        return p1.first < p2.first;
    }
    int findPlatform(int arr[], int dep[], int n)
    {
        int i = 0, j = 0;
        int mxi = 0;
        vector<pair<int,int>>mp;
        for (int k = 0; k<n; k++) {
            mp.push_back({arr[k], dep[k]});
        }
        sort(mp.begin(), mp.end());
        
        
        while (j < n) {
            while (mp[j].first > mp[i].second) {
                i++;
            }
            mxi = max(mxi, j-i+1);
            j++;
            
        }
        return mxi;
    }
};

I'm not asking for the correct algorithm (I've already read and understood it). I just want to know why this one is incorrect and for what small test case it would fail, or whether I'm missing some other case.

PS. I'm not sure whether asking about a specific problem is appropriate on Stack Overflow, but I couldn't find any other relevant website/forum where I could get a quick, reliable answer.

Thanks!


Solution

  • It is incorrect to only consider the interval of a single train at a time. Consider the following 3 trains:

    arrivals   = [  0, 100, 300]
    departures = [500, 200, 400]
    

    The time window during which the first train is at a platform fully covers the time window of the remaining two trains, but the second train leaves before the third train arrives, so its platform may be reused. Hence, only 2 platforms are needed in total, but your code outputs 3.

    The standard solution sorts the arrival and departure times and then tracks the number of trains currently at a platform. Since the problem states that 0 <= arr[i] <= dep[i] <= 2359, this can actually be solved in linear time with a method similar to counting sort: just keep track of the number of trains that arrive and depart at each time.

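    #include <algorithm>
    #include <unordered_map>

    int findPlatform(int arr[], int dep[], int n) {
        const int MAX_TIME = 2359;
        // Net change in the number of occupied platforms at each time.
        std::unordered_map<int, int> incomingTrains;
        for (int i = 0; i < n; ++i) {
            ++incomingTrains[arr[i]];
            // The platform is still in use at dep[i], so it frees up one tick later.
            --incomingTrains[dep[i] + 1];
        }
        int minStations = 0, currTrains = 0;
        for (int i = 0; i <= MAX_TIME; ++i) {
            currTrains += incomingTrains[i];  // running count of trains present
            minStations = std::max(minStations, currTrains);
        }
        return minStations;
    }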
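
To reproduce the counterexample above, here is a minimal sketch that puts the question's sliding-window logic next to the sort-based approach. The function names `slidingWindowPlatforms` and `sweepPlatforms` and the use of `std::vector` are my own choices for illustration, not part of the original problem's interface. The sweep assumes, as the problem states, that every train departs strictly after it arrives.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// The question's sliding-window logic, restated on vectors (hypothetical name).
int slidingWindowPlatforms(const std::vector<int>& arr, const std::vector<int>& dep) {
    std::vector<std::pair<int, int>> trains;
    for (std::size_t k = 0; k < arr.size(); ++k) trains.push_back({arr[k], dep[k]});
    std::sort(trains.begin(), trains.end());
    int best = 0;
    std::size_t i = 0;
    for (std::size_t j = 0; j < trains.size(); ++j) {
        // Only the departure of the window's first train is ever checked,
        // so trains inside the window that already left are still counted.
        while (trains[j].first > trains[i].second) ++i;
        best = std::max(best, static_cast<int>(j - i + 1));
    }
    return best;
}

// Sort-based sweep (hypothetical name): arrivals and departures are sorted
// independently and merged while tracking how many trains are present.
int sweepPlatforms(std::vector<int> arr, std::vector<int> dep) {
    std::sort(arr.begin(), arr.end());
    std::sort(dep.begin(), dep.end());
    int current = 0, best = 0;
    std::size_t a = 0, d = 0;
    while (a < arr.size()) {
        // An arrival at the same time as a departure needs its own platform,
        // so ties are treated as arrivals first (<=).
        if (arr[a] <= dep[d]) {
            ++current;
            ++a;
            best = std::max(best, current);
        } else {
            --current;
            ++d;
        }
    }
    return best;
}
```

For the three trains above, `slidingWindowPlatforms({0, 100, 300}, {500, 200, 400})` returns 3 while `sweepPlatforms` returns 2. Both return 3 on the example from the question, which is why the small test cases pass.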