duplicates gtfs

How to properly get stop times for the current day for a bus stop using stop_times.txt


In GTFS, I'm trying to use stop_times.txt to find the next buses due at a stop, given the current time and day. However, stop_times.txt has no field that indicates which day a given stop_time applies to:

trip_id,arrival_time,departure_time,stop_id,stop_sequence,stop_headsign,pickup_type,drop_off_type,shape_dist_traveled

A single bus stop has so many closely spaced stop_times for what looks like one bus arrival that it doesn't make sense. The only logical conclusion is that each one is meant for a certain day of the week, but nothing in the file indicates that.

As a result, I get duplicate or extremely close stop times for each bus that arrives around the current time.


Solution

  • Found out how to filter out stop times not meant for the current day.

    Each stop has one-to-many stop times. Each stop time is associated with a trip. Each trip has a service id attached. Each service id normally has a calendar row (calendar.txt) and zero-to-many calendar_dates rows (calendar_dates.txt).
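
    As a rough illustration of that chain, the sketch below joins stop times to their service ids through trips. The sample rows and the helper names (read_rows, stop_times_with_service) are made up for the example; only the column names come from GTFS.

```python
import csv
import io

# Tiny inline stand-ins for trips.txt and stop_times.txt (made-up sample rows).
TRIPS_TXT = """route_id,service_id,trip_id
10,WEEKDAY,t1
10,SATURDAY,t2
"""

STOP_TIMES_TXT = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
t1,08:00:00,08:00:00,S1,1
t2,08:02:00,08:02:00,S1,1
t1,08:10:00,08:10:00,S2,2
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def stop_times_with_service(stop_id, stop_times, trips):
    """Return (arrival_time, trip_id, service_id) for each stop time at stop_id."""
    service_by_trip = {t["trip_id"]: t["service_id"] for t in trips}
    return [
        (st["arrival_time"], st["trip_id"], service_by_trip[st["trip_id"]])
        for st in stop_times
        if st["stop_id"] == stop_id
    ]


rows = stop_times_with_service("S1", read_rows(STOP_TIMES_TXT), read_rows(TRIPS_TXT))
for row in rows:
    print(row)
```

    The two arrivals at S1 are two minutes apart but belong to different services, which is the "near duplicate" effect described in the question.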

    I had to check whether the service behind each stop time runs on the current day of the week according to its calendar row, and also check the next day for times past the 24-hour mark (GTFS times such as 25:10:00 belong to trips that continue past midnight).

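    In addition, if the service doesn't run according to its calendar row, a further check can be made against calendar_dates, which lists exceptions for specific dates. If either check passes, the stop time is included as upcoming. Note that calendar_dates can also remove service on a date (exception_type 2), in which case the stop time should be excluded even if the calendar row matches.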
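
    A minimal sketch of that check, assuming calendar.txt and calendar_dates.txt have already been loaded into dictionaries (the function name service_runs_on, the dictionary layout, and the sample data are all hypothetical). It looks at calendar_dates first so that a removal (exception_type 2) overrides the weekly pattern, and it also respects the start_date/end_date range from calendar.txt, which the description above does not mention.

```python
from datetime import date

# Column names in calendar.txt, ordered to match date.weekday() (Monday == 0).
WEEKDAY_COLUMNS = ["monday", "tuesday", "wednesday", "thursday",
                   "friday", "saturday", "sunday"]


def service_runs_on(service_id, day, calendar, calendar_dates):
    """Return True if service_id operates on `day` (a datetime.date).

    calendar:       dict of service_id -> calendar.txt row (dict of strings)
    calendar_dates: dict of (service_id, "YYYYMMDD") -> exception_type string
    """
    ymd = day.strftime("%Y%m%d")
    exception = calendar_dates.get((service_id, ymd))
    if exception == "1":  # service added for this date
        return True
    if exception == "2":  # service removed for this date
        return False
    row = calendar.get(service_id)
    if row is None:  # service defined only through calendar_dates
        return False
    # YYYYMMDD strings compare correctly as plain strings.
    in_range = row["start_date"] <= ymd <= row["end_date"]
    return in_range and row[WEEKDAY_COLUMNS[day.weekday()]] == "1"


# Made-up sample data: a Monday-to-Friday service with two exceptions.
calendar = {
    "WEEKDAY": {
        "monday": "1", "tuesday": "1", "wednesday": "1", "thursday": "1",
        "friday": "1", "saturday": "0", "sunday": "0",
        "start_date": "20240101", "end_date": "20241231",
    }
}
calendar_dates = {
    ("WEEKDAY", "20240704"): "2",  # a Thursday with service removed
    ("WEEKDAY", "20240706"): "1",  # a Saturday with service added
}

print(service_runs_on("WEEKDAY", date(2024, 7, 3), calendar, calendar_dates))
```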

    The list is then sorted and rearranged to start near the current time of day.
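
    One possible way to do the sort and rotation, as a sketch (to_seconds and rotate_to_now are hypothetical helper names). Times are compared as seconds rather than as strings because GTFS allows hours above 23 and single-digit hours such as 8:00:00.

```python
from bisect import bisect_left


def to_seconds(gtfs_time):
    """Convert 'HH:MM:SS' (hours may exceed 23) to seconds since the service day began."""
    hours, minutes, seconds = (int(part) for part in gtfs_time.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def rotate_to_now(times, now):
    """Sort GTFS time strings, then rotate so the list starts at the first time >= now."""
    ordered = sorted(times, key=to_seconds)
    keys = [to_seconds(t) for t in ordered]
    start = bisect_left(keys, to_seconds(now))
    return ordered[start:] + ordered[:start]


arrivals = ["08:00:00", "25:10:00", "23:50:00", "12:30:00"]
print(rotate_to_now(arrivals, "12:00:00"))
# → ['12:30:00', '23:50:00', '25:10:00', '08:00:00']
```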

    Using real-time position data, we then associate the trip ids in our new arrivals list with vehicle ids (if they're not null).

    Additional checks are still needed for real-time updates on certain trips and for service alerts, but that's the gist of it.

    TL;DR: Used calendar and calendar_dates to check whether each stop_time's service runs on the current and next day, which gives a properly filtered list. The list is then sorted to start at the time closest to the current time.