I am dealing with a tricky GTFS feed from the Belgian public transport operator De Lijn. The feed encodes belbus services (demand-responsive buses) as regular bus routes that run every hour, which makes some poorly served rural areas misleadingly appear to be highly accessible, with excellent public transport connections.
In routes.txt, they are listed like this:
route_id | agency_id | route_short_name | route_long_name | route_desc | route_type | route_url | route_color | route_text_color |
---|---|---|---|---|---|---|---|---|
61135 | 1 | 460 | Belbus Vlaamse Ardennen | Belbus Vlaamse Ardennen/Belbus Vlaamse Ardennen | 3 | | FFFFFF | 000099 |
I would like to know how I can filter out any routes with "Belbus" in their route_desc or route_long_name.
At first I tried to find them in Excel, delete them, and save the result back to routes.txt, but of course that didn't work when I calculated stop-level frequency in ArcGIS. I suppose it only looks at stop_times.txt and does not check whether the corresponding rows in routes.txt are missing.
I also tried filtering by route_type with gtfstools, but the belbus routes share route_type 3 with regular buses, so that either removes all buses or none of them.
{gtfstools} maintainer here.
What I'd do:
library(gtfstools)
path <- "path_to_gtfs.zip"
gtfs <- read_gtfs(path)
# select route ids whose route_long_name includes "Belbus"
selected_routes <- gtfs$routes[grepl("Belbus", route_long_name)]$route_id
# filter them out of the gtfs object
filtered_gtfs <- filter_by_route_id(gtfs, selected_routes, keep = FALSE)
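Since the question also mentions route_desc, here is a rough sketch of how the same idea could be extended to check both columns and then save the result for use in ArcGIS. It assumes that both route_long_name and route_desc exist in routes.txt (as in the excerpt above), and the output path is just a placeholder. As far as I know, filter_by_route_id() also drops the matching rows from the related tables such as trips and stop_times, which is what the manual edit in Excel was missing.

```r
library(gtfstools)

path <- "path_to_gtfs.zip"  # placeholder, same as above
gtfs <- read_gtfs(path)

# flag routes with "belbus" in either column, ignoring case
# (assumes both columns are present in routes.txt)
is_belbus <- grepl("belbus", gtfs$routes$route_long_name, ignore.case = TRUE) |
  grepl("belbus", gtfs$routes$route_desc, ignore.case = TRUE)
belbus_routes <- gtfs$routes$route_id[is_belbus]

# drop those routes from the gtfs object
filtered_gtfs <- filter_by_route_id(gtfs, belbus_routes, keep = FALSE)

# write the filtered feed to a new zip file (placeholder path)
write_gtfs(filtered_gtfs, "path_to_filtered_gtfs.zip")
```

Before writing the file it is probably worth checking that belbus_routes contains only the routes you expect to remove.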