I am new to SQL and am using Bigquery. I have a table that has coordinates of longitude and latitude coordinates for bike stations in Chicago. I am trying to match these coordinates to cross streets. The coursera dataset that was provided to me has cross streets for many of the coordinates, but many of them have NULL for cross street names.
I have looked up my question and the closest thing I have found is that you can look to see if the gps point is within a certain geometrical area.
I just want to know if what I am looking to do is possible. If it is a little guidance on the next steps would be great.
This is the query I am starting with:
SELECT start_station_name, ST_GEOGPOINT(start_lng, start_lat) AS point
FROM coursera-practice-356200.Cyclistic_data.Chi_metered
WHERE start_station_name IS NULL
I was able to translate the gps locations as a point, but I am lost as what to do next.
First, you need another table with roads. One public dataset you can use is Open Street Maps, it is available in BigQuery public datasets project as bigquery-public-data.geo_openstreetmap.planet_ways
.
Second, you need to join your table with this table, on some "close-enough" condition - a station being within some number of meters close to the road. This is tricky - if you choose too large threshold, you get a lot of unrelated roads, if you choose too small, you miss the road you want. I chose 50m in the example below. The expression of two geometries being within some distance of each other is ST_DWithin(geo1, geo2, distance)
.
I don't have access to your table, so I used bigquery-public-data.new_york_citibike.citibike_stations
which has citibike data too.
Here I build a list of roads within 50 m of each station:
with stations as (
SELECT name AS station_name, ST_GeogPoint(longitude, latitude) AS loc
FROM `bigquery-public-data.new_york_citibike.citibike_stations`
), roads as (
SELECT geometry,
(select value from w.all_tags where key = 'name') as road_name
FROM `bigquery-public-data.geo_openstreetmap.planet_ways` w
where geometry is not null
and exists(select 1 from w.all_tags where key = 'highway')
)
select
station_name,
array_agg(distinct road_name) as nearby_roads
from stations, roads
where st_dwithin(loc, geometry, 50)
and road_name is not null
group by station_name
The result seems like a good approximation, although there are some unrelated nearby streets as well:
station_name nearby_roads
Delancey St & Eldridge St "[Delancey Street,Eldridge Street]"
Brook Ave & E 157 St "[Brook Avenue,East 157th Street,East 158th Street]"
Ryer Ave & E 182 St "[Ryer Avenue,East 182nd Street]"
Ditmars Blvd & 19 St "[Ditmars Boulevard,19th Street]"
...