Given a geographical coordinate in U.S., how to find out if it is in Urban or Rural areas?
I have about 10000 geographical coordinates all in the U.S., and I want to use Python + basemap to find out if a point is urban or rural.
I'm not sure which library or shape file to use.
I'll need a function like this:
def is_urban(coordinate):
# use the shapefile
urban = False
return urban
while Ash's comment does get the job done, iterating over the shapes individually is slow, in O(N) time
you can use geopandas to put the shapes into GeoDataFrames, then use sjoin to do a spatial join on the data. geopandas uses trees for this function, so it executes in roughly O(log(n)) time.
instead of several seconds per query, now it's several milliseconds per query, on my machine.(7+ s VS 3 ms)
here's my code:
import shapefile
import geopandas as gpd
shp = shapefile.Reader('tl_rd22_us_uac20.shp') #open the shapefile
all_shapes = shp.shapes() # get all the polygons
all_records = shp.records()
polygons_gpd = gpd.GeoDataFrame(geometry=all_shapes) #convert the polygons into a geodataframe
class Address():
longitude = None
latitude = None
def __init__(self, long, lat):
self.longitude = long
self.latitude = lat
def is_urban(addr):
result = False
points_gpd = gpd.GeoDataFrame(geometry=gpd.points_from_xy([addr.longitude], [addr.latitude])) # point coordinates to geopandas dataframe
pt2poly = gpd.sjoin(points_gpd,polygons_gpd, predicate='within').index_right # for each point index in the points, it stores the polygon index containing that point
if len(pt2poly) >= 1: # if any indicies are found, it's urban
result = True
print('urban')
else:
print('rural')
return result
pt = Address(-87.6298, 41.8781) # Chicago, Il
is_urban(pt)
# prints 'urban'
in addition, the link Ash provided is no longer valid. As of December 2023, you can get the shapefiles from https://www2.census.gov/geo/tiger/TIGER_RD18/LAYER/UAC20/