I have some working code that computes the area of a city by name:
def get_area_of_city(city_name):
# Fetch the geodataframe for the specified city
city_gdf = ox.geocoder.geocode_to_gdf(city_name, which_result=1)
# Access the geometry (polygon) of the city from the geodataframe
city_polygon = city_gdf['geometry']
# Get the latitude and longitude of the city (centroid of the polygon)
city_latitude, city_longitude = city_polygon.geometry.centroid.y, city_polygon.geometry.centroid.x
# Define the Cylindrical Equal Area (CEA) projection centered at the city
cea_projection = f"+proj=cea +lon_0={city_longitude} +lat_ts={city_latitude} +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs"
# Reproject the polygon to the CEA projection
city_polygon_cea = city_polygon.to_crs(cea_projection)
# Compute the area of the polygon in square meters
area_square_meters = city_polygon_cea.area
# You can also convert the area to square kilometers if needed
area_square_kilometers = area_square_meters / 1000000.0
return area_square_kilometers
Now, I want to adapt this code such that it works with any GeoDataFrame
that contains multiple cities. This code should be able to construct a projection for each city and apply it to the polygon to get the area. How can I do this? I currently have the following code:
def get_area_of_geodataframe(gdf):
# Get a copy of the original GeoDataFrame
gdf_copy = gdf.copy()
# Get the latitude and longitude of the centroid of all geometries in the GeoDataFrame
gdf_copy['city_latitude'] = gdf_copy.geometry.centroid.y
gdf_copy['city_longitude'] = gdf_copy.geometry.centroid.x
# Define the Cylindrical Equal Area (CEA) projection for each geometry
gdf_copy['cea_projection'] = gdf_copy.apply(lambda row: f"+proj=cea +lon_0={row['city_longitude']} +lat_ts={row['city_latitude']} +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs", axis=1)
# Reproject each geometry to the CEA projection
gdf_copy['city_polygon_cea'] = gdf_copy.apply(lambda row: row.to_crs(row['cea_projection']), axis=1)
# Compute the area of each geometry in square meters
gdf_copy['area_square_meters'] = gdf_copy['city_polygon_cea'].area
# Convert the area to square kilometers
gdf_copy['area_square_kilometers'] = gdf_copy['area_square_meters'] / 1000000.0
# Drop the intermediate columns and return the modified GeoDataFrame
gdf_copy = gdf_copy.drop(columns=['city_latitude', 'city_longitude', 'cea_projection', 'city_polygon_cea'])
return gdf_copy
However, the error I revieve is AttributeError: 'Series' object has no attribute 'to_crs'
when I call row.to_crs()
.
How do I have to adapt my code?
I tried to use the above code, but I get an error.
Surprisingly or not, when you take a single row of a geopandas.GeoDataFrame
, it ends up being a pandas.Series
, thus it does not have the method to_crs()
.
type(gdf_copy.iloc[0])
# returns <class 'pandas.core.series.Series'>
There seems to be a debate here on whether or not getting back a pandas.Series
should be the expected behavior. This answer provides an interesting indexing trick (notice the double square-brackets):
type(gdf_copy.iloc[[0]])
# returns <class 'geopandas.geodataframe.GeoDataFrame'>
I don't think we can make use of that trick here though, as apply()
let us work with what is already a pandas.Series
.
Should you really want to go that way, you could probably accomplish that by working directly with shapely and pyproj.
Had it worked the way you were trying to, you would have ended up with shapely objects with different CRS in a single geometry column, which geopandas
does not support. You can have different geometry columns with different CRS though, despite the fact that only one is considered as the main geometry of the GeoDataFrame.
Why not reuse your nice and working get_area_of_city(city_name)
function?
df = pd.DataFrame({'city': ['Lyon', 'Paris', 'Marseille']})
df['area_square_kilometers'] = df.apply(lambda row: get_area_of_city(row['city']), axis=1)
# city area_square_kilometers
# 0 Lyon 47.985400
# 1 Paris 105.390441
# 2 Marseille 242.129201
Hope this helps!