I'm trying to calculate the area of the geometries in my GeoDataFrame in square meters with geopandas.area, but the calculated areas are unreasonably small, on the order of e-6.
I've provided sample data and my code below. My site is in San Francisco, so I set the CRS to EPSG:3857. I then tried converting the CRS to the matching UTM zone, following this answer to a similar problem: https://gis.stackexchange.com/questions/429601/why-are-my-area-calculations-in-python-so-small-with-area, but the areas still come out tiny.
Is there another way to find the correct CRS for area calculation?
import pandas as pd
import geopandas as gpd
import utm #pip install utm
from pyproj import CRS
from shapely.geometry import Polygon
def footprint_to_polygon(footprint_str):
    # Each point is "lat,lon"; reverse it to (lon, lat) so x=lon, y=lat for shapely
    points = [tuple(list(map(float, point.split(',')))[::-1]) for point in footprint_str.split()]
    return Polygon(points)

def findtheutm(aGeometry):
    """Find the EPSG code of the UTM zone containing a geometry's centroid."""
    # utm.from_latlon returns (easting, northing, zone number, latitude band letter)
    x, y, parallell, latband = utm.from_latlon(aGeometry.centroid.y, aGeometry.centroid.x)
    if latband in 'CDEFGHJKLM': #https://www.lantmateriet.se/contentassets/379fe00e09d74fa68550f4154350b047/utm-zoner.gif
        ns = 'S'
    else:
        ns = 'N'
    crs = "+proj=utm +zone={0} +{1}".format(parallell, ns) #https://gis.stackexchange.com/questions/365584/convert-utm-zone-into-epsg-code
    crs = CRS.from_string(crs)
    _, code = crs.to_authority()
    return int(code)

data = {
    'FootprintPointsStr': [
        "37.777289,-122.403477 37.777351,-122.403556 37.777433,-122.403671 37.777382,-122.403745",
        "37.776745,-122.40807 37.776437,-122.408476 37.776313,-122.408355 37.776636,-122.40806",
        "37.777837,-122.407172 37.777643,-122.406931 37.777532,-122.407089 37.777748,-122.407287",
        "37.776093,-122.408003 37.77624,-122.407812 37.776171,-122.407729 37.776017,-122.407935",
        "37.774312,-122.412135 37.77462,-122.41251 37.774707,-122.41242 37.774378,-122.41205"
    ]
}
# Create a DataFrame
df = pd.DataFrame(data)
df['geometry'] = df['FootprintPointsStr'].apply(footprint_to_polygon)
gdf = gpd.GeoDataFrame(df, geometry='geometry')
gdf.set_crs(epsg=3857, inplace=True)
gdf['area'] = gdf['geometry'].area
gdf['area2'] = gdf.to_crs(epsg=findtheutm(gdf.geometry.iloc[0])).area
gdf
"My site is in San Francisco, so I set the crs to be 3857."
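Well, that assumption is the whole problem. Your coordinates are latitude/longitude in degrees (i.e., EPSG:4326), not projected coordinates in meters (i.e., EPSG:3857). set_crs only labels the data and does not transform it, so declaring EPSG:3857 makes geopandas treat the degree values as if they were meters. Declare the data as EPSG:4326 first, then reproject to a UTM zone before taking the area: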
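To see why the numbers come out so small, here is a rough standard-library sketch (not part of the original code) that computes the planar area of the first footprint directly on its lon/lat values and then scales it to square meters. The helper name shoelace_area and the spherical Earth radius are assumptions made for illustration, and the degree-to-meter scaling is only a local approximation, not a substitute for reprojecting.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # assumed mean Earth radius (spherical approximation)

def shoelace_area(points):
    """Planar area of a simple polygon given as (x, y) vertices (hypothetical helper)."""
    # Shift to the first vertex to limit floating-point cancellation
    x0, y0 = points[0]
    shifted = [(x - x0, y - y0) for x, y in points]
    total = 0.0
    for (x1, y1), (x2, y2) in zip(shifted, shifted[1:] + shifted[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# First footprint from the question, as (lon, lat) in degrees
footprint = [
    (-122.403477, 37.777289),
    (-122.403556, 37.777351),
    (-122.403671, 37.777433),
    (-122.403745, 37.777382),
]

# Area computed on raw degrees: this is in "square degrees", not square meters
area_deg2 = shoelace_area(footprint)

# Rough local scaling: meters per degree of latitude and of longitude
mean_lat = sum(lat for _, lat in footprint) / len(footprint)
m_per_deg_lat = math.pi * EARTH_RADIUS_M / 180.0
m_per_deg_lon = m_per_deg_lat * math.cos(math.radians(mean_lat))
approx_area_m2 = area_deg2 * m_per_deg_lat * m_per_deg_lon

print(f"{area_deg2:.3e} square degrees")
print(f"{approx_area_m2:.1f} square meters (rough estimate)")
```

The raw value is a tiny fraction of a square degree, while the scaled value is on the order of a hundred square meters, which is a plausible building footprint. In practice, reprojecting with to_crs is the more reliable route.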
gdf = gpd.GeoDataFrame(df, crs="EPSG:4326")
gdf["area (utm)"] = gdf.to_crs(epsg=findtheutm(gdf.geometry.iloc[0])).area
gdf["area (gpd)"] = gdf.to_crs(gdf.estimate_utm_crs()).area
NB: geopandas has a useful estimate_utm_crs method, which defaults to the WGS 84 datum.
Output (gdf):
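For intuition about which CRS both approaches end up with here, the standard 6-degree UTM zone rule can be sketched in a few lines. This is an illustration only and not how geopandas implements estimate_utm_crs; the function name utm_epsg_from_lonlat is hypothetical, it assumes the WGS 84 datum (EPSG:326xx north, EPSG:327xx south), and it ignores the special zone exceptions around Norway and Svalbard.

```python
import math

def utm_epsg_from_lonlat(lon, lat):
    """Return the WGS 84 / UTM EPSG code for a lon/lat point (hypothetical helper)."""
    # Zones are 6 degrees wide, numbered 1..60 starting at longitude -180
    zone = min(int(math.floor((lon + 180.0) / 6.0)) + 1, 60)
    # EPSG:326xx covers the northern hemisphere, EPSG:327xx the southern
    base = 32600 if lat >= 0 else 32700
    return base + zone

# Approximate location of the footprints in the question (San Francisco)
print(utm_epsg_from_lonlat(-122.4035, 37.7773))  # 32610, i.e. WGS 84 / UTM zone 10N
```

For anything near zone boundaries or the polar regions, relying on estimate_utm_crs is the safer choice.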
FootprintPointsStr geometry area (utm) area (gpd)
0 37.777289,-122.403477 37.7... POLYGON ((-122.40348 37.77... 103.581864 103.581864
1 37.776745,-122.40807 37.77... POLYGON ((-122.40807 37.77... 600.906279 600.906279
2 37.777837,-122.407172 37.7... POLYGON ((-122.40717 37.77... 487.886274 487.886274
3 37.776093,-122.408003 37.7... POLYGON ((-122.408 37.7760... 251.645312 251.645312
4 37.774312,-122.412135 37.7... POLYGON ((-122.41214 37.77... 550.760413 550.760413