I have created a GeoPandas dataframe with 50 million records containing Latitude and Longitude in CRS 3857, and I want to convert them to 4326. Because the dataset is so large, GeoPandas is unable to perform the conversion. How can I execute this in a distributed manner?
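import geopandas as gpd
from shapely.geometry import Point

def convert_crs(sdf):
    # Collect the Spark dataframe onto the driver as a pandas dataframe
    df = sdf.toPandas()
    gdf = gpd.GeoDataFrame(
        df.drop(['Longitude', 'Latitude'], axis=1),
        crs={'init': 'epsg:4326'},
        geometry=[Point(xy) for xy in zip(df.Longitude, df.Latitude)])
    return gdf

result_gdf = convert_crs(grid_df)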
See: https://github.com/geopandas/geopandas/issues/1400
Transforming the coordinate arrays directly with pyproj, without building any geometry objects, is very fast and memory efficient:
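from pyproj import Transformer

# This transforms EPSG:4326 -> EPSG:3857; swap the two CRS arguments
# for the reverse direction. always_xy=True means (lon, lat) order.
trans = Transformer.from_crs(
    "EPSG:4326",
    "EPSG:3857",
    always_xy=True,
)
xx, yy = trans.transform(df["Longitude"].values, df["Latitude"].values)
df["X"] = xx
df["Y"] = yy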
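For the distributed part of the question, it may help to see that the conversion between EPSG:3857 and EPSG:4326 is a closed-form, per-row calculation. The sketch below writes out the spherical Web Mercator formulas using only the standard library. The function names and the `EARTH_RADIUS_M` constant name are illustrative and not part of pyproj or GeoPandas, and the results should be checked against pyproj before relying on them.

```python
import math

# Sphere radius used by EPSG:3857 (the WGS 84 semi-major axis, in metres).
EARTH_RADIUS_M = 6378137.0


def web_mercator_to_lon_lat(x, y):
    """Convert one EPSG:3857 point (metres) to EPSG:4326 (degrees)."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(
        2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0
    )
    return lon, lat


def lon_lat_to_web_mercator(lon, lat):
    """Convert one EPSG:4326 point (degrees) to EPSG:3857 (metres)."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4.0 + math.radians(lat) / 2.0)
    )
    return x, y


# Round trip for a sample point (longitude 10, latitude 50).
x, y = lon_lat_to_web_mercator(10.0, 50.0)
lon, lat = web_mercator_to_lon_lat(x, y)
```

Because each row is converted independently, the same arithmetic can in principle be written as Spark column expressions (`pyspark.sql.functions` provides `atan`, `exp`, and `degrees`), so the data would not need to be collected with `toPandas()` first.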