I've a Snowflake table with a GEOGRAPHY
datatype (polygons), which I try to convert to e GeoPandas DataFrame. The Snowpark Dataframe holds the SHAPE
column as (GeographyType
)
Calling
gpd.GeoDataFrame(session.table(table_name).to_pandas(),geometry="SHAPE",crs=4326)
results in an exception
TypeError: Input must be valid geometry objects: {
"coordinates": [
[
[
[
6.706410552876440e+00,
4.658134676959030e+01,
8.919489999999932e+02
],
I came up with an ugly workaround based on Convert pandas dataframe with a geojson-like column to Geopandas dataframe
from ast import literal_eval
from shapely.geometry import shape
df_pd = df_sp.to_pandas()
geom = df_pd.pop("SHAPE").apply(lambda x: shape(literal_eval(x)))
gpd.GeoDataFrame(df_pd, geometry=geom,crs=4326)
Is there a better option?
The idea is to setup GEOGRAPHY_OUTPUT_FORMAT and use shapely.wkt.loads
:
For sample data:
CREATE OR REPLACE TABLE geospatial_table (id INTEGER, SHAPE GEOGRAPHY);
INSERT INTO geospatial_table VALUES
(1, 'POINT(-122.35 37.55)'),
(2, 'LINESTRING(-124.20 42.00, -120.01 41.99)');
SELECT * FROM geospatial_table;
Code:
import geopandas
import shapely
from snowflake.snowpark.context import get_active_session
session = get_active_session()
session.sql("ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='WKT';").collect()
pd = session.table('geospatial_table').to_pandas()
pd['SHAPE'] = pd['SHAPE'].apply(shapely.wkt.loads)
geopandas.GeoDataFrame(pd,geometry="SHAPE",crs=4326)
Output:
EDIT:
This works unless you have Z-Coordinates, in this case we get MULTIPOLYGONZ from Snowflake which cannot be parsed by shapely.wkt.loads
In that case a WKB may be a better choice:
CREATE OR REPLACE TABLE geospatial_table (id INTEGER, SHAPE GEOGRAPHY);
INSERT INTO geospatial_table VALUES
(1, 'POINTZ(-122.35 37.55 0)'),
(2, 'LINESTRINGZ(-124.20 42.00 0.5, -120.01 41.99 0.5)');
SELECT * FROM geospatial_table;
import geopandas
import shapely
from snowflake.snowpark.context import get_active_session
session = get_active_session()
session.sql("ALTER SESSION SET GEOGRAPHY_OUTPUT_FORMAT='WKB';").collect()
pd = session.table('geospatial_table').to_pandas()
pd['SHAPE'] = pd['SHAPE'].apply(shapely.wkb.loads)
geopandas.GeoDataFrame(pd,geometry="SHAPE",crs=4326)
Output: