I have a geopandas GeoDataFrame with some attribute columns and a geometry column (just a regular GDF). Usually I save GDFs as GeoPackage files (.gpkg) using:
gdf.to_file('path_to_file.gpkg', driver='GPKG')
This works fine, unless my GDF has a column where the entries are arrays. So say I have two columns next to the geometry column and one of them contains a numpy array for each entry. If I then try to save as a gpkg it gives me the error:
ValueError: Invalid field type <class 'numpy.ndarray'>
So it appears that a gpkg cannot handle arrays in the table. The arrays I want to include are simple flags (so values of 0 and 1). I found two workarounds which work alright but are a bit messy:
Does anybody know of a better workaround to this issue?
I believe this is just a limitation of the .gpkg format. However, I think the best workaround is to store the arrays as strings, like you suggested. If you need them, you can easily convert them back into arrays in the new GDF with ast.literal_eval().
import pandas as pd
import numpy as np
import geopandas as gpd
from shapely.geometry import LineString, Point
from ast import literal_eval
gdf = gpd.GeoDataFrame(
    {'id': [1, 2, 3],
     'array_col': [np.array([0, 1, 2]), np.array([0, 1, 2]), np.array([0, 1, 2])]},
    geometry=[LineString([(1, 1), (4, 4)]),
              LineString([(1, 4), (4, 1)]),
              LineString([(6, 1), (6, 6)])])
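# Store each array as its string form, e.g. '[0 1 2]', so it is written as a text column
gdf['array_col'] = gdf['array_col'].apply(str)
gdf.to_file('path_to_file.gpkg', driver='GPKG')
gpkg = gpd.read_file('path_to_file.gpkg')
# Turn '[0 1 2]' into '[0,1,2]' so literal_eval can parse it, then rebuild the array
gpkg['array_col'] = gpkg['array_col'].apply(lambda x: np.array(literal_eval(x.replace(' ', ','))))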
After this, we can access our np arrays again.
print(gpkg['array_col'][0])
[0 1 2]
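One caveat with the str() round trip: numpy pads numbers of different widths with extra spaces (str(np.array([0, 10])) gives '[ 0 10]') and abbreviates arrays of more than 1000 elements with '...', so replacing spaces with commas only works for short arrays of single-digit values like the 0/1 flags here. As a possible alternative, here is a minimal sketch that serializes through JSON instead. The helper names array_to_text and text_to_array are made up for this example and are not part of geopandas or numpy.

```python
import json

import numpy as np


def array_to_text(arr):
    """Serialize a numpy array to a JSON string (hypothetical helper)."""
    # tolist() converts numpy scalars to plain Python numbers, which json can encode
    return json.dumps(np.asarray(arr).tolist())


def text_to_array(text):
    """Rebuild a numpy array from a JSON string written by array_to_text."""
    return np.array(json.loads(text))


flags = np.array([0, 1, 1, 0])
encoded = array_to_text(flags)    # plain text, safe to store in a string column
decoded = text_to_array(encoded)  # numpy array again
```

With the example above, this would be applied as gdf['array_col'].apply(array_to_text) before to_file and gpkg['array_col'].apply(text_to_array) after read_file.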