python geopandas shapefile

Combine different shapefile feature classes based on their names and geometry


I have a shapefile containing various feature classes, and I am able to read it.

I improved my code thanks to the suggestions of @Pieter:

import geopandas as gpd
from shapely import box
from shapely.geometry.polygon import Polygon
from shapely.geometry.multipolygon import MultiPolygon

shapefile = "LAFIS.shp"
vec_data = gpd.read_file(shapefile)

vec_data.head()
vec_data['DESCR_ENG'].unique()
geometry_arr = (vec_data['geometry'])
descr_name_arr = (vec_data['DESCR_ENG'])
d = {
    "BW_name": descr_name_arr,
    "geometry": geometry_arr,
}
gdf = gpd.GeoDataFrame(d, crs="EPSG:25832")
gdf

test_vector_data = gpd.GeoDataFrame(
    data={"DESCR_ENG": [
        "Cultivated Grassland,Alpe (without tare)",
        "Cultivated Grassland,Biennial cut meadow (tare 20%)/S28-1 Dry meadows and low bog meadows",
        "Cultivated Grassland,Biennial cut meadow (tare 20%)/S28-2 Species-rich mountain meadows",
        "Cultivated Grassland,Biennial cut meadow/S28-2 Species-rich mountain meadows",
        "Cultivated Grassland,Lawn special area (tare 20%)",
        "Cultivated Grassland,Lawn special area (tare 50%)",
        "Cultivated Grassland,Meadow (Half Sheared Tara 20%)",
        "Cultivated Grassland,Meadow (half-sheared)",
        "Cultivated Grassland,Meadow (half-sheared)/S28-1 poor meadows and fen meadows",
        "Cultivated Grassland,Mixed Alternate Meadow",
        "Cultivated Grassland,Pasture",
        "Cultivated Grassland,Pasture (rock 20%)",
        "Cultivated Grassland,Potential pasture (tare 20%)",
        "Fallow,Arable land fallow - EFA",
        "Forest Trees / SRF,Alpe (stocked 20%)",
        "Forest Trees / SRF,Alpe (stocked 50%)",
        "Forest Trees / SRF,Alpeggio (without tares)/S28-6 Wooded pastures",
        "Forest Trees / SRF,Biennial cut meadow (tare 20%)/S28-4 Meadows rich in tree species",
        "Forest Trees / SRF,Biennial cut meadow/S28-4 Species-rich meadows with trees",
        "Forest Trees / SRF,Biennial cut meadow/S28-5 Lush meadows with trees",
        "Forest Trees / SRF,Meadow special area (tare 20%)/S28-4 Meadows rich in wooded species",
        "Forest Trees / SRF,Meadow special area (tare 20%)/S28-5 Lush meadows with trees",
        "Forest Trees / SRF,Meadow special area (tare 50%)/S28-4 Meadows rich in wooded species",
        "Forest Trees / SRF,Meadow special area (tare 50%)/S28-5 Lush meadows with trees",
        "Forest Trees / SRF,Meadow special area/S28-4 Species-rich meadows with trees",
        "Forest Trees / SRF,Meadow special area/S28-5 Lush meadows with trees",
        "Forest Trees / SRF,Pasture (rock 20%)/S28-6 Wooded pastures",
        "Forest Trees / SRF,Pasture (rock 50%)/S28-6 Wooded pastures",
        "Forest Trees / SRF,Pasture (tare 20%)/S28-6 Wooded pastures",
        "Forest Trees / SRF,Pasture (tare 50%)/S28-6 Wooded pastures",
        "Forest Trees / SRF,Pasture (trees 20%)/S28-6 Wooded pastures",
        "Forest Trees / SRF,Pasture/S28-6 Wooded pastures",
        "Forest Trees / SRF,Stable meadow (tare 20%)/S28-4 Meadows rich in wooded species",
        "Forest Trees / SRF,Stable meadow (tare 20%)/S28-5 Lush meadows with trees",
        "Forest Trees / SRF,Stable meadow/S28-4 Species-rich meadows with trees",
        "Forest Trees / SRF,Stable meadow/S28-5 Lush meadows with trees",
        "Forest Trees / SRF,Willow (Tara 50%)",
        "Forest Trees / SRF,Willow (Tare 20%)",
        "Legumes,Alfalfa",
        "Legumes,Clover",
        "Maize,Corn",
        "Miscellaneous,Industrial Medicinal Plants",
        "Miscellaneous,Plant cultivation",
        "No Agriculture,Bosco/S28-8 Peat bogs and alders",
        "No Agriculture,Forest",
        "No Agriculture,Greenhouses",
        "No Agriculture,Hedges",
        "No Agriculture,Hedges/S28-9 Hedges",
        "No Agriculture,Infrastructures",
        "No Agriculture,Other Areas",
        "No Agriculture,Other crops/S28-3 Reedbeds",
        "No Agriculture,Other crops/S28-8 Peat and alder bogs",
        "No Agriculture,Water",
        "Orchards and Berries,Apple",
        "Orchards and Berries,Apricot",
        "Orchards and Berries,Astoni plants fruit",
        "Orchards and Berries,Berry fruit (without strawberry)",
        "Orchards and Berries,Biennial cut meadow (tare 20%)/S28-7 Chestnut groves and meadows with sparse fruit trees",
        "Orchards and Berries,Castagneto/S28-7 Chestnut groves and meadows with sparse fruit trees",
        "Orchards and Berries,Cherry",
        "Orchards and Berries,Chestnut",
        "Orchards and Berries,Currants",
        "Orchards and Berries,Meadow special area (tare 20%)/S28-7 Chestnut groves and meadows with sparse fruit trees",
        "Orchards and Berries,Meadow special area/S28-7 Chestnut groves and meadows with sparse fruit trees",
        "Orchards and Berries,Olive",
        "Orchards and Berries,Orchard being planted",
        "Orchards and Berries,Other fruit",
        "Orchards and Berries,Pear",
        "Orchards and Berries,Plums",
        "Orchards and Berries,Stable meadow (tare 20%)/S28-7 Chestnut groves and meadows with sparse fruit trees",
        "Orchards and Berries,Stable meadow/S28-7 Chestnut groves and meadows with sparse fruit trees",
        "Orchards and Berries,Strawberry",
        "Orchards and Berries,Table grapes",
        "Orchards and Berries,Vineyard under planting",
        "Orchards and Berries,Viticulture",
        "Other Cereals,Grain",
        "Permanent Grassland,Alpe (tare 70%)",
        "Permanent Grassland,Meadow (Permanent Meadow Tara 20%)",
        "Permanent Grassland,Meadow (permanent meadow)",
        "Permanent Grassland,Meadow (permanent meadow)/S28-1 poor meadows and fen meadows",
        "Permanent Grassland,Meadow (permanent meadow)/S28-2 species-rich mountain meadows",
        "Permanent Grassland,Meadow special area",
        "Permanent Grassland,Meadow special area (tare 20%)/S28-1 Dry meadows and low bog meadows",
        "Permanent Grassland,Meadow special area (tare 20%)/S28-2 Species-rich mountain meadows",
        "Permanent Grassland,Meadow special area (tare 50%)/S28-1 Dry meadows and low bog meadows",
        "Permanent Grassland,Meadow special area (tare 50%)/S28-2 Species-rich mountain meadows",
        "Permanent Grassland,Meadow special area/S28-1 poor meadows and fen meadows",
        "Permanent Grassland,Meadow special area/S28-2 species-rich mountain meadows",
        "Permanent Grassland,Potential pasture (50% tare)",
        "Permanent Grassland,Stable meadow (tare 20%)/S28-1 Dry meadows and meadows with low bog",
        "Permanent Grassland,Stable meadow (tare 20%)/S28-2 Species-rich mountain meadows",
        "Vegetables,Asparagus",
        "Vegetables,Cabbage",
        "Vegetables,Cauliflower",
        "Vegetables,Field vegetable cultivation",
        "Vegetables,Radish",
        "Vegetables,Salads",
    ]},
    # One box per DESCR_ENG entry (97 entries); the lengths must match
    geometry=[box(5 * i, 0, 5 * i + 5, 5) for i in range(97)],
    crs=25832,
)
# Add new "Classes" column based on DESCR_ENG column
test_vector_data["Classes"] = None
test_vector_data.loc[test_vector_data["DESCR_ENG"].str.startswith("Cultivated"), "Classes"] = "Cultivated Grassland"
test_vector_data.loc[test_vector_data["DESCR_ENG"].str.startswith("Fallow"), "Classes"] = "Fallow"
test_vector_data.loc[test_vector_data["DESCR_ENG"].str.startswith("Forest"), "Classes"] = "Forest Trees"
test_vector_data.loc[test_vector_data["DESCR_ENG"].str.startswith("Legumes"), "Classes"] = "Legumes"
test_vector_data.loc[test_vector_data["DESCR_ENG"].str.startswith("Maize"), "Classes"] = "Maize"
test_vector_data.loc[test_vector_data["DESCR_ENG"].str.startswith("Miscellaneous"), "Classes"] = "Miscellaneous"
test_vector_data.loc[test_vector_data["DESCR_ENG"].str.startswith("No Agriculture"), "Classes"] = "No Agriculture"
test_vector_data.loc[test_vector_data["DESCR_ENG"].str.startswith("Orchards and Berries"), "Classes"] = "Orchards and Berries"
test_vector_data.loc[test_vector_data["DESCR_ENG"].str.startswith("Other Cereals"), "Classes"] = "Other Cereals"
test_vector_data.loc[test_vector_data["DESCR_ENG"].str.startswith("Permanent Grassland"), "Classes"] = "Permanent Grassland"
test_vector_data.loc[test_vector_data["DESCR_ENG"].str.startswith("Vegetables"), "Classes"] = "Vegetables"

# Print result
print(test_vector_data.head())


# Wrap single Polygons in a MultiPolygon so the geometry type is uniform
test_vector_data["geometry"] = [MultiPolygon([feature]) if isinstance(feature, Polygon)
    else feature for feature in test_vector_data["geometry"]]

Each feature has its polygon in the geometry column.

I would like to collapse all the feature classes into the class named in the first part of the description, before the comma. For example, everything starting with "Cultivated Grassland" should become Cultivated Grassland, and everything starting with "Forest Trees / SRF" should become Forest Trees / SRF. The geometries should be merged according to the same criterion, so that each grouped class ends up with one single (multi)polygon.

Here you can find the file I'm using shapefile

Any suggestions?


Solution

  • For the different things you are trying to do:

    1. You can use standard pandas functionality on a GeoDataFrame to add a column with the new classes. After adding an empty column, update its values based on the existing "DESCR_ENG" column using loc.
    2. Merging the geometries of all features with the same class is possible using dissolve.
    3. Saving is possible using the GeoDataFrame to_file method.

    Sample script:

    import geopandas as gpd
    from shapely import box
    
    vec_data = gpd.read_file(r"C:\Temp\LAFIS\LAFIS_update.shp")
    """
    # Test data if the shapefile is not available
    vec_data = gpd.GeoDataFrame(
        data={"DESCR_ENG": [
            "Meadow (Half Sheared Tara 20%)",
            "Meadow (permanent meadow)",
            "Pasture (tare 20%)/S28-6 Wooded pastures",
            'Grain',
        ]},
        geometry=[box(0, 0, 5, 5), box(5, 0, 10, 5), box(10, 0, 15, 5), box(15, 0, 20, 5)],
        crs=31370,
    )
    """
    
    # Add new "Classes" column based on DESCR_ENG column
    vec_data["Classes"] = None
    vec_data.loc[vec_data["DESCR_ENG"].str.startswith("Meadow"), "Classes"] = "Agriculture"
    vec_data.loc[vec_data["DESCR_ENG"].str.startswith("Pasture"), "Classes"] = "Cultivated"
    
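    # Merge the geometries of all features that share a class.
    # Rows whose class is still None are dropped by dissolve by default.
    vec_data_dissolved = vec_data.dissolve(by="Classes")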
    
    # Print result
    print(vec_data.head())
    print(vec_data_dissolved.head())
    
    # Save to file
    #vec_data_dissolved.to_file(r"C:\Temp\LAFIS\LAFIS_update_dissolved.shp")
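
Since the question asks for the part of "DESCR_ENG" before the comma to become the class, listing every prefix by hand may not be necessary. The following is a minimal sketch of that variant, not a drop-in replacement for the script above: the four sample descriptions and the names `descriptions`, `gdf` and `dissolved` are made up for illustration, and it assumes every description follows the "Class,Subclass" pattern shown in the question.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical sample data in the question's "Class,Subclass" naming scheme
descriptions = [
    "Cultivated Grassland,Pasture",
    "Cultivated Grassland,Meadow (half-sheared)",
    "Forest Trees / SRF,Willow (Tare 20%)",
    "Maize,Corn",
]
gdf = gpd.GeoDataFrame(
    data={"DESCR_ENG": descriptions},
    # One adjacent 5x5 box per description, so the lengths always match
    geometry=[box(5 * i, 0, 5 * i + 5, 5) for i in range(len(descriptions))],
    crs=25832,
)

# Everything before the first comma becomes the class name
gdf["Classes"] = gdf["DESCR_ENG"].str.split(",", n=1).str[0].str.strip()

# Merge all geometries that share a class into a single feature
dissolved = gdf[["Classes", "geometry"]].dissolve(by="Classes").reset_index()
print(dissolved)
```

With this sample data the two "Cultivated Grassland" boxes are merged into one feature, leaving three rows in total. If a description contains no comma, the whole string is used as the class, so real data may need a check for unexpected values first.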