python pandas replace geopandas wkt

replace string with parentheses keeps the backslash \


I have a table in a dataframe, which I have converted to a geopandas GeoDataFrame, and it contains two fields: ID and geometry.

The geometry column is in WKT format and the table looks like this:

>>>ID       geometry
0  1       POLYGON((2.9544435 6.3245124, 2.4098938 6.42657389...
1  2       POLYGON((3.4324624 6.8735201, 2.4590825 6.23098357...
...

I'm trying to replace the parentheses and the POLYGON keyword, so that instead of having the format POLYGON(()) it will be MultiPolygon((())).

I have changed my dataframe from geopandas to pandas and then tried with replace:

covex['geometry']=covex['geometry'].replace({'POLYGON':'MultiPolygon'},regex=True)
covex['geometry']=covex['geometry'].replace({'\(\(':'\(\(\('},regex=True)
covex['geometry']=covex['geometry'].replace({'\)\)':'\)\)\)'},regex=True)

but for some reason the replace keeps the backslashes (\), e.g.:

>>>ID       geometry
0  1       MULTIPOLYGON \(\(\(2.9544435 6.3245124, 2.4098938 6.42657389...
1  2       MULTIPOLYGON \(\(\(3.4324624 6.8735201, 2.4590825 6.23098357...
...

If I don't put the \ it doesn't replace anything and I get the following error message:

error: missing ), unterminated subpattern at position 1

My end goal here is to turn POLYGON with (( )) into MULTIPOLYGON with ((( ))).


Solution

  • I know you are asking about a regex to replace the WKT representation of your geometries, but if you want instead to actually convert these polygons to multipolygons (which seems less unusual to me), you can create shapely MultiPolygons from your polygons, for example by using the apply method of the geometry column of your GeoDataFrame (this assumes the column still holds shapely geometries, not WKT strings):

    from shapely.geometry import MultiPolygon
    
    covex.geometry = covex.geometry.apply(lambda g: MultiPolygon([g]))
    

    After that, when displaying the geometry column, you will now get the actual WKT representation of your multipolygons.
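
If you do want to stay with string replacement, the stray backslashes most likely come from escaping the replacement text as well as the pattern. In Python's re module only the pattern needs \( and \); in the replacement string, an unknown escape such as \( is left alone, so the backslash ends up in the output. Below is a minimal sketch using only the standard library. The helper name polygon_wkt_to_multipolygon_wkt and the sample coordinates are made up for illustration, and it assumes each value is a plain WKT string for a single polygon.

```python
import re

def polygon_wkt_to_multipolygon_wkt(wkt: str) -> str:
    """Rewrite 'POLYGON((...))' text as 'MULTIPOLYGON(((...)))'.

    Hypothetical helper: assumes `wkt` is the WKT string of one polygon.
    """
    # Pattern side: parentheses are regex metacharacters, so they are escaped.
    # Replacement side: plain text, so no backslashes are needed.
    out = re.sub(r'^POLYGON', 'MULTIPOLYGON', wkt)
    out = re.sub(r'\(\(', '(((', out)
    out = re.sub(r'\)\)', ')))', out)
    return out

# Made-up sample value in the same shape as the question's data.
sample = 'POLYGON((2.95 6.32, 2.40 6.42, 2.95 6.32))'

# Escaping the replacement too reproduces the problem from the question:
# the backslashes are copied into the result.
escaped = re.sub(r'\(\(', r'\(\(\(', sample)

fixed = polygon_wkt_to_multipolygon_wkt(sample)
print(escaped)
print(fixed)
```

On a pandas column of strings, the helper could be applied with covex['geometry'].apply(polygon_wkt_to_multipolygon_wkt). The original replace calls should also work if the dictionary values are written as plain '(((' and ')))' while the keys stay escaped, assuming replace with regex=True follows the same re.sub rules.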