python-3.xdataframepyqgis

How to join data from Dataframe rows into one single row


This should be easy (I can do it in Excel quickly enough). I have the following Dataframe and I want to join, concatenate, aggregate the XY pairs into one string (and then add text at the start and end).

    Easting     Northing
0   416213.873  6507344.904
1   424283.1    6509982.331
2   428632.58   6496675.102
3   418176.907  6493257.665
4   415338.984  6501940.281
5   416213.873  6507344.904

I have used this line to join each XY pair.

df['WKT'] = df[['Easting', 'Northing']].agg(' '.join, axis=1)

    Easting     Northing    WKT
0   416213.873  6507344.904 416213.873 6507344.904
1   424283.1    6509982.331 424283.1 6509982.331
2   428632.58   6496675.102 428632.58 6496675.102
3   418176.907  6493257.665 418176.907 6493257.665
4   415338.984  6501940.281 415338.984 6501940.281
5   416213.873  6507344.904 416213.873 6507344.904

I experimented with .join and .agg but didn't get what I wanted. I also looked at Pivot but couldn't get that to work either.

What I want to do is get here, where all the Easting/Northing pairs are joined to form a single string with ',' between them - Then add "POLYGON((" to the front and "))" to the end.

POLYGON((X0 Y0,X1 Y1,X2 Y2,X3 Y3,X4 Y4,X5 Y5))

In this case there are 5 points, the 6th is a duplicate of the 1st to close the box. It won't always be a fixed number of points so I looked at iterable loops, but as the basic joining didn't work, the loop produced garbage too.

Secondary part of question referring to LineString. The input file will be like this, but I only import the SOLE,SOLE,EOLE,EOLN:

Line #,Name,SOLE,SOLN,EOLE,EOLN,SOLKP,EOLKP
1,EQ23303-01001-04,421386.910,6499570.855,420145.852,6503366.914,0.000,3.994
2,EQ23303-01005-08,421410.672,6499578.623,420169.615,6503374.683,0.000,3.994
3,EQ23303-01009-12,421434.435,6499586.392,420193.377,6503382.451,0.000,3.994
4,EQ23303-01013-16,421458.197,6499594.161,420217.139,6503390.220,0.000,3.994

The output I am trying to get, for each line, is:

LINESTRING(SOLE SOLN,EOLE EOLN)

Simply, it's just a concatenation of the row with a space between X and Y and a , between the XY pair. I have another way of doing it but seeing the shapely function I wondered if there was a neater method.


Solution

  • IIUC, you can use Polygon from :

    #pip install shapely
    from shapely.geometry import Polygon
    
    p = Polygon([(x, y) for x, y in zip(df["Easting"], df["Northing"])])
    

    Output :

    print(type(p))
    <class 'shapely.geometry.polygon.Polygon'>
    
    print(str(p))
    
    POLYGON ((416213.873 6507344.904, 424283.1 6509982.331, 428632.58 6496675.102, 418176.907 6493257.665, 415338.984 6501940.281, 416213.873 6507344.904))
    

    enter image description here