python pandas geopandas shapely geographic-distance

Calculate all distances between two GeoDataFrames (of points) in GeoPandas


This is quite a simple case, but I have not found an easy way to do it so far. The idea is to get the set of distances between all the points defined in one GeoDataFrame and all the points defined in another GeoDataFrame.

import geopandas as gpd
import pandas as pd

# random coordinates
gdf_1 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0, 0], [0, 90, 120]))
gdf_2 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0], [0, -90]))
print(gdf_1)
print(gdf_2)

# distances are calculated element-wise
print(gdf_1.distance(gdf_2))

This produces the element-wise distance between the points in gdf_1 and gdf_2 that share the same index. It also raises a warning because the two GeoSeries do not have the same index, which will be the case for my data.

                geometry
0    POINT (0.000 0.000)
1   POINT (0.000 90.000)
2  POINT (0.000 120.000)
                    geometry
0    POINT (0.00000 0.00000)
1  POINT (0.00000 -90.00000)
/home/seydoux/anaconda3/envs/chelyabinsk/lib/python3.8/site-packages/geopandas/base.py:39: UserWarning: The indices of the two GeoSeries are different.
  warn("The indices of the two GeoSeries are different.")
0      0.0
1    180.0
2      NaN

The question is: how can I get a series of all point-to-point distances (or at least one value for each unique combination of an index in gdf_1 and an index in gdf_2, since distance is symmetric)?


Solution

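  • Apply a function to each geometry in the first GeoDataFrame that computes its distance to every geometry in the second one. The result is a DataFrame with one row per point in gdf_1 and one column per point in gdf_2.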

    import geopandas as gpd
    import pandas as pd
    
    # random coordinates
    gdf_1 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0, 0], [0, 90, 120]))
    gdf_2 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0], [0, -90]))
    
    gdf_1.geometry.apply(lambda g: gdf_2.distance(g))
    
          0      1
    0    0.0   90.0
    1   90.0  180.0
    2  120.0  210.0
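For larger inputs, the same matrix can probably be built without a Python-level loop over geometries. The following is only a sketch under some assumptions: every geometry is a Point, and planar distance in the units of the coordinates is what is wanted (as with `distance` above, this is not a geodesic distance on the globe). The helper name `pairwise_distances` is made up for this illustration and is not part of GeoPandas. Stacking the matrix then gives the series of all index combinations that the question asks for.

```python
import geopandas as gpd
import numpy as np
import pandas as pd

# same coordinates as in the question
gdf_1 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0, 0], [0, 90, 120]))
gdf_2 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0], [0, -90]))


def pairwise_distances(a, b):
    """Hypothetical helper: planar distance matrix between two point GeoDataFrames."""
    # Shape (n, 1) minus shape (1, m) broadcasts to an (n, m) array of differences.
    dx = a.geometry.x.to_numpy()[:, None] - b.geometry.x.to_numpy()[None, :]
    dy = a.geometry.y.to_numpy()[:, None] - b.geometry.y.to_numpy()[None, :]
    # Rows follow the index of a, columns follow the index of b.
    return pd.DataFrame(np.hypot(dx, dy), index=a.index, columns=b.index)


matrix = pairwise_distances(gdf_1, gdf_2)

# Long format: one value per (index in gdf_1, index in gdf_2) pair.
series = matrix.stack()
print(series)
```

An individual distance can then be looked up by its pair of indices, for example `series.loc[(2, 1)]` for point 2 of gdf_1 and point 1 of gdf_2.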