I have lots of points with their coordinates (WGS84). For a given point, I want to print at least its three closest neighbors and the distance from that point to each of them. How can I do that in Python?
NAME Latitude Longitude
B 50.94029883 7.019146728
C 50.92073002 6.975268711
D 50.99807758 6.980865543
E 50.98074288 7.035060206
F 51.00696972 7.035993783
G 50.97369889 6.928538763
H 50.94133859 6.927878587
A 50.96712502 6.977825322
Nearest-neighbor techniques are more efficient when you have lots of points.
Nearest Neighbor in Python
Illustration of using BallTree on your problem (see the related Stack Overflow post)
Code
import pandas as pd
import numpy as np
from sklearn.neighbors import BallTree
from io import StringIO
# Create DataFrame from your lat/lon dataset
data = """NAME Latitude Longitude
B 50.94029883 7.019146728
C 50.92073002 6.975268711
D 50.99807758 6.980865543
E 50.98074288 7.035060206
F 51.00696972 7.035993783
G 50.97369889 6.928538763
H 50.94133859 6.927878587
A 50.96712502 6.977825322"""
# Use StringIO to allow reading of string as CSV
df = pd.read_csv(StringIO(data), sep = ' ')
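# Set up a BallTree using df as the reference dataset
# Use the haversine metric to calculate great-circle distances between points on the earth from lat/lon
# (scikit-learn's haversine metric expects [lat, lon] in radians, hence np.deg2rad)
# haversine - https://pypi.org/project/haversine/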
tree = BallTree(np.deg2rad(df[['Latitude', 'Longitude']].values), metric='haversine')
# Set up distance queries (points for which we want to find nearest neighbors)
other_data = """NAME Latitude Longitude
B_alt 50.94029883 7.019146728
C_alt 50.92073002 6.975268711"""
df_other = pd.read_csv(StringIO(other_data), sep = ' ')
query_lats = df_other['Latitude']
query_lons = df_other['Longitude']
# Find the closest points in the reference dataset for each point in df_other
# use k = 3 for 3 closest neighbors
distances, indices = tree.query(np.deg2rad(np.c_[query_lats, query_lons]), k = 3)
r_km = 6371 # mean Earth radius in km; converts haversine distances (in radians) to km
for name, d, ind in zip(df_other['NAME'], distances, indices):
    print(f"NAME {name} closest matches:")
    for i, index in enumerate(ind):
        print(f"\t{df['NAME'][index]} with distance {d[i]*r_km:.4f} km")
Output
NAME B_alt closest matches:
B with distance 0.0000 km
C with distance 3.7671 km
A with distance 4.1564 km
NAME C_alt closest matches:
C with distance 0.0000 km
B with distance 3.7671 km
H with distance 4.0350 km
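Note that in the output above each query point matches itself at distance 0, because its coordinates are also in the reference dataset. If you want the neighbors of a point that is already in the dataset (say A), you need to skip that self-match. As a rough cross-check that needs only the standard library, here is a minimal brute-force sketch of the same haversine calculation. The names `haversine_km` and `nearest_neighbors` are hypothetical helpers written for this illustration, not part of any library. Brute force is fine for a handful of points, but for lots of points the BallTree approach is the one to use.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, same value as r_km above

# Same points as in the question: name -> (latitude, longitude) in degrees
points = {
    "B": (50.94029883, 7.019146728),
    "C": (50.92073002, 6.975268711),
    "D": (50.99807758, 6.980865543),
    "E": (50.98074288, 7.035060206),
    "F": (51.00696972, 7.035993783),
    "G": (50.97369889, 6.928538763),
    "H": (50.94133859, 6.927878587),
    "A": (50.96712502, 6.977825322),
}

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def nearest_neighbors(name, k=3):
    """Return the k closest *other* points as (name, distance_km), nearest first."""
    origin = points[name]
    others = [
        (other, haversine_km(origin, coords))
        for other, coords in points.items()
        if other != name  # skip the query point itself
    ]
    return sorted(others, key=lambda pair: pair[1])[:k]

for other, dist in nearest_neighbors("A"):
    print(f"{other} with distance {dist:.4f} km")
```

With the BallTree version you can get the same effect by querying with k = 4 and dropping the first (zero-distance) result for each query point.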