TypeError: cannot convert the series to <class 'float'> when applying a defined function - pandas


I have created the following pandas dataframe:

import pandas as pd
import numpy as np
from math import sin, cos, sqrt, atan2, radians


ds1 = {'Longitude':[-46.6736,-46.50926,-46.75166,-46.54743], "Latitude" : [-23.69057,-23.41165,-23.51482,-23.42598]}
df1 = pd.DataFrame(data=ds1)

which looks like this:

print(df1)

   Longitude  Latitude
0  -46.67360 -23.69057
1  -46.50926 -23.41165
2  -46.75166 -23.51482
3  -46.54743 -23.42598

I need to calculate the distance in km between each record and each city in a list of Brazilian cities, for which I have [latitude, longitude] pairs, as follows:

coordinates = {
    "rio" : [-23.02,-43.474889],
    "curitiba" : [-25.38792,-49.27741],
    "portoAlegre" : [-29.98115,-51.19597],
    "salvador" : [-12.97369,-38.43908],
    "manaus" :[-3.012972,-59.926802],
    "campoGrande" : [-20.52243,-54.58743],
    "beloHorizonte" : [-19.79722,-43.95691],
    "portoVelho" : [-8.774148,-63.851237],
    "recife" : [-8.12673,-34.90491],
    "boaVista" : [2.844999,-60.718089],
    "fortaleza" : [-3.76489,-38.51496],
    "rioBranco" : [-9.972341,-67.801294],
    "palmas" : [-10.165953,-48.880833],
    "natal" : [-5.79861,-35.18398],
    "aracaju" : [-10.972717,-37.068985],
    "teresina" : [-5.10247,-42.79552]
}

The distance in KM is calculated by the following function:

def radius(latitude1, longitude1, latitude2, longitude2):
    R = 6373.0

    lat1 = radians(latitude1)
    lon1 = radians(longitude1)
    lat2 = radians(latitude2)
    lon2 = radians(longitude2)
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    distance = R * c
    return distance

Now I calculate the distance of each record in dataframe df1 from Rio de Janeiro ("rio" : [-23.02,-43.474889]):

df1['distanceFromRio'] = radius(coordinates["rio"][0],coordinates["rio"][1],df1['Latitude'],df1['Longitude'])

And I get the following error:

TypeError: cannot convert the series to <class 'float'>

I would prefer to avoid building arrays/lists by hand, since I need to do the same for every city listed in coordinates, so I will need to calculate:

 - distanceFromCuritiba
 - distanceFromPortoAlegre
 - etc.

Does anyone know a way to do it in Python, please?


Solution

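  • Your function radius expects four floats as arguments, but df1['Latitude'] and df1['Longitude'] are pandas Series, each containing multiple floats. The functions imported from math (radians, sin, cos, ...) only accept single numbers, which is why the call raises the TypeError.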
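
To see the mismatch in isolation, here is a minimal sketch (the latitudes Series and the message variable are illustrative names, not part of your code). math.radians accepts one float at a time, so it works on a single element but raises on a whole Series:

```python
from math import radians

import pandas as pd

# Illustrative data: two of the latitudes from df1.
latitudes = pd.Series([-23.69057, -23.41165])

# A single element is a plain float, so math.radians accepts it.
print(radians(latitudes.iloc[0]))

message = None
try:
    # A Series with more than one element cannot be coerced to one float.
    radians(latitudes)
except TypeError as exc:
    message = str(exc)
    print(message)
```

The message printed in the except branch should be the same one you get from calling radius with two Series.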

    Instead of trying to pass Series as function arguments, apply your function to each row of the DataFrame using the apply method with axis=1.

    Your radius function looks good, so I suggest adding a small intermediate function that handles the row-wise call, to keep the code readable:

    def apply_radius(row, coords):
        return radius(coords[0], coords[1], row["Latitude"], row["Longitude"])
    
    df1["distanceFromRio"] = df1.apply(
        lambda row: apply_radius(row, coordinates["rio"]),
        axis=1
    )
    
    df1["distanceFromRio"]
    

    Output:

    0    335.038388
    1    313.220640
    2    339.317187
    3    317.290975
    Name: distanceFromRio, dtype: float64
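
Since you need one column per city, the same idea extends to a loop over the coordinates dictionary. As an alternative to apply, the sketch below swaps the math functions for their NumPy equivalents, which accept whole Series, so no row-wise call is needed. Treat it as a sketch under a few assumptions rather than a drop-in replacement: radius_np is a hypothetical name, only three of your cities are included to keep it short, and the column names are built by capitalising the first letter of each dictionary key.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "Longitude": [-46.6736, -46.50926, -46.75166, -46.54743],
    "Latitude": [-23.69057, -23.41165, -23.51482, -23.42598],
})

# Subset of the question's dictionary, stored as [latitude, longitude].
coordinates = {
    "rio": [-23.02, -43.474889],
    "curitiba": [-25.38792, -49.27741],
    "portoAlegre": [-29.98115, -51.19597],
}


def radius_np(latitude1, longitude1, latitude2, longitude2):
    """Same formula as radius, but using NumPy functions.

    NumPy functions work element-wise, so each argument may be
    a scalar or a pandas Series.
    """
    R = 6373.0  # Earth radius in km, as in the original function

    lat1, lon1 = np.radians(latitude1), np.radians(longitude1)
    lat2, lon2 = np.radians(latitude2), np.radians(longitude2)
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))
    return R * c


# One new column per city, e.g. "rio" -> "distanceFromRio".
for city, (lat, lon) in coordinates.items():
    column = "distanceFrom" + city[0].upper() + city[1:]
    df1[column] = radius_np(lat, lon, df1["Latitude"], df1["Longitude"])

print(df1)
```

The distanceFromRio column produced this way should match the apply-based output above up to floating-point rounding. If you would rather keep your original radius function untouched, the same loop works with df1.apply and apply_radius in place of radius_np; the NumPy version is mainly worth considering when the DataFrame is large.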