geocoder.osm(), called with method='reverse', is an API function that takes a latitude and a longitude and returns the country name, along with the rest of the location's information, as JSON.
I have a big dataframe of 700k rows of coordinates, and I wrote the following code to extract the country name for every coordinate:
import geocoder
import itertools

count = itertools.count(start=0)

def geo_rev(x):
    print('starting: ', next(count))
    try:
        # One reverse-geocoding request per row.
        g = geocoder.osm([x.latitude, x.longitude], method='reverse').json
    except ValueError:
        g = None
    if g:
        return [g.get('country'), g.get('city')]
    # Always return two values so result_type='expand' can fill both columns.
    return ['no country', 'no city']

data[['Country', 'City']] = data[['latitude', 'longitude']].apply(geo_rev, axis=1, result_type='expand')
As you can see, we pass a list of two values for every row: [x.latitude, x.longitude].
The problem is that this code takes forever to run, which is why I want to pass a list of lists to geocoder.osm() to make the requests faster. My idea is to pass something like [list[latitude...], list[longitude...]]. How can I do that? When I try, I get:
TypeError: float() argument must be a string or a number, not 'list'
If my idea of passing a list of lists is wrong and there is another way to make the API calls faster, please tell me.
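Independent of batching, one way to cut the number of requests is to look up each distinct coordinate only once. The following is a minimal sketch, assuming many of the rows share (rounded) coordinates. `lookup_country` is a hypothetical stand-in for the real reverse-geocoding request, and rounding to 2 decimals is an arbitrary assumption that would need tuning for real data:

```python
from functools import lru_cache

calls = 0  # counts how many "requests" are actually made

@lru_cache(maxsize=None)
def lookup_country(lat, lon):
    """Hypothetical stand-in for the real reverse-geocoding request."""
    global calls
    calls += 1
    # Fake result for illustration only; a real version would call the API here.
    return 'north' if lat >= 0 else 'south'

def country_for(lat, lon, ndigits=2):
    # Round so that nearby points share one cache entry (precision is an assumption).
    return lookup_country(round(lat, ndigits), round(lon, ndigits))

coords = [
    (48.8566, 2.3522),
    (48.8571, 2.3519),    # rounds to the same key as the first point
    (-33.8688, 151.2093),
    (48.8566, 2.3522),    # exact duplicate of the first point
]
countries = [country_for(lat, lon) for lat, lon in coords]
print(countries, 'requests made:', calls)
```

Whether this helps depends entirely on how many coordinates repeat. With mostly unique points, the cache saves little.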
I have found an answer to my question. Doing it with a list of lists looked very hard, so I tried threading instead. For I/O-bound work such as API calls, threading (like asyncio) can be much faster, probably even ten or twenty times faster. Instead of waiting for each response before sending the next request, it sends several requests at the same time and receives their responses as they arrive. The following code worked:
import geocoder
import itertools
import concurrent.futures

# Pair each latitude with its longitude.
lst = list(zip(data.latitude.tolist(), data.longitude.tolist()))
count = itertools.count(start=0)

def geo_rev(x):
    print('starting: ', next(count))
    try:
        g = geocoder.osm([x[0], x[1]], method='reverse').json
    except ValueError:
        g = None
    if g:
        return g.get('country')
    return 'no country'

# Worker threads send several requests concurrently; map() keeps the input order.
with concurrent.futures.ThreadPoolExecutor() as executor:
    countries = list(executor.map(geo_rev, lst))

data['Country'] = countries
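To see why the thread pool helps, here is a small sketch that needs no network access. `fake_request` is a hypothetical stand-in for one API call and simply sleeps for 0.1 s; the worker count of 10 is also just an assumption for the demonstration:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(coord):
    """Hypothetical stand-in for one network call; sleeps instead of hitting an API."""
    time.sleep(0.1)
    return coord

coords = [(float(i), float(i)) for i in range(10)]

# Sequential: each call waits for the previous one to finish.
start = time.perf_counter()
sequential = [fake_request(c) for c in coords]
sequential_time = time.perf_counter() - start

# Threaded: the waits overlap, so total time is close to a single call.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=10) as executor:
    # executor.map returns results in input order, so they line up with coords.
    threaded = list(executor.map(fake_request, coords))
threaded_time = time.perf_counter() - start

print(f'sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s')
```

The speedup comes from overlapping the time spent waiting, not from extra computation. Public geocoding services typically publish usage limits, so it is worth checking them before raising max_workers.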
Thanks to Corey Schafer for his video, which explains everything.