How to get addresses from dataset having coordinates using Google API?


The data set has 9975 latitudes and longitudes. I want to extract addresses. I have written the following code:

import numpy as np
from bs4 import BeautifulSoup
import urllib.request
import json

coordinates=coordinates.as_matrix()
address=[]
for i in range(len(coordinates)):
    qpage = 'https://maps.googleapis.com/maps/api/js/GeocodeService.Search?5m2&1d'+str(coordinates[i][0])+'&2d'+str(coordinates[i][1])+'&7sUS&9sen&callback=_xdc_._jhwtgt&key=MY_API_KEY&token=53066'
    page = urllib.request.urlopen(qpage)
    data = page.read().decode('utf-8').replace('(','[').replace(')',']')
    data=data[34:]
    js = json.loads(data)
    address.append(js[0]['results'][1]['formatted_address'])

The error I'm getting:

HTTPError                                 Traceback (most recent call last)
in ()
      8 for i in range(len(coordinates)):
      9     qpage = 'https://maps.googleapis.com/maps/api/js/GeocodeService.Search?5m2&1d'+str(coordinates[i][0])+'&2d'+str(coordinates[i][1])+'&7sUS&9sen&callback=_xdc_._jhwtgt&key=MY_API_KEY&token=53066'
---> 10     page = urllib.request.urlopen(qpage)
     11     data = page.read().decode('utf-8').replace('(','[').replace(')',']')
     12     data=data[34:]

c:\users\anish\appdata\local\programs\python\python36\lib\urllib\request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    221     else:
    222         opener = _opener
--> 223     return opener.open(url, data, timeout)
    224 
    225 def install_opener(opener):

c:\users\anish\appdata\local\programs\python\python36\lib\urllib\request.py in open(self, fullurl, data, timeout)
    530         for processor in self.process_response.get(protocol, []):
    531             meth = getattr(processor, meth_name)
--> 532             response = meth(req, response)
    533 
    534         return response

c:\users\anish\appdata\local\programs\python\python36\lib\urllib\request.py in http_response(self, request, response)
    640         if not (200 <= code < 300):
    641             response = self.parent.error(
--> 642                 'http', request, response, code, msg, hdrs)
    643 
    644         return response

c:\users\anish\appdata\local\programs\python\python36\lib\urllib\request.py in error(self, proto, *args)
    568         if http_err:
    569             args = (dict, 'default', 'http_error_default') + orig_args
--> 570             return self._call_chain(*args)
    571 
    572 # XXX probably also want an abstract factory that knows when it makes

c:\users\anish\appdata\local\programs\python\python36\lib\urllib\request.py in _call_chain(self, chain, kind, meth_name, *args)
    502         for handler in handlers:
    503             func = getattr(handler, meth_name)
--> 504             result = func(*args)
    505             if result is not None:
    506                 return result

c:\users\anish\appdata\local\programs\python\python36\lib\urllib\request.py in http_error_default(self, req, fp, code, msg, hdrs)
    648 class HTTPDefaultErrorHandler(BaseHandler):
    649     def http_error_default(self, req, fp, code, msg, hdrs):
--> 650         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    651 
    652 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 403: Forbidden

Any help would be appreciated.


Solution

  • The URL that you use

    'https://maps.googleapis.com/maps/api/js/GeocodeService.Search?5m2&1d'+str(coordinates[i][0])+'&2d'+str(coordinates[i][1])+'&7sUS&9sen&callback=_xdc_._jhwtgt&key=YOUR_API_KEY&token=53066'

    is an internal call to the geocoding service made by the Google Maps JavaScript API. You shouldn't use internal URLs; use the official web service calls instead.

    Have a look at the Geocoding API documentation and replace the URL with the documented reverse geocoding URL:

    'https://maps.googleapis.com/maps/api/geocode/json?latlng='+str(coordinates[i][0])+'%2C'+str(coordinates[i][1])+'&key=YOUR_API_KEY'

    I believe you are getting the 403 error because the token in your request has expired. This token is generated by the Maps JavaScript API, so you should switch to the web service call to solve the issue.
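As a rough illustration of the documented web service call, here is a minimal sketch using only the standard library. The helper names (`build_reverse_geocode_url`, `first_formatted_address`, `reverse_geocode`) are made up for this example, and taking the first entry of `results` is an assumption; check the Geocoding API documentation for which result type suits your data.

```python
import json
import urllib.parse
import urllib.request

GEOCODE_ENDPOINT = 'https://maps.googleapis.com/maps/api/geocode/json'


def build_reverse_geocode_url(lat, lng, api_key):
    """Return the documented reverse geocoding URL for one coordinate pair."""
    # urlencode escapes the comma in "lat,lng" as %2C
    query = urllib.parse.urlencode({'latlng': f'{lat},{lng}', 'key': api_key})
    return f'{GEOCODE_ENDPOINT}?{query}'


def first_formatted_address(payload):
    """Pick the first formatted address from a decoded response, or None."""
    # Anything other than status "OK" (e.g. "ZERO_RESULTS") yields no address
    if payload.get('status') != 'OK' or not payload.get('results'):
        return None
    return payload['results'][0]['formatted_address']


def reverse_geocode(lat, lng, api_key):
    """Call the web service once and return an address string or None."""
    url = build_reverse_geocode_url(lat, lng, api_key)
    with urllib.request.urlopen(url) as page:
        return first_formatted_address(json.load(page))
```

With this in place, the loop body from the question would reduce to something like `address.append(reverse_geocode(coordinates[i][0], coordinates[i][1], 'YOUR_API_KEY'))`, with no string slicing of a JSONP callback.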

    Be aware that web services are limited to 50 queries per second.
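With 9975 coordinates it may be worth pacing the requests so the loop stays under that limit. The generator below is one simple way to do it; `throttled` is a hypothetical helper written for this answer, not part of any Google library, and the `sleep` and `clock` parameters exist only so the behaviour can be checked without real waiting.

```python
import time


def throttled(items, max_per_second=50, sleep=time.sleep, clock=time.monotonic):
    """Yield items one by one, pausing so that at most max_per_second are yielded per second."""
    min_interval = 1.0 / max_per_second
    last = None
    for item in items:
        if last is not None:
            # Wait out whatever is left of the minimum gap since the previous item
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield item
```

A loop such as `for row in throttled(coordinates, max_per_second=40):` would then issue at most 40 lookups per second, leaving some headroom below the limit.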

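    In addition, I would suggest having a look at the Python Client for Google Maps Services. With this library you can easily reverse geocode your coordinates:

    import googlemaps

    # DataFrame.as_matrix() was removed in pandas 1.0; to_numpy() replaces it
    coordinates = coordinates.to_numpy()
    gmaps = googlemaps.Client(key='YOUR_API_KEY')

    address = []
    for row in coordinates:
        # reverse_geocode returns a list of results (empty if nothing matched)
        reverse_geocode_result = gmaps.reverse_geocode((row[0], row[1]))
        if reverse_geocode_result:
            address.append(reverse_geocode_result[0]['formatted_address'])
        else:
            address.append(None)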

    I hope this helps!