python html python-requests google-image-search

Not quite understanding how to perform a request to Google's servers with Python requests


My problem right now is that I can't form the request to Google's servers properly. I've tried sending all the request headers that my browser (Chrome) uses, but that doesn't seem to work. The end goal is to specify a search term, a resolution and the jpg file type in the request, and to download the resulting image to a folder. Any suggestions would be welcome, and thanks in advance.

Here's my code so far:

import requests

def funRequestsDownload(searchTerm):
    print("Getting image for track ", searchTerm)
    # Only a browser-like User-Agent is set; requests fills in Content-Length itself.
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36'}
    # Query-string parameters copied from the browser, with the search term passed in as "q".
    queryStringParameters = {'hl': "en", "tbm": "isch", "source": "hp", "biw": 1109, "bih": 475, "q": searchTerm, "oq": "meme", "gs_l": "img.3..35i39k1j0l9.21651.21983.0.22205.10.10.0.0.0.0.131.269.2j1.3.0....0...1.1.64.img..7.3.267.0.4mTf5BYtfj8"}
    url = 'http://www.google.co.uk'
    # requests.get takes query-string parameters through the params keyword.
    dataDump = requests.get(url, headers=headers, params=queryStringParameters)
    temp = dataDump.content
    # Write the raw response bytes; the with block closes the file.
    with open('C:/Users/Jordan/Desktop/Music Program/temp.html', 'wb') as file:
        file.write(temp)
    print("Downloaded image for track ", searchTerm)
    return temp

Side note: I know the only thing I'm saving is the HTML of the page. That's because the request returns a bad request page and I want to look at the error.


Solution

  • Google doesn't like people scraping its search results and prefers that you use its API instead.

    The API they offer is called Google Custom Search, and it supports searching for images. To use it, you'll need an API key from a Google account and the ID of a custom search engine (the cx value below). Send the key with each API call.

    The URL you'll want to hit is

    searchUrl = "https://www.googleapis.com/customsearch/v1?q=" + \
                 searchTerm + "&start=" + startIndex + "&key=" + key + "&cx=" + cx + \
                 "&searchType=image"
    

    Pass that URL to requests.get to get a JSON response back with your results.
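As a rough sketch of that call, the same query can be built by letting requests encode the parameters rather than concatenating strings, which also takes care of spaces in the search term. The helper names here (build_search_params, extract_image_links, search_images) are made up for illustration, the key and cx values are placeholders you would supply yourself, and the optional fileType filter is included on the assumption that you want to restrict results to jpg as described in the question.

```python
API_URL = "https://www.googleapis.com/customsearch/v1"


def build_search_params(search_term, key, cx, start=1, file_type=None):
    """Build the query-string parameters for an image search (hypothetical helper)."""
    params = {
        "q": search_term,
        "key": key,              # your API key (placeholder)
        "cx": cx,                # your custom search engine ID (placeholder)
        "searchType": "image",
        "start": start,          # index of the first result to return
    }
    if file_type:
        params["fileType"] = file_type  # e.g. "jpg"
    return params


def extract_image_links(results):
    """Pull the image URLs out of the decoded JSON response."""
    return [item["link"] for item in results.get("items", [])]


def search_images(search_term, key, cx, start=1, file_type=None):
    """Call the API and return a list of image URLs."""
    # Imported here so the two helpers above work without requests installed.
    import requests

    params = build_search_params(search_term, key, cx, start, file_type)
    response = requests.get(API_URL, params=params, timeout=10)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return extract_image_links(response.json())
```

A call would then look something like search_images("some track name", key, cx, file_type="jpg"), with key and cx replaced by your own values.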

    Here's some further reading.
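For the download step mentioned in the question, one possible approach is to stream each image URL returned by the search into a folder. This is only a sketch: filename_from_url and download_image are hypothetical names, the file name is simply taken from the last path segment of the URL, and there is no handling of duplicate names or non-image responses.

```python
import os
from urllib.parse import urlparse


def filename_from_url(url, fallback="image.jpg"):
    """Use the last path segment of the URL as the file name (hypothetical helper)."""
    name = os.path.basename(urlparse(url).path)
    return name or fallback


def download_image(url, folder, timeout=10):
    """Download one image into folder and return the path it was saved to."""
    # Imported here so filename_from_url works without requests installed.
    import requests

    os.makedirs(folder, exist_ok=True)
    response = requests.get(url, stream=True, timeout=timeout)
    response.raise_for_status()

    path = os.path.join(folder, filename_from_url(url))
    with open(path, "wb") as fh:
        # Write the body in chunks so large images are not held in memory at once.
        for chunk in response.iter_content(chunk_size=8192):
            fh.write(chunk)
    return path
```

The file is opened in binary mode ("wb") because the response body is raw bytes, not text.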