[SOLVED] Accessing Indeed through Python

My goal for this Python code is to collect job information and save it into a folder, but the first step is failing. When I run the code, I want it to print the URL https://www.indeed.com/. Instead, it prints https://secure.indeed.com/account/login. I am open to using urllib or cookielib to resolve this.

import requests
import urllib

data = {
    'action': 'Login',
    '__email': 'email@gmail.com',
    '__password': 'password',
    'remember': '1',
    'hl': 'en',
    'continue': '/account/view?hl=en',
}


response = requests.get('https://secure.indeed.com/account/login',data=data)
print(response.url)

Solution

If you're trying to scrape information from Indeed, you should use the Selenium library for Python.

https://pypi.python.org/pypi/selenium

Selenium drives a real browser, so you can write your program as if a real user were browsing the site normally.
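As a rough starting point, here is a minimal sketch of that approach. It assumes Selenium is installed and that Chrome with a matching driver is available on your machine. The helper names `is_login_redirect` and `fetch_final_url` are made up for this example, and whether Indeed actually lands on its home page or sends the browser elsewhere is something you would need to check yourself.

```python
from urllib.parse import urlparse


def is_login_redirect(url):
    """Return True if `url` is the account login page from the question."""
    parts = urlparse(url)
    return (parts.netloc == "secure.indeed.com"
            and parts.path.startswith("/account/login"))


def fetch_final_url(start_url="https://www.indeed.com/"):
    """Open `start_url` in a real browser and return the URL it ends up on."""
    # Imported here so the helper above works even without Selenium installed.
    from selenium import webdriver

    # Assumption: Chrome and a compatible driver are available locally.
    driver = webdriver.Chrome()
    try:
        driver.get(start_url)
        # current_url reflects any redirects the browser followed.
        return driver.current_url
    finally:
        # Always close the browser, even if the page load fails.
        driver.quit()
```

You could then call `fetch_final_url()` and pass the result to `is_login_redirect` to see whether the browser was sent to the login page, which is the same check the original `print(response.url)` was doing by eye.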