python, asp.net, python-requests, dopostback

Web scraping an .aspx site using Python


I am attempting to scrape locations from here: https://ukcareers.northropgrumman.com/vacancies/vacancy-search-results.aspx

I found a similar thread that matches my case, "Web scraping from .aspx site using python" (answered by Andrej Kesely and wolf7687), and I followed the same approach. The site I am scraping has 5 pages. I expect to get the locations from all five pages, but instead I get the first page's results five times. I've tried adjusting the headers and a number of other things without success. I am fairly certain the problem lies in the viewstate and viewgenerator header parameters. I've read other posts about .aspx sites and haven't seen anything that applies to my situation. I would really appreciate any help on this!

Unfortunately, I am currently limited to requests and other popular Python libraries.

Thanks in advance.


Solution

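  • Inside your for loop you're creating a new Session object on every iteration. You should only have one, and you already create it at the start of your code. A fresh session starts without the cookies the first request set.

    You're also using a .get() request when it should be a .post(). An ASP.NET postback has to be submitted as a POST with the form data; with a GET, the server most likely treats each request as a fresh page load, which would explain why you get the first page five times.

    Replace: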

    # Getting data from each page
    s = requests.Session()
    headers = {'User-Agent': 'Mozilla/5.0'} #My user agent here
    response = s.get(url, verify=False, headers=headers, data=data)
    

    with:

    response = s.post(url, verify=False, headers=headers, data=data)
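Putting both fixes together, the overall flow can be sketched as follows. This is a minimal sketch, not the asker's actual code: the helper names (`hidden_fields`, `postback_payload`, `fetch_pages`) are made up for illustration, and the pager control names passed as `event_targets` are placeholders you would have to read from the `__doPostBack(...)` links in the page source. It assumes the site uses standard ASP.NET hidden fields such as `__VIEWSTATE`, and it re-reads them from each response, since they can change from page to page.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect the name/value pair of every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return the hidden form fields (e.g. __VIEWSTATE) found in the HTML."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


def postback_payload(html, event_target, event_argument=""):
    """Build the POST data for one __doPostBack call from the current page."""
    data = hidden_fields(html)
    data["__EVENTTARGET"] = event_target
    data["__EVENTARGUMENT"] = event_argument
    return data


def fetch_pages(session, url, event_targets, headers=None):
    """Yield the first page's HTML, then one page per postback target.

    `session` is a single requests.Session() reused for every request.
    """
    response = session.get(url, headers=headers)
    yield response.text
    for target in event_targets:
        # Take the hidden fields from the most recent response, not the first.
        data = postback_payload(response.text, target)
        response = session.post(url, headers=headers, data=data)
        yield response.text


# Usage (the target names are hypothetical placeholders):
#   import requests
#   s = requests.Session()
#   targets = ["<pager control for page 2>", "<pager control for page 3>"]
#   for html in fetch_pages(s, url, targets, {"User-Agent": "Mozilla/5.0"}):
#       ...  # parse the locations out of html
```

The session is passed in rather than created inside the loop, so cookies set by the initial GET carry over to every POST.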