python selenium-webdriver web-scraping

Scraping data using Network, Fetch, Response


I am trying to scrape the data from the website "https://kgis.ksrsac.in/election/AssemblyEng.aspx?Asmbcd=94" where "Asmbcd=" varies from 1 to 200.

In particular, I want the candidates' data from the 2018 vote details. I can see the data source in the browser's developer tools under Network -> Fetch/XHR -> Response.

I am fairly new to scraping, so any help would be appreciated. Here is my attempt:

import requests
from bs4 import BeautifulSoup

url = "https://kgis.ksrsac.in/election/Election.asmx/GetPollingStationCandidateVotes"
response = requests.get(url)

soup = BeautifulSoup(response.content, "html.parser")
table = soup.find("table", {"id": "GridData"})

rows = table.find_all("tr")
for row in rows:
    cells = row.find_all("td")
    if len(cells) == 3:
        candidate = cells[0].text.strip()
        party = cells[1].text.strip()
        votes = cells[2].text.strip()
        print(candidate, party, votes)

Solution

  • I hope I've understood you correctly. To get all the candidates' data, you can try using the site's Ajax API:

    import json
    import requests
    import pandas as pd
    
    
    api_url = (
        "https://kgis.ksrsac.in/election/Election.asmx/GetPollingStationCandidateVotes"
    )
    payload = {"ac_code": "95", "PS": "208", "YR": "2018"}
    
    
    all_data = []
    for ac_code in range(1, 2):  # <-- use range(1, 201) to cover constituencies 1 to 200
        ps = 1
        payload['ac_code'] = ac_code
        while True:
            print(ac_code, ps)
            payload['PS'] = ps
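            data = requests.post(api_url, json=payload).json()
            data = json.loads(data['d'])  # the "d" field holds a JSON-encoded string
            if not data:  # empty result: assume there are no more polling stations
                break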
            for name, count in data[0].items():
                all_data.append({
                    'ac_code': ac_code,
                    'booth': ps,
                    'name': name,
                    'count': int(count)
                })
            ps += 1
    
    df = pd.DataFrame(all_data)
    print(df.head(10))
    

    Prints:

       ac_code  booth                             name  count
    0        1    240            KAKSO PANDURANG PATIL    368
    1        1    240               KAMAT RAMESH ISWAR      5
    2        1    240       JOLLE SHASHIKALA ANNASAHEB    173
    3        1    240            ROHINI SHRIMANT DIXIT      0
    4        1    240    VIJAYLAKSHMI APPASAHEB KURANE      0
    5        1    240          SACHIN KALLAPPA MADHALE      0
    6        1    240              ANIL SHIVAJI KAMATE      1
    7        1    240  JAYARAM PARASHRAM MIRAJAKARKAKA      0
    8        1    240           SHARAD RAMAGOUDA PATIL      3
    9        1    240         SAMBHAJI BAPU SOTHORAVAT      2
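
The decoding step in the loop above can be pulled out into a small helper, which makes it possible to check without hitting the server. This is only a sketch: `flatten_booth_response` is a hypothetical name, and the response shape (`{"d": "<JSON string>"}`, with the inner JSON being a list whose first item maps candidate name to vote count) is assumed from the code above, not from any official documentation. The sample body is made up for illustration.

```python
import json


def flatten_booth_response(body, ac_code, booth):
    """Turn one decoded API response into a list of row dicts.

    Assumes the shape used in the answer above: {"d": "<JSON string>"},
    where the inner JSON is a list whose first item maps candidate -> votes.
    """
    records = json.loads(body["d"])  # the "d" value is itself JSON text
    if not records:
        return []  # an empty list is treated as "no more booths"
    return [
        {"ac_code": ac_code, "booth": booth, "name": name, "count": int(count)}
        for name, count in records[0].items()
    ]


# Made-up sample body for illustration only (not real API output).
sample = {"d": json.dumps([{"CANDIDATE A": "12", "CANDIDATE B": "3"}])}
rows = flatten_booth_response(sample, ac_code=1, booth=1)
print(rows)
```

Inside the loop, the call would then look like `rows = flatten_booth_response(requests.post(api_url, json=payload).json(), ac_code, ps)`, breaking out of the `while` when `rows` is empty and extending `all_data` otherwise.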
    
