I am trying to scrape data from the website "https://kgis.ksrsac.in/election/AssemblyEng.aspx?Asmbcd=94", where "Asmbcd=" varies from 1 to 200.
In particular, I want to get the candidates' data from the 2018 vote details. I can see the data source under Network -> Fetch/XHR -> Response.
I am fairly new to scraping, so any help would be appreciated:
import requests
from bs4 import BeautifulSoup

url = "https://kgis.ksrsac.in/election/Election.asmx/GetPollingStationCandidateVotes"
response = requests.get(url)
soup = BeautifulSoup(response.content, "html.parser")
table = soup.find("table", {"id": "GridData"})
rows = table.find_all("tr")
for row in rows:
    cells = row.find_all("td")
    if len(cells) == 3:
        candidate = cells[0].text.strip()
        party = cells[1].text.strip()
        votes = cells[2].text.strip()
        print(candidate, party, votes)
I hope I've understood you correctly. To get all candidates' data, you can try using their Ajax API:
import json
import requests
import pandas as pd

api_url = (
    "https://kgis.ksrsac.in/election/Election.asmx/GetPollingStationCandidateVotes"
)
payload = {"ac_code": "95", "PS": "208", "YR": "2018"}
all_data = []
for ac_code in range(1, 2):  # <-- increase from 1 to 200
    ps = 1
    payload['ac_code'] = ac_code
    while True:
        print(ac_code, ps)
        payload['PS'] = ps
        data = requests.post(api_url, json=payload).json()
        # the "d" field is itself a JSON-encoded string, so decode it again
        data = json.loads(data['d'])
        # an empty result ends the loop over polling stations
        if not data:
            break
        for name, count in data[0].items():
            all_data.append({
                'ac_code': ac_code,
                'booth': ps,
                'name': name,
                'count': int(count)
            })
        ps += 1

df = pd.DataFrame(all_data)
print(df.head(10))
Prints:
   ac_code  booth                             name  count
0        1    240            KAKSO PANDURANG PATIL    368
1        1    240               KAMAT RAMESH ISWAR      5
2        1    240       JOLLE SHASHIKALA ANNASAHEB    173
3        1    240            ROHINI SHRIMANT DIXIT      0
4        1    240    VIJAYLAKSHMI APPASAHEB KURANE      0
5        1    240          SACHIN KALLAPPA MADHALE      0
6        1    240              ANIL SHIVAJI KAMATE      1
7        1    240  JAYARAM PARASHRAM MIRAJAKARKAKA      0
8        1    240           SHARAD RAMAGOUDA PATIL      3
9        1    240         SAMBHAJI BAPU SOTHORAVAT      2
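If you want one total per candidate rather than one row per booth, it may help to split the parsing from the network call so it can be checked offline. The following is only a rough sketch: `parse_booth_response` and `total_votes` are hypothetical helper names, and the sample payload is made up. Its shape (a `d` field holding a JSON string that decodes to a list with one name-to-count mapping) is assumed from the loop above, not taken from any API documentation.

```python
import json
from collections import Counter


def parse_booth_response(raw, ac_code, booth):
    """Flatten one decoded API response into a list of row dicts.

    Assumes `raw` looks like {"d": "<JSON string>"}, as in the loop above.
    """
    inner = json.loads(raw["d"])  # "d" holds JSON encoded as a string
    if not inner:  # empty list: nothing returned for this booth
        return []
    return [
        {"ac_code": ac_code, "booth": booth, "name": name, "count": int(count)}
        for name, count in inner[0].items()
    ]


def total_votes(rows):
    """Sum the per-booth counts for each candidate name."""
    totals = Counter()
    for row in rows:
        totals[row["name"]] += row["count"]
    return dict(totals)


# Made-up sample data, used only to exercise the helpers without a request.
sample = {"d": json.dumps([{"CANDIDATE A": "368", "CANDIDATE B": "5"}])}
empty = {"d": "[]"}

rows = parse_booth_response(sample, 1, 1) + parse_booth_response(sample, 1, 2)
print(total_votes(rows))  # {'CANDIDATE A': 736, 'CANDIDATE B': 10}
```

In the real loop you would pass `requests.post(api_url, json=payload).json()` as `raw` and stop when the helper returns an empty list.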