[SOLVED] Web Scraping Incentive Table

I am trying to web scrape the Incentive Step Tracker table from the URL below. I am only interested in Small Residential Storage.

My code gets close but does not return the full table. Please help me finish it and convert the result to CSV so I can save it to a local folder.

Here is my code:

# import libraries
from bs4 import BeautifulSoup
import urllib.request
import csv

urlpage='https://www.selfgenca.com/home/program_metrics/'

page = urllib.request.urlopen(urlpage)
# parse the html using beautiful soup and store in variable 'soup'
soup = BeautifulSoup(page, 'html.parser')
print(soup)

table = soup.find('table',{'class': 'table'}).find_all('tbody',{'data-t': 'Small Residential Storage'})[0]
results = table.find_all('tr')
print(results)

Here is the table I want to scrape:

[Image: Ideal Output Table]

Solution

I think this can be done with pandas by adding the following to your code above:

import pandas as pd

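# get the column headers from the table's header row
tab = soup.find('table', {'class': 'table'}).find_all('tr', {'class': 'head-row'})
headers = []
for h in tab[0].find_all('td'):
    # strip surrounding whitespace so the column names are clean
    headers.append(h.text.strip())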

Then build the rows and create a DataFrame:

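from bs4 import Tag

final = []
for res in results:
    tmp = []
    for r in res:
        # iterating a <tr> also yields whitespace text nodes; keep only element children (the cells)
        if isinstance(r, Tag):
            tmp.append(r.text.strip())
    final.append(tmp)

# one column per header
df = pd.DataFrame(final, columns=headers)
df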

Output looks like the table you want.
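
The question also asked for CSV output, which the snippet above does not cover. Here is a minimal sketch using the csv module that the question already imports. It assumes headers and final are the lists built above. The helper name save_rows_to_csv, the file name, and the sample values are made up for illustration and are not real data from the page. It writes to the system temp folder only so the example runs anywhere; any local folder path should work the same way.

```python
import csv
import tempfile
from pathlib import Path


def save_rows_to_csv(headers, rows, path):
    """Write a header row followed by the data rows to a CSV file at `path`."""
    path = Path(path)
    # newline='' lets the csv module control line endings (avoids blank lines on Windows)
    with path.open('w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(headers)
        writer.writerows(rows)
    return path


# Placeholder values for illustration only; in the real script pass `headers` and `final`.
sample_headers = ['Step', 'Status', 'Rate']
sample_rows = [
    ['Step 1', 'Closed', '0.50'],
    ['Step 2', 'Open', '0.40'],
]

# Hypothetical output location; replace with your own local folder.
out_path = save_rows_to_csv(
    sample_headers,
    sample_rows,
    Path(tempfile.gettempdir()) / 'small_residential_storage.csv',
)
print(out_path.read_text(encoding='utf-8'))
```

If you already have the DataFrame, df.to_csv(path, index=False) should produce an equivalent file without the extra helper.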