python · pandas · dataframe · web-scraping · jupyter-notebook

After scraping data from a website and converting it to CSV, Excel shows only the column headers and no rows


import requests
import pandas as pd
from bs4 import BeautifulSoup

url = "https://www.dsebd.org/top_20_share.php"
r = requests.get(url)
soup = BeautifulSoup(r.text, "lxml")
table = soup.find("table", class_="table table-bordered background-white shares-table")
top = table.find_all("th")
# header text from each <th> cell
header = [th.text for th in top]
    
df = pd.DataFrame(columns= header)
print(df)

row1 = table.find_all("tr")
for r1 in row1[1:]:
    # trading-code text from the link in the second cell
    ftd = r1.find_all("td")[1].find("a", class_="ab1").text.strip()
    data = r1.find_all("td")[1:]
    r2 = [ele.text for ele in data]
    r2.insert(0, ftd)
    df.loc[len(df)] = r2
print(df)

df.to_csv("top_20_share_value_22.csv")

How can I make the rows visible after converting to CSV and opening the file in Excel?

I have gone through the code above.


Solution

  • The script itself works, and the resulting CSV can be imported into Excel directly. Alternatively, export the data with `.to_excel('your_excel_file.xlsx')` and open that file in Excel.
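A minimal sketch of both export paths, using a tiny stand-in DataFrame (the filenames and the openpyxl engine are assumptions; any installed Excel engine works for `.to_excel`):

```python
import pandas as pd

# tiny stand-in for the scraped table
df = pd.DataFrame({"TRADING CODE": ["ABC", "XYZ"], "CLOSEP": [10.5, 7.2]})

# CSV: index=False keeps pandas' row index out of the file,
# so Excel shows only the real columns
df.to_csv("top_20_share_value_22.csv", index=False)

# or write a native Excel workbook directly
# (requires an engine such as openpyxl or xlsxwriter)
try:
    df.to_excel("top_20_share_value_22.xlsx", index=False)
except ImportError:
    pass  # no Excel engine installed; the CSV above still opens in Excel
```

Opening the `.xlsx` in Excel avoids the CSV import dialog entirely.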


    Since you are already working with pandas, just use `pandas.read_html`, which uses lxml or BeautifulSoup in the background to parse the tables.

    import pandas as pd
    
    df_list = pd.read_html('https://www.dsebd.org/top_20_share.php', match='TRADING CODE')
    
    # first table only
    df_list[0].to_csv("first_top_20_share_value_22.csv")
    
    # all three tables concatenated into one frame
    pd.concat(df_list, ignore_index=True).to_csv("all_top_20_share_value_22.csv")
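If the goal is instead one workbook with each scraped table on its own sheet, `pandas.ExcelWriter` can do that as well. A sketch with toy frames standing in for `df_list` (the sheet names and filename are assumptions, and an Excel engine such as openpyxl must be installed):

```python
import pandas as pd

# stand-ins for the tables pd.read_html would return
df_list = [
    pd.DataFrame({"TRADING CODE": ["A"], "VALUE": [1.0]}),
    pd.DataFrame({"TRADING CODE": ["B"], "VALUE": [2.0]}),
]

# one workbook, one sheet per table
with pd.ExcelWriter("all_top_20_tables.xlsx") as writer:
    for i, table in enumerate(df_list, start=1):
        table.to_excel(writer, sheet_name=f"table_{i}", index=False)
```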