python dataframe finance yfinance

Download S&P 500 firms' stock data AND their GICS identifier in Python


I am trying to use yfinance to download financial data for the S&P 500 firms. However, I would also like to include the GICS sector code for each firm, so that I can then divide the data into smaller datasets according to the GICS code. This is my attempt:

    import pandas as pd
    import yfinance as yf
    import datetime
    payload=pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
    first_table = payload[0]
    df = first_table
    df.head()
    tickers_symbols=df['Symbol'].values.tolist()
    GICS_sectors = df['GICS Sector'].values.tolist()
    GICS=pd.DataFrame(GICS_sectors)
    data = yf.download(tickers_symbols, period='1mo')
    data['GICS']=GICS
    print(data.head)
    data.to_csv('stock_prices.csv')

However, I noticed that, besides giving me an empty list when I save the output to a CSV file, it yields a column vector, which is incompatible with the stock dataframe, which has days as rows and firms as columns. Any idea how to solve this issue? Solutions using other packages are also welcome.


Solution

  • Although it is slower than downloading all the stocks in a single call, you can download each stock separately, add its ticker and GICS sector as extra columns, and collect everything in one data frame. The resulting long (vertical) format, with one row per stock per day, is easy to work with in later analysis.

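    import pandas as pd
    import yfinance as yf
    
    # The first table on the page lists the S&P 500 constituents.
    payload = pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
    df = payload[0]
    tickers_symbols = df['Symbol'].values.tolist()
    GICS_sectors = df['GICS Sector'].values.tolist()
    
    frames = []
    for t, s in zip(tickers_symbols, GICS_sectors):
        # Yahoo Finance writes share classes with a dash (BRK-B); Wikipedia uses a dot (BRK.B).
        tmp = yf.download(t.replace('.', '-'), period='1mo', progress=False)
        # Some yfinance releases return two-level (field, ticker) column labels
        # even for a single ticker; keep only the field names in that case.
        if isinstance(tmp.columns, pd.MultiIndex):
            tmp.columns = tmp.columns.get_level_values(0)
        tmp.reset_index(inplace=True)
        tmp['Ticker'] = t
        tmp['GICS'] = s
        frames.append(tmp)
    
    # DataFrame.append was removed in pandas 2.0, so concatenate once at the end.
    data = pd.concat(frames, ignore_index=True)
    
    data.to_csv('stock_prices.csv', sep=',')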
    
             Date        Open        High         Low       Close   Adj Close     Volume  Ticker         GICS
    0  2021-08-09  197.649994  198.149994  196.779999  197.429993  195.934311  1193300.0     MMM  Industrials
    1  2021-08-10  198.240005  199.490005  197.699997  199.250000  197.740524  1598400.0     MMM  Industrials
    2  2021-08-11  200.000000  201.770004  199.309998  201.570007  200.042969  2217400.0     MMM  Industrials
    3  2021-08-12  201.479996  202.369995  200.360001  201.429993  199.904007  1231800.0     MMM  Industrials
    4  2021-08-13  201.229996  201.710007  200.289993  200.580002  199.060455  1910700.0     MMM  Industrials
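
    Once the data is in this long format, splitting it into one dataset per GICS sector (the goal stated in the question) can be done with a groupby. The following is only a minimal sketch: the small `data` frame is a made-up stand-in with the same `Ticker` and `GICS` columns, so it runs without downloading anything, and the name `by_sector` is just an illustrative choice.

```python
import pandas as pd

# Hypothetical stand-in for the long-format frame, so the example runs
# without a network connection. Tickers and prices are made up.
data = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2021-08-09", "2021-08-10", "2021-08-09", "2021-08-10", "2021-08-09"]
    ),
    "Close": [10.0, 10.5, 20.0, 19.5, 30.0],
    "Ticker": ["AAA", "AAA", "BBB", "BBB", "CCC"],
    "GICS": ["Industrials", "Industrials", "Health Care", "Health Care", "Industrials"],
})

# Iterating over a groupby yields one (sector, sub-frame) pair per
# distinct value in the GICS column.
by_sector = {
    sector: frame.reset_index(drop=True)
    for sector, frame in data.groupby("GICS")
}

# Each sector is now its own DataFrame.
industrials = by_sector["Industrials"]
print(sorted(by_sector))
print(industrials["Ticker"].unique().tolist())
```

    Each sub-frame can then be analysed on its own or written to a separate file with `to_csv`.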