I'm using yfinance.download to get data for a bunch of stocks, and I'm able to play around with these stocks using the returned pandas DataFrame.
I want to drop specific stocks from my dataframe, but I am not able to...
Minimum Code to Reproduce:
import yfinance as yf

# several stocks for testing purposes, get dataframe
tickers = ['AAPL', 'TSLA', 'AMZN', 'GOOGL', 'MSFT', 'META', 'NVDA', 'PYPL', 'ADBE', 'NFLX']
data = yf.download(tickers, period="1y", interval="1wk", group_by='ticker')
print(data.columns.levels[0])
# trying to remove the first one of these level 0 columns...
data = data.drop(columns=data.columns.levels[0][0], axis=1, level=0, inplace=False)
print(data.columns.levels[0])
As you can see, AAPL remains in both the first output and second output. Not sure why it isn't being deleted.
I've tried playing around with inplace=True as well and not assigning to data, but I still get the same issue.
Let me know if there is anything else I can provide for you guys, thanks in advance.
When you drop columns, pandas doesn't automatically clean up unused levels in the MultiIndex. The levels still contain all original values even if they're no longer used.
If you print data.columns, you can see that the requested "AAPL" has been removed.
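Here is a small offline sketch of that check. The frame is a made-up stand-in for the yfinance result (three tickers, two fields, one row), so it runs without a network call. It compares what columns.levels reports with the labels the columns actually use.

```python
import pandas as pd

# Hypothetical stand-in for the yf.download(..., group_by='ticker') result:
# level 0 holds the ticker, level 1 holds the field name.
columns = pd.MultiIndex.from_product([["AAPL", "MSFT", "TSLA"], ["Close", "Open"]])
data = pd.DataFrame([[1.0] * len(columns)], columns=columns)

dropped = data.drop(columns="AAPL", level=0)

# .levels is the stored set of possible labels; it is not pruned by drop().
stale_levels = list(dropped.columns.levels[0])
# get_level_values() reads the labels that the remaining columns really use.
used_labels = list(dropped.columns.get_level_values(0).unique())

print(stale_levels)
print(used_labels)
```

If all you need is the list of remaining tickers, reading them through get_level_values(0) is enough and you can leave the levels alone.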
To update the FrozenList that data.columns.levels returns, you will need to remove the unused levels.
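import yfinance as yf

tickers = ['AAPL', 'TSLA', 'AMZN', 'GOOGL', 'MSFT', 'META', 'NVDA', 'PYPL', 'ADBE', 'NFLX']
data = yf.download(tickers, period="1y", interval="1wk", group_by='ticker')
# I have changed the code here for readability.
# columns= already targets the column axis, so axis=1 is not needed.
data = data.drop(columns="AAPL", level=0)
# Rebuild the MultiIndex so its levels only hold labels that are still in use.
data.columns = data.columns.remove_unused_levels()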
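To try the fix without downloading anything, here is a minimal sketch on a synthetic frame. The tickers, field names and values are placeholders I made up, not real yfinance output. It records level 0 before and after remove_unused_levels().

```python
import pandas as pd

# Hypothetical two-level columns (ticker, field) mimicking group_by='ticker'.
columns = pd.MultiIndex.from_product([["AAPL", "MSFT", "TSLA"], ["Close", "Open"]])
data = pd.DataFrame([[1.0] * len(columns)], columns=columns)

data = data.drop(columns="AAPL", level=0)
before = list(data.columns.levels[0])  # drop() leaves the stored levels untouched

# remove_unused_levels() returns a new MultiIndex, so assign it back.
data.columns = data.columns.remove_unused_levels()
after = list(data.columns.levels[0])

print(before)
print(after)
```

Note that remove_unused_levels() does not modify the index in place, which is why the result has to be assigned back to data.columns.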