python pandas yfinance

Returning a view versus a copy. A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value ins


I downloaded price and volume data for several stocks from Yahoo Finance (yfinance). I then created a Closes dataframe that keeps only the closing prices and volumes (I removed the unnecessary information). Next, I want to add columns to Closes with the average volume for the year, quarter, month and week. But when I run the code, a warning appears: "A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead". I read the article at https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy, but I did not understand what I should do in my situation. I want to eliminate the warning and have the added columns calculated correctly. How do I resolve this warning (not ignore it, but fix it)?

import yfinance as yf
import pandas as pd
import warnings
warnings.filterwarnings("ignore", message="The 'unit' keyword in TimedeltaIndex construction is deprecated and will be removed in a future version. Use pd.to_timedelta instead.", category=FutureWarning, module="yfinance.utils")
warnings.filterwarnings("ignore", category=pd.errors.PerformanceWarning)
#warnings.filterwarnings("ignore", message="DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`", category=pd.errors.PerformanceWarning)

TickersList=['A', 'AAL', 'AAPL', 'ABBV', 'ABNB', 'ABT', 'ACGL', 'ACN', 'ADBE', 'ADI', 'ADM']
Stocks=yf.download(TickersList[0::1], period="1y", interval="1d", group_by='ticker')
Stocks.sort_index(level=0,axis=1,inplace=True)
Closes=Stocks.loc[:, (slice(None), ['Close', 'Volume'])]
for i in Closes.columns.get_level_values(0): 
    Closes.loc[:,(i,'Meam1Y')]=Closes.loc[:,(i,'Volume')].rolling (250).mean() 
    Closes.loc[:,(i,'Meam1Q')]=Closes.loc[:,(i,'Volume')].rolling (62).mean() 
    Closes.loc[:,(i,'Meam1M')]=Closes.loc[:,(i,'Volume')].rolling (20).mean() 
    Closes.loc[:,(i,'Meam1W')]=Closes.loc[:,(i,'Volume')].rolling (5).mean() 
Closes

Warning full text:

C:\Users\iiiva\AppData\Local\Temp\ipykernel_27188\26776804.py:4: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  Closes.loc[:,(i,'Meam1Y')]=Closes.loc[:,(i,'Volume')].rolling (250).mean() 

Solution

  • The warning is caused by these lines:

        Closes.loc[:, (i, 'Meam1Y')] = Closes.loc[:, (i, 'Volume')].rolling(250).mean()
        Closes.loc[:, (i, 'Meam1Q')] = Closes.loc[:, (i, 'Volume')].rolling(62).mean()
        Closes.loc[:, (i, 'Meam1M')] = Closes.loc[:, (i, 'Volume')].rolling(20).mean()
        Closes.loc[:, (i, 'Meam1W')] = Closes.loc[:, (i, 'Volume')].rolling(5).mean()
    

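    Closes is created by slicing Stocks with .loc, so pandas treats it as a possible copy of part of Stocks and warns that the assignment may not do what you expect. In practice, the warning is triggered only when a new column is created in the Closes dataframe, not when you rerun the for loop after the Meam columns have been created (careful, Meam is a typo, presumably for Mean). To fix the warning rather than suppress it, explicitly make Closes an independent copy: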

    Closes = Stocks.loc[:, (slice(None), ['Close', 'Volume'])].copy()
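The fix can be checked without downloading anything. Below is a minimal sketch that assumes a small synthetic frame in place of the yfinance data: the tickers AAA and BBB, the random values, and the names stocks, closes and windows are made up for illustration. It also iterates over the unique tickers, because get_level_values(0) returns each ticker once per column, and it spells the new columns Mean rather than Meam.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the yfinance download (hypothetical tickers and values).
dates = pd.bdate_range("2024-01-01", periods=30)
columns = pd.MultiIndex.from_product([["AAA", "BBB"], ["Close", "Open", "Volume"]])
rng = np.random.default_rng(0)
stocks = pd.DataFrame(
    rng.integers(1_000, 10_000, size=(len(dates), len(columns))).astype(float),
    index=dates,
    columns=columns,
)

# .copy() makes closes an independent frame instead of a slice of stocks.
closes = stocks.loc[:, (slice(None), ["Close", "Volume"])].copy()

# Window lengths in trading days; only the short ones fit in 30 rows of data.
windows = {"Mean1M": 20, "Mean1W": 5}

# unique() visits each ticker once instead of once per existing column.
for ticker in closes.columns.get_level_values(0).unique():
    volume = closes[(ticker, "Volume")]
    for name, window in windows.items():
        closes[(ticker, name)] = volume.rolling(window).mean()

# Group each ticker's columns together again.
closes = closes.sort_index(axis=1)
print(closes.tail())
```

Because closes is its own object after .copy(), the new columns are added only to closes and stocks is left unchanged.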