python pandas dataframe chained-assignment

Action with pandas SettingWithCopyWarning


I am trying to delete some columns and convert the values in one column with

df2.drop(df2.columns[[0, 1, 3]], axis=1, inplace=True)
df2['date'] = df2['date'].map(lambda x: str(x)[1:])
df2['date'] = df2['date'].str.replace(':', ' ', 1)
df2['date'] = pd.to_datetime(df2['date'])

and for each of these lines I get

  df2.drop(df2.columns[[0, 1, 3]], axis=1, inplace=True)
C:/Users/…/Desktop/projects/youtube_log/filter.py:11: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

What is the problem here?


Solution

  • Your df2 is a slice of another DataFrame. You need to copy it explicitly with df2 = df2.copy() just before you call drop.

    Consider the following dataframe:

    import pandas as pd
    import numpy as np
    
    
    df1 = pd.DataFrame(np.arange(20).reshape(4, 5), list('abcd'), list('ABCDE'))
    
    df1
    


    Let me assign a slice of df1 to df2:

    df2 = df1[['A', 'C']]
    


    df2 is now a slice of df1 and should trigger the pesky SettingWithCopyWarning if we try to change things in df2. Let's take a look.

    df2.drop('c')
    


    No problems. How about:

    df2.drop('c', inplace=True)
    

    There it is: this time pandas emits the SettingWithCopyWarning.

    The problem is that pandas tries to be efficient and tracks that df2 points to the same data as df1. It preserves that relationship. The warning is telling you that you shouldn't try to modify the original DataFrame through the slice.

    Notice that when we look at df2, row 'c' has been dropped.

    df2
    


    And looking at df1 we see that row 'c' is still there.

    df1
    


    pandas made a copy of df2 and then dropped row 'c' from that copy. This is potentially inconsistent with our intent, considering that we made df2 a slice of df1 that points to the same data. So pandas is warning us.
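
    The walkthrough above can be condensed into one script. This is a minimal sketch that reuses the same df1 and df2. It records any warnings instead of assuming one appears, because whether SettingWithCopyWarning is emitted depends on the pandas version and its copy-on-write setting. The variable name caught is introduced here only for illustration.

```python
import warnings

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(20).reshape(4, 5), list('abcd'), list('ABCDE'))
df2 = df1[['A', 'C']]  # column selection derived from df1

# Record warnings rather than letting them print to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df2.drop('c', inplace=True)

# Show the category of whatever was raised (possibly nothing on newer pandas).
for w in caught:
    print(w.category.__name__)

print(list(df2.index))  # row 'c' is gone from df2
print(list(df1.index))  # row 'c' is still present in df1
```

    Either way, the two index listings show the outcome described above: the drop affects df2 only, and df1 keeps all four rows.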

    To avoid the warning, make the copy yourself.

    df2 = df2.copy()
    # or
    df2 = df1[['A', 'C']].copy()
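
    Applied to the code in the question, the fix might look like the sketch below. The question does not show the data, so the frame raw, its column names, and the Apache-style timestamps are hypothetical stand-ins chosen so that the same drop, map, and replace steps make sense. The explicit format passed to pd.to_datetime is likewise an assumption about what the dates look like.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data (not shown in the question).
raw = pd.DataFrame({
    'ip': ['10.0.0.1', '10.0.0.2', '10.0.0.3'],
    'ident': ['-', '-', '-'],
    'date': ['[10/Oct/2000:13:55:36',
             '[11/Oct/2000:08:01:02',
             '[12/Oct/2000:23:59:59'],
    'user': ['-', '-', '-'],
    'request': ['GET /a', 'POST /b', 'GET /c'],
})

# df2 is derived from raw; .copy() makes it an independent DataFrame.
df2 = raw[raw['request'].str.startswith('GET')].copy()

# Same steps as in the question, now applied to the explicit copy.
df2 = df2.drop(columns=df2.columns[[0, 1, 3]])
df2['date'] = df2['date'].map(lambda x: str(x)[1:])                # strip leading '['
df2['date'] = df2['date'].str.replace(':', ' ', n=1, regex=False)  # first ':' only
df2['date'] = pd.to_datetime(df2['date'], format='%d/%b/%Y %H:%M:%S')

print(df2)
```

    Because df2 owns its data after the copy, the assignments change df2 only and raw keeps its original columns and strings.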