
Pandas .loc with sort_values assignment does not work


I am trying to assign a value and/or a series of values to a slice of a pandas dataframe selected using .loc after sorting values.

For reference, this is the code I am trying to run, with a fixed string value ('filler'):

df.sort_values(['col_1','col_2']).loc[
    df.col_1.isin(rows_selector),'col_2'] = 'filler'

and with a pandas series containing the different values (the series is of the same length as the output of .loc):

df.sort_values(['col_1','col_2']).loc[
    df.col_1.isin(rows_selector),'col_2'] = filler_series

I would expect the above to assign the desired values in place in df, but that does not seem to happen. Would appreciate any help.


Solution

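  • You need an intermediate step for this to work. sort_values returns a new, sorted DataFrame rather than sorting df in place, so the chained .loc assignment writes to that temporary object, which is then discarded, and df itself is never modified. On top of that, the mask used inside the .loc (df.col_1.isin()) is still built from the unsorted df. For example: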

    import pandas as pd

    df = pd.DataFrame({'col_1':[1,1,7,5,9,7],
                       'col_2':[4,1,5,6,6,3]})
    
       col_1  col_2
    0      1      4
    1      1      1
    2      7      5
    3      5      6
    4      9      6
    5      7      3
    

    Running df.sort_values(['col_1','col_2']) gives:

       col_1  col_2
    1      1      1
    0      1      4
    3      5      6
    5      7      3
    2      7      5
    4      9      6
    

    However, running df.col_1.isin(rows_selector) (where rows_selector = [1, 5, 9]) returns the following:

    0     True
    1     True
    2    False
    3     True
    4     True
    5    False
    Name: col_1, dtype: bool
    

    Notice that the True values do not sit at the same positions as in the sort_values output, because the mask is still built from the original dataframe (the index labels still match; only the row order differs).

    You should assign the sorted result back to df, or to a new variable such as df_sorted, and then use .loc on that, like this:

    df_sorted = df.sort_values(['col_1','col_2'])
    df_sorted.loc[df_sorted.col_1.isin(rows_selector)]
    
       col_1  col_2
    1      1      1
    0      1      4
    3      5      6
    4      9      6
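
To carry this through to the actual assignment asked about in the question, a minimal sketch might look like the following. It reuses the example data from above. The integer fillers (0 and the values in filler_series) are made-up stand-ins for 'filler', chosen here only so that col_2 keeps its integer dtype. The sketch also assumes the series should fill the selected rows in sorted order; since assigning a Series through .loc aligns on index labels rather than on position, the values are passed as a NumPy array.

```python
import pandas as pd

# Same toy data as in the answer above.
df = pd.DataFrame({'col_1': [1, 1, 7, 5, 9, 7],
                   'col_2': [4, 1, 5, 6, 6, 3]})
rows_selector = [1, 5, 9]

# Chained form from the question: sort_values() returns a new DataFrame,
# so the assignment lands on that temporary object and df is left untouched.
df.sort_values(['col_1', 'col_2']).loc[
    df.col_1.isin(rows_selector), 'col_2'] = 0
unchanged = df['col_2'].tolist()

# Keep a reference to the sorted frame and build the mask from it.
df_sorted = df.sort_values(['col_1', 'col_2'])
mask = df_sorted.col_1.isin(rows_selector)

# Scalar assignment: every selected row gets the same value.
df_sorted.loc[mask, 'col_2'] = 0

# Series assignment: hypothetical filler values, one per selected row.
# .to_numpy() drops the index so the values are placed positionally,
# in the sorted row order, instead of being aligned on index labels.
filler_series = pd.Series([10, 20, 30, 40])
df_sorted.loc[mask, 'col_2'] = filler_series.to_numpy()

print(df_sorted)
```

If the result is needed under the original name, df_sorted can simply be assigned back to df afterwards.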