python pandas dataframe

Select a dataframe column and replace values by index where a condition is True


It was hard to find the right title, so here is what I want:

I have a dataframe with a column col1 whose values are val1, val2 and val3.

I want to select the rows where this column is val2 or val3 and replace those values with val4, but not for all of them: only for a "slice" of the matching rows, between positions x and y:

import pandas as pd
data = {'col1':["val1","val3","val3","val2","val1","val2","val3","val1"],'col2':["val3","val1","val2","val1","val2","val3","val2","val2"]}
df = pd.DataFrame(data)
df
   col1  col2
0  val1  val3
1  val3  val1
2  val3  val2
3  val2  val1
4  val1  val2
5  val2  val3
6  val3  val2
7  val1  val2

Select the rows where col1 is val2 or val3:

(df['col1']=="val2") | (df['col1']=="val3")
0    False
1     True
2     True
3     True
4    False
5     True
6     True
7    False

Now I want to replace col1 in the first 4 True rows (the rows with index 1, 2, 3 and 5) with val4, in order to obtain:

   col1  col2
0  val1  val3
1  val4  val1
2  val4  val2
3  val4  val1
4  val1  val2
5  val4  val3
6  val3  val2
7  val1  val2

I thought of something like:

df[((df['col1']=="val2") | (df['col1']=="val3"))==True][0:4] = "val4"

but it doesn't work (no surprise...).

I think I need to use something like .loc.

Thanks for any clue.


Solution

  • First, build a boolean mask for the rows that match the condition:

    condition = (df['col1'] == "val2") | (df['col1'] == "val3")
    

    Then take the index labels of the first 4 rows that match the condition:

    indices = df[condition].index[:4]
    

    Finally, use loc to replace col1 in the selected rows with val4:

    df.loc[indices, 'col1'] = 'val4'
    

    Output

       col1  col2
    0  val1  val3
    1  val4  val1
    2  val4  val2
    3  val4  val1
    4  val1  val2
    5  val4  val3
    6  val3  val2
    7  val1  val2
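
For reference, the three steps above can be put together as one runnable sketch. It uses the data from the question. Two things are not in the original answer and should be read as assumptions: isin() is swapped in as a shorter way to write the two == comparisons, and x and y are hypothetical names for the start and end of the "slice", counted as positions among the matching rows rather than as row labels.

```python
import pandas as pd

data = {
    "col1": ["val1", "val3", "val3", "val2", "val1", "val2", "val3", "val1"],
    "col2": ["val3", "val1", "val2", "val1", "val2", "val3", "val2", "val2"],
}
df = pd.DataFrame(data)

# Boolean mask: True where col1 is val2 or val3
# (same result as (df['col1'] == "val2") | (df['col1'] == "val3"))
condition = df["col1"].isin(["val2", "val3"])

# x and y are hypothetical bounds: positions among the matching rows,
# so 0 and 4 mean "the first 4 rows where the mask is True"
x, y = 0, 4
indices = df[condition].index[x:y]

# Assign through loc with row labels and the column name in one call,
# so the original dataframe is modified rather than a temporary copy
df.loc[indices, "col1"] = "val4"

print(df)
```

With x, y = 0, 4 this selects the same rows as index[:4] in the answer. Other bounds pick a different window of matching rows, for example x, y = 1, 3 would only touch the second and third match.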