I'm aiming to subset a df to the rows where the first two characters of a string match a value in a list and are the same in two separate columns. For example, the list first_2 holds the values I'm interested in returning. When one of these values is found in both Letters and Value, I want to keep that row.
However, I don't want rows where AB is found in one column and DA in the other. I'm only after an identical match.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Letters': ('AB', 'BD', 'AB', 'DA', 'EG', 'FA'),
    'Value': ('AB', 'BC', 'DA', 'DA', 'EH', 'FA'),
    'Position': (1, np.nan, 3, 4, np.nan, 6),
})
first_2 = ['AB', 'DA']
# My attempt, which also returns the unwanted AB/DA row:
df1 = df[(~df['Letters'].str[0:1].isin(first_2)) & (df['Value'].isin(first_2))]
Intended output:
  Letters Value  Position
0      AB    AB       1.0
3      DA    DA       4.0
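Take the first two characters of Letters once, then keep the rows where that prefix is in first_2 and also equals Value:
s = df['Letters'].str[:2]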
out = df[s.isin(first_2) & s.eq(df['Value'])]
out
  Letters Value  Position
0      AB    AB       1.0
3      DA    DA       4.0
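For reference, the original attempt misses for two reasons: .str[0:1] takes only the first character, so it can never match a two-character entry in first_2, and the leading ~ negates the test. Here is a self-contained sketch of the same idea with each condition split out. It varies the approach slightly by also slicing Value to its first two characters, on the assumption that Value may hold longer strings in real data. The names in_list and same are just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Letters': ('AB', 'BD', 'AB', 'DA', 'EG', 'FA'),
    'Value': ('AB', 'BC', 'DA', 'DA', 'EH', 'FA'),
    'Position': (1, np.nan, 3, 4, np.nan, 6),
})
first_2 = ['AB', 'DA']

# Condition 1: the first two characters of Letters are one of the wanted values.
prefix = df['Letters'].str[:2]
in_list = prefix.isin(first_2)

# Condition 2: Value starts with that same prefix (identical match, not AB vs DA).
# Slicing Value is an assumption for longer strings; it changes nothing here.
same = prefix.eq(df['Value'].str[:2])

out = df[in_list & same]
print(out)
```

With the sample data this keeps rows 0 and 3. Row 2 (AB, DA) passes the first condition but fails the second, and row 5 (FA, FA) matches across columns but is not in first_2.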