python pandas locate

Locate rows where several columns match specific values in pandas


Hello, I know how to use

`df2.loc[df2[2]=="gene"]` for instance 

but let's say I get this from it:

                0         1     2        3        4     5  6  7   8
0  NW_011626968.1  AUGUSTUS  gene  1671821  1675615  0.04  -  .  g1
7  NW_011626968.1  AUGUSTUS  gene  1677196  1679599  0.13  +  .  g2

How can I also require columns 3 and 4 to match specific values, so that I get only row 7?

I tried:

df2.loc[df2[2]=="gene"]  & df2.loc[df2[3]=="1677196"] & df2.loc[df2[4]=="1679599"]

Solution

  • You need to chain the boolean masks, not the filtered rows. The parentheses are also required, because `&` has higher operator precedence than `==`:

    # if the columns hold strings
    #df3 = df2[(df2[2]=="gene") & (df2[3]=="1677196") & (df2[4]=="1679599")]

    # if the columns hold numbers
    df3 = df2[(df2[2]=="gene") & (df2[3]==1677196) & (df2[4]==1679599)]
    print(df3)
                    0         1     2        3        4     5  6  7   8
    7  NW_011626968.1  AUGUSTUS  gene  1677196  1679599  0.13  +  .  g2
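
To try this without the original file, here is a minimal self-contained sketch. It assumes the two "gene" rows shown in the question, rebuilt by hand with numeric values in columns 3 to 5; your real `df2` may hold strings there, depending on how it was read. The names `rows`, `mask` and `no_match` are only illustrative.

```python
import pandas as pd

# Rebuild the two "gene" rows from the question (an assumption about the
# data: columns 3, 4 and 5 are numeric here). The default integer column
# labels 0..8 match the ones used in the question.
rows = [
    ["NW_011626968.1", "AUGUSTUS", "gene", 1671821, 1675615, 0.04, "-", ".", "g1"],
    ["NW_011626968.1", "AUGUSTUS", "gene", 1677196, 1679599, 0.13, "+", ".", "g2"],
]
df2 = pd.DataFrame(rows, index=[0, 7])

# The dtypes tell you whether to compare against numbers or strings.
print(df2.dtypes)

# Each comparison yields a boolean Series; & combines them element-wise.
# The parentheses keep & from being evaluated before ==.
mask = (df2[2] == "gene") & (df2[3] == 1677196) & (df2[4] == 1679599)
df3 = df2.loc[mask]
print(df3)

# Comparing a numeric column against a string matches nothing,
# which is why the string/number distinction matters.
no_match = df2[df2[3] == "1677196"]
print(len(no_match))
```

If the filter comes back empty on your own data, checking `df2.dtypes` first is usually the quickest way to see which of the two variants applies.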