python pandas

pandas DataFrame.query with Series.str.startswith and a tuple returns an empty DataFrame


In the following example, the final DataFrame (df4) comes back empty. What am I missing?

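import pandas as pd

df = pd.DataFrame({
    'foo': ['010', '020', '030', '040', '050', '060', '070', '080', '090', None],
    'bar': ['aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff', 'ggg', 'hhh', 'iii', 'jjj']
})
# Boolean indexing: a single prefix and a tuple of prefixes both work.
df1 = df[df.foo.str.startswith('01', na=False)]
df2 = df[df.foo.str.startswith(('01', '03'), na=False)]
# query(): the single prefix works, but the inline tuple gives an empty result.
df3 = df.query("foo.str.startswith('01', na=False)")
df4 = df.query("foo.str.startswith(('01', '03'), na=False)")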

Solution

  • Defining the tuple beforehand and referencing it in the query string with @ appears to resolve the issue:

    t = ('01', '03')
    
    df.query("foo.str.startswith(@t, na=False)")
    

    output:

        foo bar
    0   010 aaa
    2   030 ccc
    

    From the DataFrame.query documentation:

    Parameters

    expr : str

        The query string to evaluate.
    
        You can refer to variables in the environment by prefixing 
        them with an ‘@’ character like @a + b.
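A quick way to check the workaround is to compare the query() result against plain boolean indexing on the same data. The snippet below is only a sketch: the names prefixes, via_mask and via_query are made up for illustration, and it assumes a pandas version in which str accessor calls are accepted inside query(), as in the question.

```python
import pandas as pd

df = pd.DataFrame({
    'foo': ['010', '020', '030', '040', '050', '060', '070', '080', '090', None],
    'bar': ['aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff', 'ggg', 'hhh', 'iii', 'jjj'],
})

# Hypothetical variable name; any local variable can be referenced with @.
prefixes = ('01', '03')

# Reference result: boolean indexing with the tuple passed directly.
via_mask = df[df.foo.str.startswith(prefixes, na=False)]

# Same filter through query(), with the tuple pulled in from the
# enclosing scope via @ instead of being written inline in the string.
via_query = df.query("foo.str.startswith(@prefixes, na=False)")

# Both approaches should select the same rows.
print(via_query.equals(via_mask))
print(via_query)
```

If the two frames match, the @ reference is behaving like the direct call, which suggests the empty result comes from how the inline tuple literal is handled in the query string rather than from startswith itself.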