pythonpython-3.xpandaspython-3.6multiline

pandas dataframe multiline query


Say I have a dataframe

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(10, size=(10,3)), columns=['a', 'b', 'c'])

if I now try to query it using the query method:

this works:

df.query('''a > 3 and b < 9''')

this throws an error:

df.query(
    '''
        a > 3 and
        b < 9
    '''
)

I tried many variations of multiline strings but the result is always the following error:

~/ven/lib/python3.6/site-packages/pandas/core/computation/eval.py in eval(expr, parser, engine, truediv, local_dict, global_dict, resolvers, level, target, inplace)
    306     if multi_line and target is None:
    307         raise ValueError(
--> 308             "multi-line expressions are only valid in the "
    309             "context of data, use DataFrame.eval"
    310         )

ValueError: multi-line expressions are only valid in the context of data, use DataFrame.eval

Does anyone know how to make it work? The problem is that in reality I have a very long query to do and it would be very inconvenient having to write all in one line. I know I could use boolean indexing instead but my question is only about how to use multiline with the query method.

Thank you


Solution

  • Use multi-line char backslash ( \ )

    Ex:

    df = pd.DataFrame(np.random.randint(10, size=(10,3)), columns=['a', 'b', 'c'])
    print(df.query(
        '''
            a > 3 and \
            b < 9
        '''
    ))