
How to apply scipy filter in Polars DataFrame


I'm currently applying a Savitzky-Golay filter (savgol_filter) to my DataFrame, which has three columns, x, y and z, holding float values of a signal from a microcontroller.

In pandas it's straightforward:

df["x"] = savgol_filter(df["x"], 51, 1)
df["y"] = savgol_filter(df["y"], 51, 1)
df["z"] = savgol_filter(df["z"], 51, 1)

How can I implement the same in Polars?

I tried this:

df.with_columns(
    (savgol_filter(df["x"], 51, 1)).alias("x"),
    (savgol_filter(df["y"], 51, 1)).alias("y"),
    (savgol_filter(df["z"], 51, 1)).alias("z"),
)

The code above raises an AttributeError: 'numpy.ndarray' object has no attribute 'alias'

Edit: My question is fairly simple: what is the right syntax/method to apply a scipy filter to a Polars DataFrame?


Solution

  • Polars has a general way to apply custom functions, and which one you use depends on whether the function works on a single element (map_elements) or on a whole Series (map_batches). savgol_filter needs the whole signal at once, so in your case it is map_batches. For a more detailed explanation of both options, see the Polars user guide.

    Depending on whether you want to replace the original columns or keep them alongside the filtered ones, here are two solutions:

    Data

    import polars as pl
    from scipy.signal import savgol_filter
    import numpy as np
    
    n = 100
    df = pl.DataFrame(
        {
            "x": np.random.randint(0, 100, n),
            "y": np.random.randint(0, 100, n),
            "z": np.random.randint(0, 100, n),
        }
    )
    

    Solution 1

    df.select(pl.all().map_batches(lambda x: savgol_filter(x.to_numpy(), 51, 1)).explode())
    

    Solution 2

    df.with_columns(
        pl.all()
        .map_batches(lambda x: savgol_filter(x.to_numpy(), 51, 1))
        .explode()
        .name.suffix("_savgol")
    )
    

    If you don't want to apply the function to all columns, you can replace pl.all() with, for example, pl.col("x").