I'm currently applying scipy's savgol_filter to a DataFrame that has 3 columns, x, y and z, holding float values of a signal from a microcontroller.
In pandas it's straightforward:
df["x"] = savgol_filter(df["x"], 51, 1)
df["y"] = savgol_filter(df["y"], 51, 1)
df["z"] = savgol_filter(df["z"], 51, 1)
How can I implement the same in polars?
I tried this:
df.with_columns(
    (savgol_filter(df["x"], 51, 1)).alias("x"),
    (savgol_filter(df["y"], 51, 1)).alias("y"),
    (savgol_filter(df["z"], 51, 1)).alias("z"),
)
The above code raises an AttributeError: 'numpy.ndarray' object has no attribute 'alias'
Edit: My question is fairly simple: what is the right syntax/method to apply scipy filters to a polars DataFrame?
Polars has a general way to use custom functions, depending on whether you want to apply them to a single element (map_elements) or to a whole series (map_batches). In your case it would be the whole series, because the filter needs the neighbouring samples of each value. For a more detailed explanation of both options, you can check out the polars user guide.
Depending on whether you want to replace the columns or keep the original ones next to the filtered ones, here are the two solutions:
import polars as pl
from scipy.signal import savgol_filter
import numpy as np

# Example data: three columns of random integers
n = 100
df = pl.DataFrame(
    {
        "x": np.random.randint(0, 100, n),
        "y": np.random.randint(0, 100, n),
        "z": np.random.randint(0, 100, n),
    }
)

# Solution 1: return only the filtered columns, under their original names
df.select(pl.all().map_batches(lambda x: savgol_filter(x.to_numpy(), 51, 1)).explode())

# Solution 2: keep the original columns and add filtered copies with a "_savgol" suffix
df.with_columns(
    pl.all()
    .map_batches(lambda x: savgol_filter(x.to_numpy(), 51, 1))
    .explode()
    .name.suffix("_savgol")
)
If you don't want to apply the function to all columns, you can replace pl.all() with, for example, pl.col("x").