
Python polars -- how to apply function across multiple cols


How can I extend this

df = df.select(
  pl.col("x1").map_batches(custom_function).alias("new_x1")
)

to something like

df = df.select(
  pl.col("x1","x2").map_batches(custom_function).alias("new_x1", "new_x2")
)

Or is the way to go to do it one by one?

df = df.select(
  pl.col("x1").map_batches(custom_function).alias("new_x1"),
  pl.col("x2").map_batches(custom_function).alias("new_x2")
)

Solution

  • The syntax

    df.select(
        pl.col("x1", "x2").some_method_chain()
    )
    

    is equivalent to

    df.select(
        pl.col("x1").some_method_chain(),
        pl.col("x2").some_method_chain(),
    )
    

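    In particular, your example is almost correct; it only fails on the last call in the method chain, [...].alias("new_x1", "new_x2"). pl.Expr.alias accepts a single name, and because the chain is applied to each selected column separately, you are effectively trying to name every expression both "new_x1" and "new_x2". This can be fixed with pl.Expr.name.prefix, which prepends a string to each column's original name.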

    df.select(
        pl.col("x1", "x2").map_batches(custom_function).name.prefix("new_")
    )