
Filling multiple columns at once with polars' map_elements


Following up on this question, can I use polars' map_elements to fill multiple columns at once?

Say I have a custom Python function:

def my_complicated_function(row):

     # ...

     return value1, value2, value3

and I would like to use it as follows:

df = df.with_columns(
    pl.struct(pl.all())
      .map_elements(my_complicated_function, return_dtype=pl.Int64)
      .alias("column1", "column2", "column3")
)

except that alias does not create multiple columns at once. Can I use a struct for the output as well, or is there another way?


Solution

  • Your function returns a sequence of values, so declare the return type as a list, convert the list to a struct with named fields, and unnest it:

    df.with_columns(
        pl.struct(pl.all())
          .map_elements(my_complicated_function, return_dtype=pl.List(pl.Int64))
          .list.to_struct(fields=["column1","column2","column3"])
          .alias("result")
    ).unnest("result")
    

    Or, if you can change your function so that it returns a dictionary:

    def my_complicated_function(row):
    
         # ...
    
         return {"column1": value1, "column2": value2, "column3": value3}
    

    The return value then maps directly to a struct, so the column names are defined inside your function, and pl.DataFrame.unnest() expands the struct into separate columns:

    df.with_columns(
        pl.struct(pl.all())
          .map_elements(my_complicated_function, return_dtype=pl.Struct)
          .alias("result")
    ).unnest("result")