I have a function that returns a list-typed column, so one of my columns is a list. I'd like to turn this list column into multiple columns. For example:
use polars::prelude::*;
use polars::df;

fn main() {
    let s0 = Series::new("a", &[1i64, 2, 3]);
    let s1 = Series::new("b", &[1i64, 1, 1]);
    let s2 = Series::new("c", &[Some(2i64), None, None]);
    // construct a new ListChunked for a slice of Series.
    let list = Series::new("foo", &[s0, s1, s2]);
    // construct a few more Series.
    let s0 = Series::new("Group", ["A", "B", "A"]);
    let s1 = Series::new("Cost", [1, 1, 1]);
    let df = DataFrame::new(vec![s0, s1, list]).unwrap();
    dbg!(df);
}
At this stage, the DataFrame looks like this:
┌───────┬──────┬─────────────────┐
│ Group ┆ Cost ┆ foo             │
│ ---   ┆ ---  ┆ ---             │
│ str   ┆ i32  ┆ list [i64]      │
╞═══════╪══════╪═════════════════╡
│ A     ┆ 1    ┆ [1, 2, 3]       │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ B     ┆ 1    ┆ [1, 1, 1]       │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ A     ┆ 1    ┆ [2, null, null] │
└───────┴──────┴─────────────────┘
Question: from here, I'd like to get:
┌───────┬──────┬─────┬──────┬──────┐
│ Group ┆ Cost ┆ a   ┆ b    ┆ c    │
│ ---   ┆ ---  ┆ --- ┆ ---  ┆ ---  │
│ str   ┆ i32  ┆ i64 ┆ i64  ┆ i64  │
╞═══════╪══════╪═════╪══════╪══════╡
│ A     ┆ 1    ┆ 1   ┆ 2    ┆ 3    │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ B     ┆ 1    ┆ 1   ┆ 1    ┆ 1    │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ A     ┆ 1    ┆ 2   ┆ null ┆ null │
└───────┴──────┴─────┴──────┴──────┘
So I need something like .explode(), but oriented column-wise. Is there an existing function for this, or potentially a workaround?
Many thanks
Yes, you can. Polars lazy gives us access to the expression API, where we can use the list()
namespace to get elements by index.
let out = df
    .lazy()
    .select([
        all().exclude(["foo"]),
        col("foo").list().get(0).alias("a"),
        col("foo").list().get(1).alias("b"),
        col("foo").list().get(2).alias("c"),
    ])
    .collect()?;
dbg!(out);
┌───────┬──────┬─────┬──────┬──────┐
│ Group ┆ Cost ┆ a   ┆ b    ┆ c    │
│ ---   ┆ ---  ┆ --- ┆ ---  ┆ ---  │
│ str   ┆ i32  ┆ i64 ┆ i64  ┆ i64  │
╞═══════╪══════╪═════╪══════╪══════╡
│ A     ┆ 1    ┆ 1   ┆ 2    ┆ 3    │
│ B     ┆ 1    ┆ 1   ┆ 1    ┆ 1    │
│ A     ┆ 1    ┆ 2   ┆ null ┆ null │
└───────┴──────┴─────┴──────┴──────┘
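To make the "get element i of every list" idea concrete, here is a rough plain-Rust sketch of the same reshaping, using only the standard library and no Polars at all. The function name lists_to_columns and its width parameter are hypothetical and made up for this illustration. Padding short rows with None is an assumption of the sketch, not a statement about how any particular Polars version handles out-of-bounds indices.

```rust
/// Hypothetical helper: turn one list-valued column into `width` scalar columns.
/// Column `i` holds element `i` of every row; rows that are too short
/// contribute `None` (an assumption of this sketch).
fn lists_to_columns(rows: &[Vec<Option<i64>>], width: usize) -> Vec<Vec<Option<i64>>> {
    (0..width)
        .map(|i| {
            rows.iter()
                // `get(i)` is `None` when the row has fewer than `i + 1` elements;
                // `flatten` merges that case with an explicit null in the data.
                .map(|row| row.get(i).copied().flatten())
                .collect()
        })
        .collect()
}

fn main() {
    // Same values as the "foo" column in the question.
    let foo = vec![
        vec![Some(1), Some(2), Some(3)],
        vec![Some(1), Some(1), Some(1)],
        vec![Some(2), None, None],
    ];
    let cols = lists_to_columns(&foo, 3);
    for (name, col) in ["a", "b", "c"].iter().zip(&cols) {
        println!("{name}: {col:?}");
    }
}
```

Each inner Vec in the result corresponds to one of the new columns a, b, c, which is what the three get(..).alias(..) expressions in the select produce.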