
Polars DataFrame transformation


What is the best way to transform

df = pl.from_repr("""
┌─────┬─────┬─────┐
│ R   ┆ C   ┆ VAL │
│ --- ┆ --- ┆ --- │
│ str ┆ str ┆ i64 │
╞═════╪═════╪═════╡
│ r1  ┆ c1  ┆ 1   │
│ r1  ┆ c2  ┆ 2   │
│ r2  ┆ c1  ┆ 3   │
│ r2  ┆ c2  ┆ 4   │
│ r3  ┆ c1  ┆ 5   │
│ r3  ┆ c2  ┆ 6   │
└─────┴─────┴─────┘
""")

into:

┌─────┬─────┬─────┐
│ R   ┆ c1  ┆ c2  │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ r1  ┆ 1   ┆ 2   │
│ r2  ┆ 3   ┆ 4   │
│ r3  ┆ 5   ┆ 6   │
└─────┴─────┴─────┘

I was thinking about a series of group_by operations, but there should be a better solution.

Also, how could I transform it back to the original layout?


Solution

  • You can use the .pivot() method:

    df.pivot(on='C', index='R', values='VAL')
    
    shape: (3, 3)
    ┌─────┬─────┬─────┐
    │ R   ┆ c1  ┆ c2  │
    │ --- ┆ --- ┆ --- │
    │ str ┆ i64 ┆ i64 │
    ╞═════╪═════╪═════╡
    │ r1  ┆ 1   ┆ 2   │
    │ r2  ┆ 3   ┆ 4   │
    │ r3  ┆ 5   ┆ 6   │
    └─────┴─────┴─────┘
    

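    Turning the result back into the long layout is what the unpivot method is for. Note that pivoting discards the original row order, so the unpivoted rows may come back in a different order; if the order matters, sort the result afterwards.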