[SOLVED] Explode and duplicate rows in df and create a new column with the correct count of the values

As input, I have this DataFrame:

numberType	NumberInst	Type
1	None	Car
2	1	Bus
3	1	Plane

I used the following to explode and duplicate the rows:

df = df.with_columns(pl.col("NumberInst").alias("numberInstExclude"))
df = df.select(pl.exclude("numberInstExclude").repeat_by("numberType").explode())

The result I want is:

numberType	NumberInst	Type	NumberInstRep
1	None	Car	1
2	1	Bus	1
2	1	Bus	2
3	1	Plane	1
3	1	Plane	2
3	1	Plane	3

How can I compute "NumberInstRep"? The rule I want is:

if NumberInst is None then NumberInstRep = 1
else use numberType to have NumberInstRep

Thank you.

Solution

You can use pl.int_ranges() to build, for each row, a list that runs from 1 up to that row's repetition count, then call df.explode() on the new column so that each list element becomes its own row:

>>> import polars as pl
>>> df = pl.DataFrame({'type': [1, 2, 3], 'inst': [None, 1, 1], 'category': ['car', 'bus', 'plane']})
>>> df.with_columns(pl.int_ranges(1, 1+pl.col('type')).alias('rep')).explode('rep')
shape: (6, 4)
┌──────┬──────┬──────────┬─────┐
│ type ┆ inst ┆ category ┆ rep │
│ ---  ┆ ---  ┆ ---      ┆ --- │
│ i64  ┆ i64  ┆ str      ┆ i64 │
╞══════╪══════╪══════════╪═════╡
│ 1    ┆ null ┆ car      ┆ 1   │
│ 2    ┆ 1    ┆ bus      ┆ 1   │
│ 2    ┆ 1    ┆ bus      ┆ 2   │
│ 3    ┆ 1    ┆ plane    ┆ 1   │
│ 3    ┆ 1    ┆ plane    ┆ 2   │
│ 3    ┆ 1    ┆ plane    ┆ 3   │
└──────┴──────┴──────────┴─────┘