I have a dataframe already sorted by time like this:
| type | time |
| :--- | :--- |
| B | t1 |
| C | t2 |
| A | t3 |
| D | t4 |
| C | t5 |
| B | t6 |
| A | t7 |
| B | t8 |
| D | t9 |
| A | t10 |
I want to split the data frame at each occurrence of A and eventually get the following result:
group 1: {B:t1, C:t2}
group 2: {A:t3, D:t4, C:t5, B:t6}
group 3: {A:t7, B:t8, D:t9}
group 4: {A:t10}
Use GroupBy.apply with a lambda function that builds a dictionary for each group. The groups come from a helper Series, created by comparing type with 'A' and taking the cumulative sum with Series.cumsum:
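# build a {type: time} dictionary from the rows of one group
f = lambda x: dict(zip(x['type'], x['time']))
# the helper Series increases by 1 at every 'A', so each 'A' starts a new group
df1 = (df.groupby(df['type'].eq('A').cumsum().add(1).rename('group'))
         .apply(f)
         .add_prefix('group ')
         .reset_index(name='new'))
print(df1)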
group new
0 group 1 {'B': 't1', 'C': 't2'}
1 group 2 {'A': 't3', 'D': 't4', 'C': 't5', 'B': 't6'}
2 group 3 {'A': 't7', 'B': 't8', 'D': 't9'}
3 group 4 {'A': 't10'}
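
To see why the cumulative sum produces these groups, it may help to look at the helper Series on its own. The following is a minimal, self-contained sketch that rebuilds the sample frame from the question; the names `is_a`, `key` and `groups` are only illustrative, and it uses a plain dict comprehension in place of `GroupBy.apply`.

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "type": list("BCADCBABDA"),
    "time": [f"t{i}" for i in range(1, 11)],
})

# True on every 'A' row; the running total goes up by one at each 'A',
# so each row is labelled with the number of 'A' rows seen so far, plus 1.
is_a = df["type"].eq("A")
key = is_a.cumsum().add(1).rename("group")

# One dictionary per group, mapping type -> time.
groups = {
    f"group {k}": dict(zip(g["type"], g["time"]))
    for k, g in df.groupby(key)
}
print(key.tolist())
print(groups)
```

Two things to keep in mind with this approach: if the very first row is already an A, the numbering starts at 2, because the cumulative sum is 1 on that row before 1 is added; and if a type repeats between two A rows, the dictionary keeps only the last time for that type.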