I have a dataframe:
df =
No. | Scenario | Exe Seq | Action |
---|---|---|---|
1 | A | 1 | a |
2 | A | 2 | b |
3 | A | 3 | c |
4 | A | 1 | a |
5 | A | 2 | b |
6 | A | 1 | a |
These rows all belong to the same scenario, but some runs reach step 3 while others stop at step 2 or 1. I want to tell the runs apart.
The "Scenario" column may also contain values other than "A".
So the expected result is:
No. | Scenario | Exe Seq | Action | New_Scenario |
---|---|---|---|---|
1 | A | 1 | a | A_1 |
2 | A | 2 | b | A_1 |
3 | A | 3 | c | A_1 |
4 | A | 1 | a | A_2 |
5 | A | 2 | b | A_2 |
6 | A | 1 | a | A_3 |
IIUC use:
# a new sequence starts wherever the difference from the previous row is not 1
df['New_Scenario'] = df['Scenario'] + '_' + df['Exe Seq'].diff().ne(1).cumsum().astype(str)
print(df)
Or:
# a new sequence starts wherever Exe Seq equals 1
df['New_Scenario'] = df['Scenario'] + '_' + df['Exe Seq'].eq(1).cumsum().astype(str)
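Note that the counter above runs over the whole frame, so with several scenarios the numbers would continue from one scenario into the next. If the numbering should instead restart for each scenario (an assumption, since the question only shows "A"), the same start-at-1 flag can be accumulated per group. A minimal sketch, with a made-up scenario "B" added for illustration:

```python
import pandas as pd

# Sample data; the "B" rows are hypothetical and not part of the question.
df = pd.DataFrame({
    "Scenario": ["A", "A", "A", "A", "A", "B", "B", "B"],
    "Exe Seq": [1, 2, 3, 1, 2, 1, 2, 1],
})

# 1 where a run starts (Exe Seq equals 1), 0 elsewhere.
starts = df["Exe Seq"].eq(1).astype(int)

# Cumulative count of run starts, restarted within each Scenario.
run_no = starts.groupby(df["Scenario"]).cumsum()

df["New_Scenario"] = df["Scenario"] + "_" + run_no.astype(str)
print(df)
```

Here the "B" rows are labelled B_1 and B_2 rather than continuing from the last "A" number.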
Or maybe:
# a new sequence starts wherever the difference from the previous row is less than or equal to 0
df['New_Scenario'] = (df['Scenario'] + '_' +
                      df['Exe Seq'].diff().fillna(-1).le(0).cumsum().astype(str))
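The three rules give the same labels on the sample data and only differ when the sequence is irregular. A small sketch to compare them; the helper `run_ids`, its column names, and the irregular sequence `[1, 2, 4, 1, 2, 2]` are made up for illustration:

```python
import pandas as pd

def run_ids(seq: pd.Series) -> pd.DataFrame:
    """Number the runs in seq under each of the three rules."""
    return pd.DataFrame({
        # new run when the step from the previous row is not exactly +1
        "step_not_1": seq.diff().ne(1).cumsum(),
        # new run when the value is 1
        "equals_1": seq.eq(1).cumsum(),
        # new run when the value does not increase (first row counts as a start)
        "not_increasing": seq.diff().fillna(-1).le(0).cumsum(),
    })

# 'Exe Seq' from the question: all three columns agree.
sample = run_ids(pd.Series([1, 2, 3, 1, 2, 1]))
print(sample)

# Hypothetical sequence with a gap (2 -> 4) and a repeat (2 -> 2).
irregular = run_ids(pd.Series([1, 2, 4, 1, 2, 2]))
print(irregular)
```

On the irregular sequence the first rule splits at the gap and at the repeat, the second only at each 1, and the third at each 1 and at the repeat, so which one to use depends on what should count as the start of a new run.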