Working with Python, I need to create two new variables.
One (see JourneyID in the example) that cumulatively increases by one each time the previous row of another column (Purpose) takes the value 1, and
One (see JourneyN in the example) that cumulatively increases by one each time the previous row of that same column takes the value 1, but starts over from 1 every time the Respondent ID increases by 1.
Here is what I have tried so far:
m = df['Purpose'] == 1
df.loc[m, 'JourneyID'] = m.cumsum()
This returns df['JourneyID'] = [1,1,1,2,1,1,3,1,4] when it should return [1,1,2,2,3,3,3,4,4] for JourneyID.
Any help is greatly appreciated.
It's not super clean, but it should get you what you need:
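# Count the Purpose == 1 rows seen so far, then shift down one row so the
# counter only goes up on the row *after* a 1; fill_value=1 starts it at 1
# (and keeps the column integer instead of float with a NaN in the first row).
helper = (df['Purpose'].eq(1).cumsum() + 1).shift(1, fill_value=1)
df['JourneyID'] = helper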
JourneyN I did not fully understand :)
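That said, if JourneyN is meant to be the same counter restarted for each respondent, something along these lines might work. This is only a sketch: the column name RespondentID and the sample values are my assumptions (the question does not show the data), and it uses "cumulative count minus the current row" in place of the shift so that nothing leaks across respondents.

```python
import pandas as pd

# Hypothetical sample data: the column name 'RespondentID' and all values
# are assumptions, chosen so JourneyID matches the expected output above.
df = pd.DataFrame({
    'RespondentID': [1, 1, 1, 1, 2, 2, 2, 3, 3],
    'Purpose':      [2, 1, 2, 1, 3, 2, 1, 2, 1],
})

# 1 on rows where Purpose is 1, else 0
ends = df['Purpose'].eq(1).astype(int)

# Number of 1s in earlier rows (running total minus the current row), plus 1
df['JourneyID'] = ends.cumsum() - ends + 1

# Same idea, but the running total restarts within each respondent
df['JourneyN'] = ends.groupby(df['RespondentID']).cumsum() - ends + 1

print(df)
```

Subtracting `ends` from the running total is equivalent to shifting by one row, but it never pulls a value across a group boundary, which is what makes the per-respondent restart work.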