I have the following Dataframe:
PplNum RoomNum Value
0 1 0 265
1 1 12 170
2 2 0 297
3 2 12 85
4 2 0 41
5 2 12 144
Generally the PplNum
and RoomNum
is generated like this, and it will always follow this format:
for ppl in [1,2,2]:
for room in [0, 12]:
print(ppl, room)
1 0
1 12
2 0
2 12
2 0
2 12
But now what I would like to achieve is to mark those duplicates combinations of PplNum
and RoomNum
so that I can know which combinationss are the first occurrence, which are the second occurrence and so on... So the expected output Dataframe will be like this:
PplNum RoomNum Value C
0 1 0 265 1
1 1 12 170 1
2 2 0 297 1
3 2 12 85 1
4 2 0 41 2
5 2 12 144 2
you can do it using groupby() together with cumcount() function:
In [102]: df['C'] = df.groupby(['PplNum','RoomNum']).cumcount() + 1
In [103]: df
Out[103]:
PplNum RoomNum Value C
0 1 0 265 1
1 1 12 170 1
2 2 0 297 1
3 2 12 85 1
4 2 0 41 2
5 2 12 144 2
Explanation:
In [101]: df.groupby(['PplNum','RoomNum']).cumcount() + 1
Out[101]:
0 1
1 1
2 1
3 1
4 2
5 2
dtype: int64