I have some data that looks something like this:
date_time, user, page
12345, A, index
13456, A, index
14566, B, home
...
I'd like to store the index of each row (i.e., its order when sorted by date_time), both overall and per page.
Overall is simple. Just something like:
df['overall_count'] = range(len(df))
But I can't figure out how to do it for the pages. The following code gets me what I want, but it's connected to the groupby object, and I can't figure out how to move it to the main dataframe.
grouped = df.groupby('page')
for name, group in grouped:
    group = group.sort_values('date_time')
    group['page_count'] = range(len(group))
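If you want to assign group-wise indices, you can use cumcount. It numbers the rows within each group starting from 0, in the order they appear in the frame, so sort by date_time first if the frame isn't already sorted. The result is aligned with the original index, so it can be assigned straight back as a column:
df['page_count'] = df.groupby('page').cumcount()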
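For completeness, here is a small end-to-end sketch of how the pieces could fit together. The sample rows are made up for illustration (they are not the asker's real data), and it assumes that resetting the index after sorting is acceptable:

```python
import pandas as pd

# Hypothetical sample data in the same shape as the question, deliberately unsorted.
df = pd.DataFrame({
    "date_time": [14566, 12345, 13456, 15000, 16000],
    "user": ["B", "A", "A", "C", "B"],
    "page": ["home", "index", "index", "index", "home"],
})

# Sort once so that both counters follow date_time order.
df = df.sort_values("date_time").reset_index(drop=True)

# Overall position of each row in the sorted frame.
df["overall_count"] = range(len(df))

# Position of each row within its own page, starting at 0.
df["page_count"] = df.groupby("page").cumcount()

print(df)
```

If you'd rather keep the original row order, assigning df.sort_values('date_time').groupby('page').cumcount() to the column should also work, since the assignment aligns on the index.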