I have a dataframe with irregular timestamps (second resolution) that spans multiple days. I would like to bucket the entries into one-minute buckets and add an increasing counter in a separate column. All values that fall within the same minute should get the same counter value, which increases by one for each new minute that contains data; on a new day the counter should start from 1 again. The desired result looks like this:
                     Value  Counter
2020-01-01 10:00:00     7.        1
2020-01-01 10:00:05    45.        1
2020-01-01 10:00:10    25.        1
2020-01-01 10:02:00    85.        2
2020-01-02 07:00:00    51.        1
2020-01-02 10:00:00    52.        2
I thought about something like this:
df['Counter'] = df.groupby([df.index.dt.day, df.index.dt.minute]).count()
However, this does not seem to work.
You can try a .groupby inside a .groupby, combined with "ngroup":
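import pandas as pd

df.index = pd.to_datetime(df.index)
# Outer groupby: one group per calendar day. Inner groupby: number the (hour, minute) buckets.
df["Counter"] = df.groupby(df.index.date, group_keys=False).apply(
    lambda x: x.groupby([x.index.hour, x.index.minute]).transform("ngroup") + 1
)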
print(df)
Prints:
                     Value  Counter
2020-01-01 10:00:00      7        1
2020-01-01 10:00:05     45        1
2020-01-01 10:00:10     25        1
2020-01-01 10:02:00     85        2
2020-01-02 07:00:00     51        1
2020-01-02 10:00:00     52        2
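
As an alternative, here is a minimal sketch that avoids the nested .apply, assuming the index is already sorted in time order. It floors each timestamp to its minute, flags the rows where a new minute bucket starts, and takes a running sum of those flags within each day. The sample frame is rebuilt from the question's data, and the names minute and new_bucket are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question (values taken as integers).
df = pd.DataFrame(
    {"Value": [7, 45, 25, 85, 51, 52]},
    index=pd.to_datetime(
        [
            "2020-01-01 10:00:00",
            "2020-01-01 10:00:05",
            "2020-01-01 10:00:10",
            "2020-01-01 10:02:00",
            "2020-01-02 07:00:00",
            "2020-01-02 10:00:00",
        ]
    ),
)

# Assumption: the index is sorted chronologically.
# Floor every timestamp to the start of its minute.
minute = pd.Series(df.index.floor("min"), index=df.index)

# 1 where the minute bucket differs from the previous row, else 0.
# The first row has no predecessor (NaT difference), so it counts as new.
new_bucket = minute.diff().ne(pd.Timedelta(0)).astype(int)

# Running total of new buckets, restarted for every calendar day.
df["Counter"] = new_bucket.groupby(df.index.date).cumsum()

print(df)
```

Unlike the ngroup version, this relies on row order, so call df.sort_index() first if the timestamps may be unsorted.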