[SOLVED] Count minutes per day over index

I have a dataframe with irregular timestamps (second resolution) that spans multiple days. I would like to bucket the entries into one-minute buckets and add a new column holding an increasing counter. All rows that fall within the same minute should get the same counter value, the counter should increase by one for each new minute that contains data on that day, and it should start from 1 again on each new day.

                     Value  Counter
2020-01-01 10:00:00      7        1
2020-01-01 10:00:05     45        1
2020-01-01 10:00:10     25        1
2020-01-01 10:02:00     85        2
2020-01-02 07:00:00     51        1
2020-01-02 10:00:00     52        2
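
To make the example reproducible, the frame above could be built roughly like this. This is a minimal sketch: the construction is assumed from the table, Value is stored as integers, and Counter is left out because it is the column to be computed.

```python
import pandas as pd

# Sample data assumed from the table above. "Counter" is the desired
# output, so it is deliberately not part of the input frame.
df = pd.DataFrame(
    {"Value": [7, 45, 25, 85, 51, 52]},
    index=pd.to_datetime(
        [
            "2020-01-01 10:00:00",
            "2020-01-01 10:00:05",
            "2020-01-01 10:00:10",
            "2020-01-01 10:02:00",
            "2020-01-02 07:00:00",
            "2020-01-02 10:00:00",
        ]
    ),
)
print(df)
```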

I thought about something like this:

df['Counter'] = df.groupby([df.index.dt.day, df.index.dt.minute]).count()

However, this does not seem to work.

Solution

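You can use a .groupby inside a .groupby together with "ngroup". Note that a DatetimeIndex exposes .date, .hour and .minute directly (the .dt accessor only exists on a Series), and that "ngroup" numbers the groups from 0 in sorted key order, hence the + 1. The inner grouping uses both hour and minute so that, for example, 10:05 and 11:05 do not end up in the same bucket: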

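import pandas as pd

# Make sure the index is a DatetimeIndex
df.index = pd.to_datetime(df.index)

# Per day, number each distinct (hour, minute) bucket starting from 1
df["Counter"] = df.groupby(df.index.date, group_keys=False).apply(
    lambda x: x.groupby([x.index.hour, x.index.minute]).transform("ngroup") + 1
)
print(df)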

Prints:

                     Value  Counter
2020-01-01 10:00:00      7        1
2020-01-01 10:00:05     45        1
2020-01-01 10:00:10     25        1
2020-01-01 10:02:00     85        2
2020-01-02 07:00:00     51        1
2020-01-02 10:00:00     52        2
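
If the nested groupby/apply turns out to be slow on a large frame, one possible alternative (a sketch, not part of the original answer) is to compute the minute of the day as an integer and take a dense rank per calendar day. The sample frame is rebuilt here so the snippet runs on its own, and `minute_of_day` is a name introduced only for illustration.

```python
import pandas as pd

# Sample frame assumed from the question's table
df = pd.DataFrame(
    {"Value": [7, 45, 25, 85, 51, 52]},
    index=pd.to_datetime(
        [
            "2020-01-01 10:00:00",
            "2020-01-01 10:00:05",
            "2020-01-01 10:00:10",
            "2020-01-01 10:02:00",
            "2020-01-02 07:00:00",
            "2020-01-02 10:00:00",
        ]
    ),
)

# Minute of the day (0..1439) for every row; rows in the same minute share a value
minute_of_day = pd.Series(
    (df.index.hour * 60 + df.index.minute).to_numpy(), index=df.index
)

# A dense rank within each calendar day gives 1 for the earliest minute
# that has data, 2 for the next one, and so on, restarting every day
df["Counter"] = (
    minute_of_day.groupby(df.index.date).rank(method="dense").astype(int)
)
print(df)
```

Because the rank is based on the minute value rather than on row order, this variant should not depend on the rows being sorted by time.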