python pandas time-series date-range

How to represent ranges of time in a pandas index


I have a collection of user data as follows:

user      start                end
John Doe  2025-03-21 11:30:35  2025-03-21 13:05:26
...       ...                  ...
Jane Doe  2023-12-31 01:02:03  2024-01-02 03:04:05

Each user has a start and end datetime of some activity. I would like to place this temporal range in the index so I can quickly query the dataframe to see which users were active during a certain date/time range like so:

df['2024-01-01:2024-01-31']

Pandas has Period objects, but these seem to only support a specific year, day, or minute, not an arbitrary start and end datetime. Pandas also has MultiIndex indices, but these seem to be designed for hierarchical categorical labels, not for time ranges. Any other ideas for how to represent this time range in an index?


Solution

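  • Build a pd.IntervalIndex from the start and end columns and use it as the index. IntervalIndex.contains then tells you which intervals include a given timestamp: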

    import pandas as pd
    
    data = {
        'user': ['John Doe', 'Jane Doe'],
        'start': [pd.Timestamp('2025-03-21 11:30:35'), pd.Timestamp('2023-12-31 01:02:03')],
        'end': [pd.Timestamp('2025-03-21 13:05:26'), pd.Timestamp('2024-01-02 03:04:05')],
    }
    df = pd.DataFrame(data)
    
    # Build closed intervals [start, end] and use them as the index
    interval_index = pd.IntervalIndex.from_arrays(df['start'], df['end'], closed='both')
    df = df.set_index(interval_index).drop(columns=['start', 'end'])

    # Select the users whose interval contains a given timestamp
    query_time = pd.Timestamp("2024-01-01 12:00:00")
    active_users = df[df.index.contains(query_time)]
    print(active_users)
    

    Output:

    D:\python>python test.py
                                                    user
    [2023-12-31 01:02:03, 2024-01-02 03:04:05]  Jane Doe
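
The question asks about a date range (all of January 2024) rather than a single instant. A minimal sketch of that case, assuming the same two sample rows and using IntervalIndex.overlaps with a pd.Interval as the query window; the names window and active_in_window are only illustrative:

```python
import pandas as pd

# Same sample data as above, indexed by closed [start, end] intervals
df = pd.DataFrame(
    {"user": ["John Doe", "Jane Doe"]},
    index=pd.IntervalIndex.from_arrays(
        pd.to_datetime(["2025-03-21 11:30:35", "2023-12-31 01:02:03"]),
        pd.to_datetime(["2025-03-21 13:05:26", "2024-01-02 03:04:05"]),
        closed="both",
    ),
)

# Query window: 2024-01-01 00:00:00 through 2024-01-31 00:00:00 (assumed bounds)
window = pd.Interval(
    pd.Timestamp("2024-01-01"), pd.Timestamp("2024-01-31"), closed="both"
)

# overlaps() is True for every row whose interval shares at least one point with the window
active_in_window = df[df.index.overlaps(window)]
print(active_in_window["user"].tolist())
```

Note that the window above ends at midnight at the start of January 31; to cover that whole day, move the right bound to pd.Timestamp("2024-02-01") and pick closed="left".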