
How to assign values based on time ranges in a dataframe column


I have some data that I would like to label by time of day, across different dates. Specifically, I want the column df['event'] to be 1 if the time is between 9am and 11am, 2 if it is between 3pm and 4pm, and 0 otherwise. I tried the following:

df['event'] = np.where(
    (
        (df['datetime'] >= datetime.strptime(f'{date} 09:00:00', '%Y-%m-%d %H:%M:%S'))
        & (df['datetime'] < datetime.strptime(f'{date} 11:00:00', '%Y-%m-%d %H:%M:%S'))
    ),
    1,
    np.where(
        (
            (df['datetime'] >= datetime.strptime(f'{date} 15:00:00', '%Y-%m-%d %H:%M:%S'))
            & (df['datetime'] < datetime.strptime(f'{date} 16:00:00', '%Y-%m-%d %H:%M:%S'))
        ),
        2,
        0
    )
)

This, however, doesn't seem to work. I figured there is a way to do it by indexing on times, but that is tricky because this line of code is inside a function, and when I index on times it throws a bunch of errors.

Is there a better way of doing this?


Solution

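  • There are three values to choose from, so use np.select instead of nesting np.where. Comparing only the time of day (via .dt.time) also makes the rule apply to every date, not just one.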

    import datetime
    import numpy as np
    import pandas as pd

    # extract the time of day so the comparison ignores the date
    time1 = pd.to_datetime(df['datetime']).dt.time
    # 1 if the time is in [09:00, 11:00), 2 if it is in [15:00, 16:00);
    # inclusive='left' (pandas >= 1.3) matches the >= / < bounds in the question
    condlist = [time1.between(datetime.time(9, 0), datetime.time(11, 0), inclusive='left'),
                time1.between(datetime.time(15, 0), datetime.time(16, 0), inclusive='left')]
    # pick 1 or 2 for the first condition that holds, 0 if none does
    df['event'] = np.select(condlist, [1, 2], default=0)
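
To check the boundary behaviour, here is a minimal self-contained sketch of the same idea. The sample timestamps and the helper name label_events are made up for illustration and are not from the question. It uses explicit >= and < comparisons on the time of day, so it should not depend on which pandas version supports the inclusive argument of between.

```python
import datetime

import numpy as np
import pandas as pd

# Hypothetical sample data: timestamps on two dates, including boundary times
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2021-03-01 08:59:59",
        "2021-03-01 09:00:00",
        "2021-03-01 10:30:00",
        "2021-03-01 11:00:00",
        "2021-03-02 15:00:00",
        "2021-03-02 15:59:59",
        "2021-03-02 16:00:00",
    ])
})


def label_events(frame, column="datetime"):
    """Return 1 for [09:00, 11:00), 2 for [15:00, 16:00), 0 otherwise."""
    # Keep only the time of day so the rule applies to every date
    t = pd.to_datetime(frame[column]).dt.time
    morning = (t >= datetime.time(9, 0)) & (t < datetime.time(11, 0))
    afternoon = (t >= datetime.time(15, 0)) & (t < datetime.time(16, 0))
    # np.select takes the choice for the first condition that is True
    return np.select([morning, afternoon], [1, 2], default=0)


df["event"] = label_events(df)
print(df)
```

Because the labelling lives in a function that takes the dataframe as an argument, it can be called from inside another function without setting a time index first.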