I have some data that I would like to subset between different times of day for different dates. Specifically, I want the column df['event'] to have a 1 if the time is between 9am and 11am, a 2 if it is between 3pm and 4pm, and 0 otherwise. As such, I tried to use the following line:
df['event'] = np.where(
    (
        (df['datetime'] >= datetime.strptime(f'{date} 09:00:00', '%Y-%m-%d %H:%M:%S'))
        & (df['datetime'] < datetime.strptime(f'{date} 11:00:00', '%Y-%m-%d %H:%M:%S'))
    ),
    1,
    np.where(
        (
            (df['datetime'] >= datetime.strptime(f'{date} 15:00:00', '%Y-%m-%d %H:%M:%S'))
            & (df['datetime'] < datetime.strptime(f'{date} 16:00:00', '%Y-%m-%d %H:%M:%S'))
        ),
        2,
        0
    )
)
This, however, doesn't seem to work. I figured that there's a way to do it by indexing on times, but this seems quite tricky since this line of code is inside a function, and when I index on times, it throws a bunch of errors.
Is there a better way of doing this?
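There are 3 values to choose from, so use np.select instead of nested np.where calls. Comparing only the time-of-day part (via .dt.time) also means the same conditions apply to every date: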
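import datetime
import numpy as np
import pandas as pd

# keep only the time-of-day part so the conditions apply to every date
time1 = pd.to_datetime(df['datetime']).dt.time
# select 1 if time is between 9am and 11am
# 2 if it's between 3pm and 4pm
# and 0 otherwise
# inclusive='left' (pandas >= 1.3) keeps the end time exclusive, matching the >= / < in the question
condlist = [time1.between(datetime.time(9, 0, 0), datetime.time(11, 0, 0), inclusive='left'),
            time1.between(datetime.time(15, 0, 0), datetime.time(16, 0, 0), inclusive='left')]
# select 1, 2 or 0 depending on which condition is satisfied
df['event'] = np.select(condlist, [1, 2], default=0)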
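Since the question mentions the line lives inside a function, here is a minimal, self-contained sketch of the same idea wrapped in one. The function name label_events and the sample rows are made up for illustration; the only thing assumed from the question is a 'datetime' column. It uses explicit >= and < comparisons rather than between, so the end of each window is excluded regardless of pandas version.

```python
import datetime

import numpy as np
import pandas as pd


def label_events(df):
    """Return 1 for [09:00, 11:00), 2 for [15:00, 16:00), and 0 otherwise.

    Hypothetical helper; assumes df has a 'datetime' column that
    pd.to_datetime can parse.
    """
    # Drop the date part so the same windows apply on every date
    times = pd.to_datetime(df['datetime']).dt.time
    condlist = [
        (times >= datetime.time(9, 0)) & (times < datetime.time(11, 0)),
        (times >= datetime.time(15, 0)) & (times < datetime.time(16, 0)),
    ]
    # The first matching condition wins; rows matching none get the default
    return np.select(condlist, [1, 2], default=0)


# Made-up sample rows spanning two different dates
df = pd.DataFrame({'datetime': [
    '2021-03-01 08:59:59',
    '2021-03-01 09:00:00',
    '2021-03-01 11:00:00',
    '2021-03-02 15:30:00',
    '2021-03-02 16:00:00',
]})
df['event'] = label_events(df)
print(df)
```

The boundary rows show the half-open behaviour: 09:00:00 gets a 1, while 11:00:00 and 16:00:00 fall back to 0.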