I have the following dataframe:
import pandas as pd
import csv
lst = [['SPXW 250715C06310000', '7/14/2025', 2.74, 2.87, 2.60, 2.65, 14, '8:30:00'],
['SPXW 250715C06310000', '7/14/2025', 2.80, 2.80, 2.50, 2.53, 61, '8:31:00'],
['SPXW 250715C06310000', '7/14/2025', 2.45, 2.45, 2.45, 2.45, 2, '8:32:00'],
['SPXW 250715C06310000', '7/14/2025', 2.58, 2.80, 2.58, 2.60, 32, '8:33:00'],
['SPXW 250715C06310000', '7/14/2025', 2.50, 2.50, 2.25, 2.30, 5, '8:34:00'],
['SPXW 250709C06345000', '7/9/2025', 0.05, 0.05, 0.03, 0.03, 246, '8:30:00'],
['SPXW 250709C06345000', '7/9/2025', 0.05, 0.10, 0.03, 0.07, 452, '8:31:00'],
['SPXW 250709C06345000', '7/9/2025', 0.07, 0.10, 0.05, 0.07, 137, '8:32:00'],
['SPXW 250709C06345000', '7/9/2025', 0.07, 0.07, 0.07, 0.07, 5, '8:33:00'],
['SPXW 250709C06345000', '7/9/2025', 0.07, 0.07, 0.05, 0.05, 225, '8:34:00'],
['SPXW 250715C06310000', '7/11/2025', 7.30, 7.30, 7.30, 7.30, 2, '8:30:00'],
['SPXW 250715C06310000', '7/11/2025', 7.20, 7.20, 7.20, 7.20, 2, '8:31:00'],
['SPXW 250715C06310000', '7/11/2025', 6.92, 6.92, 6.92, 6.92, 20, '8:32:00'],
['SPXW 250715C06310000', '7/11/2025', 6.58, 6.58, 6.58, 6.58, 1, '8:34:00'],
['SPXW 250715C06310000', '7/11/2025', 6.41, 6.41, 6.41, 6.41, 2, '8:35:00']
]
df = pd.DataFrame(lst, columns = ['Symbol', 'Date', 'open', 'high', 'low', 'close', 'volume', 'hour'])
df['Date'] = pd.to_datetime(df['Date'])
df['hour'] = pd.to_datetime(df['hour'], format='%H:%M:%S')
df.index = pd.MultiIndex.from_arrays([df['Symbol'], df['Date']], names=['Symbol', 'Date'])
df.drop(columns=['Symbol', 'Date'], inplace=True)
When run, it yields this:
open high low close volume hour day_high day_low
Symbol Date
SPXW 250715C06310000 2025-07-14 2.74 2.87 2.60 2.65 14 1900-01-01 08:30:00
2025-07-14 2.80 2.80 2.50 2.53 61 1900-01-01 08:31:00
2025-07-14 2.45 2.45 2.45 2.45 2 1900-01-01 08:32:00
2025-07-14 2.58 2.80 2.58 2.60 32 1900-01-01 08:33:00
2025-07-14 2.50 2.50 2.25 2.30 5 1900-01-01 08:34:00
SPXW 250709C06345000 2025-07-09 0.05 0.05 0.03 0.03 246 1900-01-01 08:30:00
2025-07-09 0.05 0.10 0.03 0.07 452 1900-01-01 08:31:00
2025-07-09 0.07 0.10 0.05 0.07 137 1900-01-01 08:32:00
2025-07-09 0.07 0.07 0.07 0.07 5 1900-01-01 08:33:00
2025-07-09 0.07 0.07 0.05 0.05 225 1900-01-01 08:34:00
SPXW 250715C06310000 2025-07-11 7.30 7.30 7.30 7.30 2 1900-01-01 08:30:00
2025-07-11 7.20 7.20 7.20 7.20 2 1900-01-01 08:31:00
2025-07-11 6.92 6.92 6.92 6.92 20 1900-01-01 08:32:00
2025-07-11 6.58 6.58 6.58 6.58 1 1900-01-01 08:34:00
2025-07-11 6.41 6.41 6.41 6.41 2 1900-01-01 08:35:00
This is a fragment of a dataframe that has 820,000 lines. These are options contracts trading over approximately two weeks, with 1 minute bars. I have set up a multiindex with Symbol and Date because it just makes sense to me to filter data that way.
What I want to do, is find max value for a contract on a given day, and append it to the column 'day_high' where the time in 'hour' column is equal to 08:30:00.
I've tried the following line of code:
df.loc[('SPXW 250715C06310000', '2025-07-14'), 'day_high'] = df.loc[('SPXW 250715C06310000', '2025-07-14')]['high'].max()
This line of code puts the max value in all rows. I've tried a few different modifiers to stick specifically in the row where hour = 8:30:00, but I've had no luck.
So... here are my questions:
How can I assign the max, or whatever value from a given day to a different column where 'hour'=8:30:00?
Same as question #2, but for all contracts in the dataframe. Will I need to set up a for loop or is there an idiomatic way of doing this with pandas?
TIA
You need assign by mask for 8:30
and MultiIndex
:
#simplier set MutiIndex
df['Date'] = pd.to_datetime(df['Date'])
df['hour'] = pd.to_datetime(df['hour'], format='%H:%M:%S')
df = df.set_index(['Symbol', 'Date'])
mask = df['hour'].dt.strftime('%H:%M').eq('08:30') &
(df.index == ('SPXW 250715C06310000', pd.Timestamp('2025-07-14')))
df.loc[mask, 'day_high'] = df.loc[('SPXW 250715C06310000', '2025-07-14')]['high'].max()
print (df)
open high low close volume \
Symbol Date
SPXW 250715C06310000 2025-07-14 2.74 2.87 2.60 2.65 14
2025-07-14 2.80 2.80 2.50 2.53 61
2025-07-14 2.45 2.45 2.45 2.45 2
2025-07-14 2.58 2.80 2.58 2.60 32
2025-07-14 2.50 2.50 2.25 2.30 5
SPXW 250709C06345000 2025-07-09 0.05 0.05 0.03 0.03 246
2025-07-09 0.05 0.10 0.03 0.07 452
2025-07-09 0.07 0.10 0.05 0.07 137
2025-07-09 0.07 0.07 0.07 0.07 5
2025-07-09 0.07 0.07 0.05 0.05 225
SPXW 250715C06310000 2025-07-11 7.30 7.30 7.30 7.30 2
2025-07-11 7.20 7.20 7.20 7.20 2
2025-07-11 6.92 6.92 6.92 6.92 20
2025-07-11 6.58 6.58 6.58 6.58 1
2025-07-11 6.41 6.41 6.41 6.41 2
hour day_high
Symbol Date
SPXW 250715C06310000 2025-07-14 1900-01-01 08:30:00 2.87
2025-07-14 1900-01-01 08:31:00 NaN
2025-07-14 1900-01-01 08:32:00 NaN
2025-07-14 1900-01-01 08:33:00 NaN
2025-07-14 1900-01-01 08:34:00 NaN
SPXW 250709C06345000 2025-07-09 1900-01-01 08:30:00 NaN
2025-07-09 1900-01-01 08:31:00 NaN
2025-07-09 1900-01-01 08:32:00 NaN
2025-07-09 1900-01-01 08:33:00 NaN
2025-07-09 1900-01-01 08:34:00 NaN
SPXW 250715C06310000 2025-07-11 1900-01-01 08:30:00 NaN
2025-07-11 1900-01-01 08:31:00 NaN
2025-07-11 1900-01-01 08:32:00 NaN
2025-07-11 1900-01-01 08:34:00 NaN
2025-07-11 1900-01-01 08:35:00 NaN
For second use GroupBy.max
:
mask = df['hour'].dt.strftime('%H:%M').eq('08:30')
df.loc[mask, 'day_high'] = df.groupby(['Symbol','Date'])['high'].max()
Or GroupBy.transform
with Series.where
for set only rows with 8:30
:
mask = df['hour'].dt.strftime('%H:%M').eq('08:30')
df['day_high'] = df.groupby(['Symbol','Date'])['high'].transform('max').where(mask)
print (df)
open high low close volume \
Symbol Date
SPXW 250715C06310000 2025-07-14 2.74 2.87 2.60 2.65 14
2025-07-14 2.80 2.80 2.50 2.53 61
2025-07-14 2.45 2.45 2.45 2.45 2
2025-07-14 2.58 2.80 2.58 2.60 32
2025-07-14 2.50 2.50 2.25 2.30 5
SPXW 250709C06345000 2025-07-09 0.05 0.05 0.03 0.03 246
2025-07-09 0.05 0.10 0.03 0.07 452
2025-07-09 0.07 0.10 0.05 0.07 137
2025-07-09 0.07 0.07 0.07 0.07 5
2025-07-09 0.07 0.07 0.05 0.05 225
SPXW 250715C06310000 2025-07-11 7.30 7.30 7.30 7.30 2
2025-07-11 7.20 7.20 7.20 7.20 2
2025-07-11 6.92 6.92 6.92 6.92 20
2025-07-11 6.58 6.58 6.58 6.58 1
2025-07-11 6.41 6.41 6.41 6.41 2
hour day_high
Symbol Date
SPXW 250715C06310000 2025-07-14 1900-01-01 08:30:00 2.87
2025-07-14 1900-01-01 08:31:00 NaN
2025-07-14 1900-01-01 08:32:00 NaN
2025-07-14 1900-01-01 08:33:00 NaN
2025-07-14 1900-01-01 08:34:00 NaN
SPXW 250709C06345000 2025-07-09 1900-01-01 08:30:00 0.10
2025-07-09 1900-01-01 08:31:00 NaN
2025-07-09 1900-01-01 08:32:00 NaN
2025-07-09 1900-01-01 08:33:00 NaN
2025-07-09 1900-01-01 08:34:00 NaN
SPXW 250715C06310000 2025-07-11 1900-01-01 08:30:00 7.30
2025-07-11 1900-01-01 08:31:00 NaN
2025-07-11 1900-01-01 08:32:00 NaN
2025-07-11 1900-01-01 08:34:00 NaN
2025-07-11 1900-01-01 08:35:00 NaN