I want to divide a large dataset into 12-hour periods of data starting/ending at midnight and noon each day.
I was planning to use pandas.Period for this, but I noticed that it converts an arbitrary datetime to a 12-hour period beginning at the start of the current hour, whereas what I want is the 12-hour period starting at 00:00 or 12:00.
import pandas as pd
dt = pd.to_datetime("2025-04-17 18:35")
current_period = dt.to_period(freq='12h')
print(current_period)
Output:
2025-04-17 18:00
What I want is the following period:
2025-04-17 12:00
Here is the full code:
import pandas as pd
def get_12h_period(dt):
    # Determine if the time is in AM (00:00-11:59) or PM (12:00-23:59)
    if dt.hour < 12:
        return pd.Period(year=dt.year, month=dt.month, day=dt.day,
                         hour=0, freq='12h')
    else:
        return pd.Period(year=dt.year, month=dt.month, day=dt.day,
                         hour=12, freq='12h')
dt = pd.to_datetime("2025-04-17 18:35")
period = get_12h_period(dt)
print(period)
Another solution using floor and to_period:
import pandas as pd
def get_12h_period(dt):
    # Floor to the start of the 12-hour block (00:00 or 12:00).
    # floor('12h') already maps 18:35 -> 12:00, so no further adjustment is needed.
    floored = dt.floor('12h')
    return floored.to_period('12h')
dt = pd.to_datetime("2025-04-17 18:35")
print(get_12h_period(dt))
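As a quick sanity check of the floor-based approach, the sketch below runs a few boundary times through it. The helper name `to_12h_period` and the sample times are illustrative assumptions, not part of the answer above.

```python
import pandas as pd

def to_12h_period(ts):
    # Hypothetical helper: floor to 00:00 or 12:00, then convert to a 12-hour Period
    return ts.floor("12h").to_period("12h")

# Assumed sample times on either side of the midnight and noon boundaries
samples = ["2025-04-17 00:00", "2025-04-17 11:59",
           "2025-04-17 12:00", "2025-04-17 23:59"]
periods = [to_12h_period(pd.Timestamp(s)) for s in samples]

for s, p in zip(samples, periods):
    # start_time is the first instant covered by the period
    print(s, "->", p.start_time)
```

Times before noon should land in the block starting at 00:00, and times from noon onward in the block starting at 12:00.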
For a large dataset, a vectorized version that works on a whole DatetimeIndex is likely the best option:
import pandas as pd
def get_12h_period(dt_index):
    # Floor every timestamp to the start of its 12-hour block (00:00 or 12:00);
    # flooring is vectorized, so no per-row adjustment is needed
    floored = dt_index.floor('12h')
    return floored.to_period('12h')
dt_index = pd.to_datetime(["2025-04-17 18:35"])
# Print the single element of the resulting PeriodIndex
print(get_12h_period(dt_index)[0])
Output:
2025-04-17 12:00
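To tie this back to the original goal of dividing a dataset into 12-hour blocks, here is a minimal sketch that groups rows by the floored timestamp. The DataFrame, its column names (`timestamp`, `value`, `block_start`) and the sample data are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: one reading every 5 hours, starting at midnight
df = pd.DataFrame(
    {
        "timestamp": pd.date_range("2025-04-17 00:00", periods=10, freq="5h"),
        "value": range(10),
    }
)

# Label each row with the start of its 12-hour block (00:00 or 12:00)
df["block_start"] = df["timestamp"].dt.floor("12h")

# Count the rows in each block; any other groupby aggregation works the same way
counts = df.groupby("block_start")["value"].count()
print(counts)
```

If Period labels are preferred over timestamps, the floored column can be converted afterwards with `.dt.to_period('12h')`.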