I have a dataframe column containing values of 0 and 1. Values of 0 indicate a piece of equipment is offline, while 1 indicates the equipment is running. To calculate the days online between outages, I used:
df['col2'] = df['col1'].groupby(df['col1'].eq(0).cumsum()).cumcount()
`df['col2']` contains a cumulative total of the days online between outages.
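As a rough illustration of how that expression counts, here is a minimal sketch. It assumes `col1` is a column name and that the grouping key is built from the same frame; the sample values are made up for the illustration.

```python
import pandas as pd

# Hypothetical sample data: 0 = offline, 1 = running
df = pd.DataFrame({"col1": [0, 1, 1, 1, 0, 0, 1, 1]})

# Every 0 starts a new group, so the group id goes up by one at each outage row
group_id = df["col1"].eq(0).cumsum()

# cumcount numbers the rows inside each group starting from 0:
# the outage row itself gets 0, the running days after it get 1, 2, 3, ...
df["col2"] = df["col1"].groupby(group_id).cumcount()

print(df)
```

One thing to watch: if the data starts with a run of 1s and no leading 0, that first run is also numbered from 0, so its count comes out one lower than the later runs.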
Example:
I need to extract the date and the cumulative total before each of the outages into a separate dataframe. From the example above I would want:
Date | DaysOnline |
---|---|
6/2/2025 | 4 |
6/10/2025 | 5 |
I am obviously a novice at Python and Stack Overflow.
To extract the date and the cumulative total right before each outage (i.e., where `col1` changes from `1` to `0`), we can identify the rows where `col1 == 0`, then look at the row just before each of those, keep it only if it is a `1`, and get its `Date` and `col2` values. See demo: https://www.online-python.com/zmTHRWUfKr
import pandas as pd
data = {
    "Date": [
        "5/29/2025", "5/31/2025", "6/1/2025", "6/2/2025",
        "6/3/2025", "6/4/2025", "6/5/2025", "6/6/2025",
        "6/7/2025", "6/8/2025", "6/9/2025", "6/10/2025"
    ],
    "col1": [1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1],
    "col2": [1, 2, 3, 4, 0, 0, 0, 1, 2, 3, 4, 5]
}
df = pd.DataFrame(data)
df["Date"] = pd.to_datetime(df["Date"])
# Find where a 1 is followed by a 0 (transition point)
transition_idx = df[(df["col1"] == 1) & (df["col1"].shift(-1) == 0)].index
result = df.loc[transition_idx, ["Date", "col2"]].rename(columns={"col2": "DaysOnline"})
# If the last row is 1, include it as well
if df["col1"].iloc[-1] == 1:
    result = pd.concat([
        result,
        df.iloc[[-1]][["Date", "col2"]].rename(columns={"col2": "DaysOnline"})
    ])
print(result)
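If it helps, here is an alternative sketch that does not depend on `col2` at all. It labels each consecutive run of equal `col1` values, then takes the last date and the row count of every online run. The `run` column and the `summary` name are made up for this example, and it counts rows rather than calendar days, so a missing date inside a run is not counted.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime([
        "5/29/2025", "5/31/2025", "6/1/2025", "6/2/2025",
        "6/3/2025", "6/4/2025", "6/5/2025", "6/6/2025",
        "6/7/2025", "6/8/2025", "6/9/2025", "6/10/2025"
    ]),
    "col1": [1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1],
})

# Start a new run id every time col1 differs from the previous row
df["run"] = df["col1"].ne(df["col1"].shift()).cumsum()

# Keep only the online rows, then summarise each run:
# the last date of the run and how many rows it contains
summary = (
    df[df["col1"].eq(1)]
    .groupby("run")
    .agg(Date=("Date", "last"), DaysOnline=("col1", "count"))
    .reset_index(drop=True)
)

print(summary)
```

A run that is still online at the end of the data is included automatically, so no separate check on the last row is needed here.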