I tried the following example in pandas 2.2.3:
outage_mask = pd.Series(([True]*5 + [False]*5)*5, index=pd.date_range("2025-01-01", freq="1h", periods=50))
[ts for ts in outage_mask.loc[outage_mask.diff().fillna(False)].index]
This gives me the following warning:
FutureWarning: Downcasting object dtype arrays on .fillna, .ffill, .bfill is deprecated and will change in a future version. Call result.infer_objects(copy=False) instead. To opt-in to the future behavior, set pd.set_option('future.no_silent_downcasting', True)
I cannot figure out how to correctly apply this infer_objects. I assume the problem is that the output of diff becomes an 'object' dtype due to containing both NaNs and bools, but for example this does not help:
[ts for ts in outage_mask.loc[outage_mask.diff().infer_objects(copy=False).fillna(False)].index]
I can avoid the warning with this clumsy workaround:
[ts for ts in outage_mask.loc[outage_mask.diff().astype(float).fillna(0.).astype(bool)].index]
but I would like to understand how to apply the solution from the warning correctly. How do I do that?
I would use convert_dtypes here, which will force the nullable boolean pandas dtype on a mix of True/False/NaN:
[
    ts
    for ts in outage_mask.loc[
        outage_mask.diff().convert_dtypes().fillna(False)
    ].index
]
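To see why this works, it can help to inspect the dtype at each step. A small sketch reusing the example from the question (the names diffed and converted are only for illustration, and the printed dtypes are what I would expect on pandas 2.2.x):

```python
import pandas as pd

outage_mask = pd.Series(
    ([True] * 5 + [False] * 5) * 5,
    index=pd.date_range("2025-01-01", freq="1h", periods=50),
)

# diff() on a bool Series has no previous value for the first row,
# so it puts NaN there and falls back to the object dtype.
diffed = outage_mask.diff()

# convert_dtypes() picks the nullable "boolean" dtype for True/False/NaN,
# turning the NaN into pd.NA.
converted = diffed.convert_dtypes()

print(diffed.dtype)       # object
print(converted.dtype)    # boolean
print(converted.iloc[0])  # <NA>
```

Because the result is already a proper boolean dtype, fillna no longer has an object array to downcast, so the warning is not triggered.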
You actually don't even need the fillna, since a missing value in a nullable boolean mask is treated as False when indexing, and you could skip the list comprehension:
list(outage_mask.loc[outage_mask.diff().convert_dtypes()].index)
Output:
[Timestamp('2025-01-01 05:00:00'),
Timestamp('2025-01-01 10:00:00'),
Timestamp('2025-01-01 15:00:00'),
Timestamp('2025-01-01 20:00:00'),
Timestamp('2025-01-02 01:00:00'),
Timestamp('2025-01-02 06:00:00'),
Timestamp('2025-01-02 11:00:00'),
Timestamp('2025-01-02 16:00:00'),
Timestamp('2025-01-02 21:00:00')]