I want to apply a comparison operator to a pandas DataFrame. Where a source value is missing, the resulting boolean table should contain a missing value as well. My code:
>>> import numpy as np
>>> import pandas as pd
>>> pd.__version__
'2.1.1'
>>> df = pd.DataFrame([[1.2, 2.2, np.nan], [1.1, np.nan, 3.3]], columns=['A', 'B', 'C'])
>>> res = df.gt(1.0)
>>> res
      A      B      C
0  True   True  False
1  True  False   True
Attempting to assign a missing value to the originally missing positions yields a FutureWarning:
>>> res[df.isna()]=pd.NA
FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'nan' has dtype incompatible with bool, please explicitly cast to a compatible dtype first.
How can I get the correct result without the warning?
Note: this question is unrelated, as it does not cover inserting missing values.
I think the problem is combining the plain bool dtype with missing values, so convert the result to the nullable boolean dtype first:
res = df.gt(1.0).astype('boolean')
res[df.isna()] = pd.NA
print(res)
      A     B     C
0  True  True  <NA>
1  True  <NA>  True
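To illustrate why the cast matters, here is a small sketch that compares the column dtypes before and after `astype('boolean')`. The variable names `plain` and `nullable` are only illustrative and do not come from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1.2, 2.2, np.nan], [1.1, np.nan, 3.3]], columns=['A', 'B', 'C'])

# Result of the comparison: NumPy-backed bool columns, which cannot hold a missing value
plain = df.gt(1.0)

# Same values in the pandas nullable boolean dtype, which can hold pd.NA
nullable = plain.astype('boolean')

print(plain.dtypes)
print(nullable.dtypes)
```

Assigning `pd.NA` into `nullable` should therefore not need the implicit dtype change that triggers the warning on `plain`.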
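Or do both steps in one expression with DataFrame.mask, which sets the cells where the condition is True to missing:
res = df.gt(1.0).astype('boolean').mask(df.isna())
print(res)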
      A     B     C
0  True  True  <NA>
1  True  <NA>  True
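If this is needed in several places, the one-liner could be wrapped in a small function. The sketch below is one possible way to do that; the helper name `gt_keep_na` is hypothetical and not part of pandas. It also checks that the missing positions of the result line up with those of the input.

```python
import numpy as np
import pandas as pd

def gt_keep_na(frame, threshold):
    """Hypothetical helper: elementwise `>` that leaves missing inputs as <NA>."""
    # Compare, switch to the nullable boolean dtype, then blank out originally missing cells
    return frame.gt(threshold).astype('boolean').mask(frame.isna())

df = pd.DataFrame([[1.2, 2.2, np.nan], [1.1, np.nan, 3.3]], columns=['A', 'B', 'C'])
res = gt_keep_na(df, 1.0)

# The result is missing exactly where the input was missing
print(res.isna().equals(df.isna()))
```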