I wrote code where:
However, I get the following warning: A value is trying to be set on a copy of a slice from a DataFrame. I'm not sure how to avoid it.
Current code:
import pandas as pd
values = range(6)
df = pd.DataFrame({"Values":values, "Caution limit": [1]*len(values), "Fail limit": [3]*len(values)})
df["Status"] = "Pass"
df["Status"][df["Caution limit"] < df["Values"]] = "Caution"
df["Status"][df["Fail limit"] < df["Values"]] = "Fail"
Current Output:
Values | Caution limit | Fail limit | Status |
---|---|---|---|
0 | 1 | 3 | Pass |
1 | 1 | 3 | Pass |
2 | 1 | 3 | Caution |
3 | 1 | 3 | Caution |
4 | 1 | 3 | Fail |
5 | 1 | 3 | Fail |
Warning message:
C:\.....py:5: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df["Status"][df["Caution limit"] < df["Values"]] = "Caution"
C:\.....py:6: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df["Status"][df["Fail limit"] < df["Values"]] = "Fail"
UPD: I found a way to avoid the SettingWithCopyWarning using a lambda function:
update_status = lambda value, caution, failed: ["Fail" if f<v else "Caution" if c<v else "Pass" for v,c,f in zip(value, caution, failed)]
df["Status"] = update_status(df["Values"],df["Caution limit"],df["Fail limit"])
However, it completely bypasses pandas' capabilities, and my main goal is to learn how to do this with pandas.
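An expression like df["Status"][condition] = value is chained indexing: df["Status"] first returns a Series, and the assignment is then made on that Series. Here you're modifying a slice of the original dataframe and not a copy of the data, but pandas can't always tell the two cases apart, hence the warning. It's not an error in the pandas version you're using, so you just get a warning. If you don't upgrade pandas, you shouldn't have a problem with your current code.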
In more recent versions of pandas 2, there is a new warning referring to a future change in pandas 3, where Copy-on-Write becomes the default. The code you're using will then be modifying a temporary copy of the column, the original dataframe will be left unchanged, and your code will not work anymore. Read more here.
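To see what that future behaviour looks like, here is a minimal sketch that opts in to Copy-on-Write by hand. It assumes a pandas version where the `mode.copy_on_write` option exists (1.5 or later, as far as I know); in pandas 3 the option is expected to be redundant because the behaviour is already the default. The variable names `chained_result` and `loc_result` are only for illustration.

```python
import warnings

import pandas as pd

with warnings.catch_warnings():
    # Silence the chained-assignment warning (and any deprecation
    # warning about the option itself) so the demo output stays clean.
    warnings.simplefilter("ignore")

    # Assumption: this option is available in the installed pandas.
    # Note that it stays enabled for the rest of the session.
    pd.set_option("mode.copy_on_write", True)

    values = range(6)
    df = pd.DataFrame({
        "Values": values,
        "Caution limit": [1] * len(values),
        "Fail limit": [3] * len(values),
    })
    df["Status"] = "Pass"

    # Chained assignment: df["Status"] is a temporary Series, so the
    # write goes to that temporary object and df is left untouched.
    df["Status"][df["Caution limit"] < df["Values"]] = "Caution"
    chained_result = df["Status"].tolist()

    # A single .loc call writes to df itself.
    df.loc[df["Caution limit"] < df["Values"], "Status"] = "Caution"
    loc_result = df["Status"].tolist()

print(chained_result)
print(loc_result)
```

With Copy-on-Write active, the first print should still show only "Pass" values, while the second shows "Caution" for every row where Values exceeds the caution limit.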
To avoid the warning and possible future errors, you can modify the dataframe in place with loc:
df.loc[df["Caution limit"] < df["Values"], "Status"] = "Caution"
df.loc[df["Fail limit"] < df["Values"], "Status"] = "Fail"
Or using mask as an alternative:
df["Status"] = df["Status"].mask(df["Caution limit"] < df["Values"], "Caution")
df["Status"] = df["Status"].mask(df["Fail limit"] < df["Values"], "Fail")
Both will output the following with no warnings:
Values Caution limit Fail limit Status
0 0 1 3 Pass
1 1 1 3 Pass
2 2 1 3 Caution
3 3 1 3 Caution
4 4 1 3 Fail
5 5 1 3 Fail
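If you want to check for yourself that the two approaches agree, the following self-contained sketch builds the question's dataframe twice and compares the resulting Status columns. The helper `make_frame` and the names `df_loc`, `df_mask` and `expected` are hypothetical, introduced only for this example.

```python
import pandas as pd


def make_frame():
    """Build the sample dataframe from the question (hypothetical helper)."""
    values = range(6)
    frame = pd.DataFrame({
        "Values": values,
        "Caution limit": [1] * len(values),
        "Fail limit": [3] * len(values),
    })
    frame["Status"] = "Pass"
    return frame


# Approach 1: boolean mask plus column label in a single .loc call.
df_loc = make_frame()
df_loc.loc[df_loc["Caution limit"] < df_loc["Values"], "Status"] = "Caution"
df_loc.loc[df_loc["Fail limit"] < df_loc["Values"], "Status"] = "Fail"

# Approach 2: Series.mask replaces values where the condition is True
# and returns a new Series, which is assigned back to the column.
df_mask = make_frame()
df_mask["Status"] = df_mask["Status"].mask(
    df_mask["Caution limit"] < df_mask["Values"], "Caution"
)
df_mask["Status"] = df_mask["Status"].mask(
    df_mask["Fail limit"] < df_mask["Values"], "Fail"
)

expected = ["Pass", "Pass", "Caution", "Caution", "Fail", "Fail"]
print(df_loc["Status"].tolist() == expected)
print(df_mask["Status"].equals(df_loc["Status"]))
```

In both approaches the order matters: the "Fail" step runs last so that it overwrites "Caution" on rows that exceed both limits.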