I'm playing around with the pipe | and ampersand & operators, as well as the .gt() and .lt() Series methods, to see how they work together.
I'm looking at a column in a DataFrame with values from 0.00 to 1.00.
I can use the >, <, and & operators together without any problem, and the same goes for .gt(), .lt(), and &. However, if I try to chain .gt().lt(), I get a different result.
In my example I'm using .gt(0.7).lt(0.9), but this yields values <=0.7. If I change the order to .lt(0.9).gt(0.7), I get values <0.9.
I can always just write it like this: df['column'].gt(0.7) & df['column'].lt(0.9). I'm just wondering if there's a way of chaining .gt().lt().
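For reference, the two working forms mentioned above can be sketched as follows. The sample values are made up for illustration; only the column name follows the question. Note that the operator form needs parentheses around each comparison, because & binds more tightly than > and <.

```python
import pandas as pd

# Hypothetical sample data standing in for the column described above.
df = pd.DataFrame({"column": [0.10, 0.75, 0.85, 0.95]})

# Operator form: each comparison is parenthesised because & has higher
# precedence than > and <.
mask_ops = (df["column"] > 0.7) & (df["column"] < 0.9)

# Method form: the method calls are evaluated before &, so no extra
# parentheses are needed.
mask_methods = df["column"].gt(0.7) & df["column"].lt(0.9)

# Both masks mark the same rows.
same = mask_ops.equals(mask_methods)
selected = df.loc[mask_ops, "column"].tolist()
print(same, selected)
```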
The misunderstanding comes from the fact that in Python True == 1 and False == 0 (see bool). Suppose we have:
import pandas as pd
data = {'col': [0.5, 0.8, 1]}
df = pd.DataFrame(data)
df['col'].gt(0.7)
When we chain .lt(0.9), this check takes place on the result of .gt(0.7):
0 False # 0 < 0.9 (True)
1 True # 1 < 0.9 (False)
2 True # 1 < 0.9 (False)
Name: col, dtype: bool
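A small sketch of that step, using the same data as above, may make the behaviour easier to see. The intermediate variable names are only for illustration. The chained call gives the same result as comparing the booleans as integers, which in turn is just the negation of the first mask, since 0 is below 0.9 and 1 is not.

```python
import pandas as pd

df = pd.DataFrame({"col": [0.5, 0.8, 1]})

# First comparison: a boolean Series (False, True, True).
step1 = df["col"].gt(0.7)

# Chained comparison: works on the booleans, not on the original values.
chained = step1.lt(0.9)

# Casting the booleans to integers (False -> 0, True -> 1) gives the same result.
as_ints = step1.astype(int).lt(0.9)

# Only 0 is below 0.9, so the chained result is the negation of the first mask.
negated = ~step1

print(chained.tolist())
```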
Use Series.between instead, with inclusive to control the comparison operators:
df['col'].between(0.7, 0.9, inclusive='neither')
0 False # 0.5
1 True # 0.8
2 False # 1
Name: col, dtype: bool
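As a rough sketch of how inclusive changes the boundaries, the example below adds the edge values 0.7 and 0.9 to the data (an assumption made purely for illustration). The string values for inclusive are available from pandas 1.3 onwards.

```python
import pandas as pd

# Extended sample data that includes the boundary values themselves.
df = pd.DataFrame({"col": [0.5, 0.7, 0.8, 0.9, 1.0]})

neither = df["col"].between(0.7, 0.9, inclusive="neither")  # 0.7 <  x <  0.9
both = df["col"].between(0.7, 0.9, inclusive="both")        # 0.7 <= x <= 0.9
left = df["col"].between(0.7, 0.9, inclusive="left")        # 0.7 <= x <  0.9

# between() returns a boolean mask, so it can be used directly to filter rows.
filtered = df.loc[neither, "col"].tolist()
print(filtered)
```

With inclusive='neither', the mask matches df['col'].gt(0.7) & df['col'].lt(0.9) on this data.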