In Polars 0.46.0 it works normally:
    use polars::prelude::*;

    let df = df!(
        "id" => [0, 1, 2, 3, 4],
        "col_1" => [1, 2, 3, 4, 5],
        "col_2" => [3, 4, 5, 6, 7],
    )
    .unwrap();
    dbg!(&df);

    let s = df.column("col_2").unwrap().as_materialized_series();
    let combo = df
        .clone()
        .lazy()
        .filter(col("id").is_in(lit(s.clone()), false))
        .collect()
        .unwrap();
    dbg!(&combo);
The same code in Polars 0.50.0 emits a deprecation warning:

    Deprecation: `is_in` with a collection of the same datatype is ambiguous and deprecated. Please use `implode` to return to previous behavior. See https://github.com/pola-rs/polars/issues/22149 for more information.

How should I write this in Polars 0.50.0 to avoid the deprecation warning?

Since the column is already part of the DataFrame, you can just use

    .filter(col("id").is_in(col("col_2").implode(), false))

If you wanted to do this with an arbitrary series, you could use

    .filter(col("id").is_in(lit(s.clone()).implode(), false))

What `implode` does is remove the ambiguity that comes from asking if, say, `[1,2,3]` is in `[1,3,5]`. Is this matching row-wise, asking if `1` is in `[1]`, `2` in `[3]`, and `3` in `[5]` (true, false, false), or column-wise, asking if `1` is in `[1,3,5]`, `2` in `[1,3,5]`, and `3` in `[1,3,5]` (true, false, true)? `implode` resolves this by turning a column into a single list, making `is_in` operate column-wise (which is what you'd expect, but now it's not ambiguous anymore).
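
To make the two readings concrete, here is a minimal sketch in plain Rust with no Polars dependency. The function names `is_in_row_wise` and `is_in_column_wise` are hypothetical and chosen only for illustration; they model the two interpretations described above and are not how Polars implements `is_in`.

```rust
/// Row-wise reading: element i of `left` is compared only with element i of `right`.
/// (Hypothetical helper, for illustration only.)
fn is_in_row_wise(left: &[i32], right: &[i32]) -> Vec<bool> {
    left.iter().zip(right.iter()).map(|(l, r)| l == r).collect()
}

/// Column-wise reading: each element of `left` is looked up in the whole of `right`,
/// as if `right` had been collapsed into one list first.
/// (Hypothetical helper, for illustration only.)
fn is_in_column_wise(left: &[i32], right: &[i32]) -> Vec<bool> {
    left.iter().map(|l| right.contains(l)).collect()
}

fn main() {
    let left = [1, 2, 3];
    let right = [1, 3, 5];

    // prints [true, false, false]
    println!("{:?}", is_in_row_wise(&left, &right));
    // prints [true, false, true]
    println!("{:?}", is_in_column_wise(&left, &right));
}
```

Applied to the data in the question, the column-wise reading of `id = [0, 1, 2, 3, 4]` against `col_2 = [3, 4, 5, 6, 7]` gives (false, false, false, true, true), so the filter would keep the rows with `id` 3 and 4.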