I can use the drop_nans() function to remove rows with some or all columns set as nan.
Is there an equivalent function for dropping rows with all columns having value 0?
import polars as pl
df = pl.DataFrame({"a":[0, 0, 0, 0, 30],
"b":[0, 0, 0, 0, 40],
"c":[0, 0, 0, 0, 50]})
>>> df
  a    b    c
i64  i64  i64
-------------
  0    0    0
  0    0    0
  0    0    0
  0    0    0
 30   40   50
In this example, I would like to drop the first 4 rows from the dataframe.
You can use remove with pl.all_horizontal, applied to a condition on pl.all():
df.remove(pl.all_horizontal(pl.all() == 0))
Output:
shape: (1, 3)
┌─────┬─────┬─────┐
│ a ┆ b ┆ c │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 30 ┆ 40 ┆ 50 │
└─────┴─────┴─────┘
This is also possible via filter on the inverse:
df.filter(~pl.all_horizontal(pl.all() == 0))
# same output
To remove rows with at least one 0, use pl.any_horizontal:
df = pl.DataFrame({"a":[1, 0, 0, 0, 30],
"b":[0, 1, 0, 0, 40],
"c":[0, 0, 1, 0, 50]})
df.remove(pl.any_horizontal(pl.all() == 0))
# same output