
How to Filter All Columns in a Polars DataFrame by expression?


I have this example Polars DataFrame:

import polars as pl 

df = pl.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "variable1": [15, None, 5, 10, 20],
    "variable2": [40, 30, 50, 10, None],
}) 

I'm trying to filter on all columns of my DataFrame using pl.all(), and I also tried pl.any_horizontal() with a condition. However, I'm getting the following error:

ComputeError: The predicate passed to 'LazyFrame.filter' expanded to multiple expressions: 

    col("id").is_not_null(),
    col("variable1").is_not_null(),
    col("variable2").is_not_null(),
This is ambiguous. Try to combine the predicates with the 'all' or `any' expression.

Here are my attempts:

# Attempt 1:
(
    df
    .filter(
        pl.all().is_not_null()
    )
)
# Attempt 2:
(
    df
    .filter(
        pl.any_horizontal().is_not_null()
    )
)

This gives the desired output, but it doesn't scale to DataFrames with many columns:

(
    df
    .filter(
        pl.col("variable1").is_not_null(),
        pl.col("variable2").is_not_null()
    )
)

How can I filter all columns in a scalable way without specifying each column individually?


Solution

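  • filter needs a single boolean expression, but pl.all().is_not_null() expands into one expression per column (three boolean columns here, one for each input column). You need to collapse those into a single boolean column. You can do that with pl.all_horizontal(your, columns, here), which is true for a row only when every input is true for that row: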

    >>> df.filter(pl.all_horizontal(pl.col('*').is_not_null()))
    shape: (3, 3)
    ┌─────┬───────────┬───────────┐
    │ id  ┆ variable1 ┆ variable2 │
    │ --- ┆ ---       ┆ ---       │
    │ i64 ┆ i64       ┆ i64       │
    ╞═════╪═══════════╪═══════════╡
    │ 1   ┆ 15        ┆ 40        │
    │ 3   ┆ 5         ┆ 50        │
    │ 4   ┆ 10        ┆ 10        │
    └─────┴───────────┴───────────┘