In pandas, one can do:
import pandas as pd
d = {"foo":[1,2,3, None], "bar":[4,None, None, 6]}
df_pandas = pd.DataFrame.from_dict(d)
dict(df_pandas.isnull().sum())
[out]:
{'foo': 1, 'bar': 2}
In polars it's possible to do the same by looping through the columns:
import polars as pl
d = {"foo":[1,2,3, None], "bar":[4,None, None, 6]}
df_polars = pl.from_dict(d)
{col:df_polars[col].is_null().sum() for col in df_polars.columns}
Looping through the columns in polars is particularly painful when using a LazyFrame, since the .collect() then has to be done in chunks to do the aggregation.
Is there a way to find the number of nulls in every column of a polars dataframe without looping through each column?
Assuming you're not married to the output format, the idiomatic way to do it is...
df.select(pl.all().is_null().sum())
However, if you really like the dict output, you can easily get it...
df.select(pl.all().is_null().sum()).to_dicts()[0]
The way this works is that inside the select we start with pl.all(), which means all of the columns, and then, much like in the pandas version, we apply is_null, which returns True/False for each value. From that we chain sum, which treats each True as 1 and gives you the number of nulls in each column.
There's also the dedicated null_count(), so you don't have to chain is_null().sum(). Thanks to @jqurious for that tip.