I would like to filter an unpivoted LazyDataFrame by using nodejs-polars without having to collect (and lose the LazyDataFrame) in between.
Consider the following example CSV files:
1.csv:
asset_key
abc
asset_key |
---|
abc |
2.csv:
id;asset_key_1;asset_key_2;asset_key_3
1;123;456;abc
id | asset_key_1 | asset_key_2 | asset_key_3 |
---|---|---|---|
1 | 123 | 456 | abc |
I would first like to unpivot 2.csv, to have all asset_keys available in a new column. Then, I want to filter that column on the value available in 1.csv ("abc"), such that the remaining result after filtering would be:
id | variable | value |
---|---|---|
1 | asset_key_3 | abc |
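To make the intended result concrete, here is a minimal sketch of the unpivot-then-filter logic in plain TypeScript. It does not use nodejs-polars and says nothing about how the library evaluates the query; the helper names `unpivot` and `filterIsIn` and the row types are hypothetical, introduced only for illustration.

```typescript
// Plain TypeScript illustration of the expected result (no nodejs-polars involved).
type WideRow = Record<string, string | null>;
type LongRow = { id: string; variable: string; value: string };

// Hypothetical helper: turn each value column into its own (id, variable, value) row.
function unpivot(rows: WideRow[], idVar: string, valueVars: string[]): LongRow[] {
  const out: LongRow[] = [];
  for (const row of rows) {
    for (const variable of valueVars) {
      const value = row[variable];
      // Skip missing values, comparable to the dropNulls() step in the query.
      if (value == null) continue;
      out.push({ id: String(row[idVar]), variable, value });
    }
  }
  return out;
}

// Hypothetical helper: keep only rows whose value appears in the allowed keys.
function filterIsIn(rows: LongRow[], allowed: Iterable<string>): LongRow[] {
  const allowedSet = new Set(allowed);
  return rows.filter((r) => allowedSet.has(r.value));
}

// Contents of 2.csv and 1.csv, written inline as assumptions for this sketch.
const wide: WideRow[] = [
  { id: '1', asset_key_1: '123', asset_key_2: '456', asset_key_3: 'abc' },
];
const assetKeys: string[] = ['abc'];

const result = filterIsIn(
  unpivot(wide, 'id', ['asset_key_1', 'asset_key_2', 'asset_key_3']),
  assetKeys,
);
console.log(result);
```

The single remaining row corresponds to the table above: `id` 1, `variable` asset_key_3, `value` abc.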
Instead, I am getting this error:
"Error: Not found: value"
If I collect the LazyDataFrame into a DataFrame after melting and before filtering, it works. But I would like to know whether there is a way to do this without giving up the LazyDataFrame.
This is the code I use:
import * as pl from 'nodejs-polars';
const df_1: LazyDataFrame = pl.scanCSV('1.csv', { sep: ';' });
const df_2: LazyDataFrame = pl.scanCSV('2.csv', { sep: ';' });
const isInFilter: LazyDataFrame = df_1.select('asset_key');
const df: DataFrame = await df_2
  .melt('id', ['asset_key_1', 'asset_key_2', 'asset_key_3'])
  .dropNulls()
  .filter(pl.col('value').isIn(isInFilter['asset_key']))
  .collect();
This does look like a bug, not only in nodejs-polars but in polars as well. I opened an issue for you: https://github.com/pola-rs/polars/issues/4368