I am making a Maturin project involving Polars on both the Python and Rust side.
In Python I have a dataframe with columns a and b:
import polars as pl
df = pl.DataFrame({'a': [1, 2], 'b': ['foo', 'bar']})
In Rust I have a struct MyStruct with the fields a and b:
struct MyStruct {
    a: i64,
    b: String,
}
I would like to convert each row in the dataframe to an instance of MyStruct, mapping the dataframe to a vector of MyStruct values. This should be done on the Rust side.
I can get this done on the Python side (assuming MyStruct is exposed as a pyclass), first by getting a list of Python dicts and then constructing a Python list of MyStruct instances:
df_as_list = df.to_struct('MyStruct').to_list()
[MyStruct(**x) for x in df_as_list]
To spice things up a bit more, imagine that MyStruct has an enum field instead of a String field:
enum MyEnum {
    Foo,
    Bar,
}
struct MyStruct {
    a: i64,
    b: MyEnum,
}
With a suitable function string_to_myenum that maps strings to MyEnum (that is, "foo" to Foo and "bar" to Bar), it would be great to map the dataframe to the new MyStruct.
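For concreteness, string_to_myenum could look something like the sketch below. The Option return type and the handling of unknown strings are assumptions made for illustration; the actual function may differ.

```rust
// Illustrative sketch only: the signature and the `None` fallback are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MyEnum {
    Foo,
    Bar,
}

/// Maps "foo" to `Foo` and "bar" to `Bar`; any other string yields `None`.
fn string_to_myenum(s: &str) -> Option<MyEnum> {
    match s {
        "foo" => Some(MyEnum::Foo),
        "bar" => Some(MyEnum::Bar),
        _ => None,
    }
}
```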
Zip the columns together:
let arr: Vec<MyStruct> = df["a"]
    .i64()
    .expect("`a` column of wrong type")
    .iter()
    .zip(df["b"].str().expect("`b` column of wrong type").iter())
    .map(|(a, b)| {
        Some(MyStruct {
            a: a?,
            b: b?.to_owned(),
        })
    })
    .collect::<Option<Vec<_>>>()
    .expect("found unexpected null");
Note, however, that like I said in the comments, this will be slow, especially for large DataFrames. Prefer to do things using the Polars APIs where possible.
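If b should become a MyEnum rather than a String, the same zip pattern seems to carry over by replacing to_owned() with the conversion function. The sketch below uses plain slices of Option values as stand-ins for the two Polars column iterators so that it compiles without the polars crate. rows_to_structs is a hypothetical name, and string_to_myenum is assumed to return Option<MyEnum>, since the question does not show it.

```rust
#[derive(Debug, PartialEq)]
enum MyEnum {
    Foo,
    Bar,
}

#[derive(Debug, PartialEq)]
struct MyStruct {
    a: i64,
    b: MyEnum,
}

// Assumed shape of the conversion function; unknown strings yield `None`.
fn string_to_myenum(s: &str) -> Option<MyEnum> {
    match s {
        "foo" => Some(MyEnum::Foo),
        "bar" => Some(MyEnum::Bar),
        _ => None,
    }
}

// Hypothetical helper. `col_a` and `col_b` stand in for the nullable values
// produced by iterating the `a` and `b` columns.
fn rows_to_structs(col_a: &[Option<i64>], col_b: &[Option<&str>]) -> Option<Vec<MyStruct>> {
    col_a
        .iter()
        .zip(col_b.iter())
        .map(|(a, b)| {
            // `?` turns a null cell or an unknown string into `None` for this row.
            Some(MyStruct {
                a: (*a)?,
                b: string_to_myenum((*b)?)?,
            })
        })
        // Collecting into `Option<Vec<_>>` returns `None` as soon as any row is `None`.
        .collect()
}
```

Here a null in either column and an unrecognized string both collapse the whole result to None. If those two cases need to be told apart, returning a Result with a dedicated error type may be a better fit.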