rust rust-polars maturin

Convert Polars dataframe to vector of structs


I am making a Maturin project involving Polars on both the Python and Rust side.

In Python I have a dataframe with columns a and b:

import polars as pl
df = pl.DataFrame({'a': [1, 2], 'b': ['foo', 'bar']})

In Rust I have a struct MyStruct with the fields a and b:

struct MyStruct {
  a: i64,
  b: String,
}

I would like to convert each row in the dataframe to an instance of MyStruct, mapping the dataframe to a vector of MyStructs. This should be done on the Rust side.

I can get this done on the Python side (assuming MyStruct is exposed as a pyclass): I first get a list of Python dicts and then construct a Python list of MyStruct instances.

df_as_list = df.to_struct('MyStruct').to_list()
[MyStruct(**x) for x in df_as_list]

To spice things up a bit more, imagine that MyStruct has an enum field instead of a String field:

enum MyEnum {
  Foo,
  Bar,
}
struct MyStruct {
  a: i64,
  b: MyEnum,
}

Given a suitable function string_to_myenum that maps strings to MyEnum (that is, "foo" to Foo and "bar" to Bar), I would also like to map the dataframe to a vector of the new MyStruct.


Solution

  • Zip the columns together:

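    // Iterate both columns in lockstep; each iterator yields an Option per row.
    let arr: Vec<MyStruct> = df["a"]
        .i64()
        .expect("`a` column of wrong type")
        .iter()
        .zip(df["b"].str().expect("`b` column of wrong type").iter())
        .map(|(a, b)| {
            // `?` turns a null in either column into `None` for that row.
            Some(MyStruct {
                a: a?,
                b: b?.to_owned(),
            })
        })
        // Collecting into Option<Vec<_>> gives None if any row was None.
        .collect::<Option<Vec<_>>>()
        .expect("found unexpected null");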
    

    Note, however, that, as I said in the comments, this row-by-row conversion will be slow, especially for large DataFrames. Prefer to do the work with the Polars APIs where possible.
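
For the enum variant of MyStruct from the question, the same zip-and-collect pattern should carry over. Here is a minimal sketch, not tested against Polars itself. To stay self-contained it takes plain iterators of Option<i64> and Option<&str> (the item types the column iterators above yield) instead of a DataFrame. string_to_myenum is the helper assumed in the question; its body here and the name rows_to_structs are hypothetical.

```rust
#[derive(Debug, PartialEq)]
enum MyEnum {
    Foo,
    Bar,
}

#[derive(Debug, PartialEq)]
struct MyStruct {
    a: i64,
    b: MyEnum,
}

// Hypothetical body for the helper named in the question:
// maps "foo" to Foo and "bar" to Bar, anything else to None.
fn string_to_myenum(s: &str) -> Option<MyEnum> {
    match s {
        "foo" => Some(MyEnum::Foo),
        "bar" => Some(MyEnum::Bar),
        _ => None,
    }
}

// Hypothetical helper: zips two column-like iterators into a Vec<MyStruct>.
// Returns None if any value is null or any string is not a known variant.
fn rows_to_structs<'a, A, B>(a_col: A, b_col: B) -> Option<Vec<MyStruct>>
where
    A: IntoIterator<Item = Option<i64>>,
    B: IntoIterator<Item = Option<&'a str>>,
{
    a_col
        .into_iter()
        .zip(b_col)
        .map(|(a, b)| {
            Some(MyStruct {
                a: a?,
                b: string_to_myenum(b?)?,
            })
        })
        .collect()
}
```

With a real DataFrame, the two arguments would presumably be the two column iterators from the snippet above. Returning Option here instead of calling expect leaves it to the caller to decide how to handle nulls and unknown strings.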