I'm having a problem using Rust to convert a Polars DataFrame with string values into an ndarray without one-hot encoding.
The code I used is the following:
println!("{:?}", _df.to_ndarray::<Float64Type>(Default::default()).unwrap());
Is there a solution for this?
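I think you can iterate over each column in the DataFrame and convert it to a numeric representation, for example by giving every distinct string its own integer code (label encoding). The resulting DataFrame, df_numeric, will then hold numeric values instead of strings, and you can call the to_ndarray method on it to get an ndarray.
Note that to_ndarray returns an array of plain native values (for example Array2<u32>), not Option values. As far as I know, nulls in integer columns are not supported by to_ndarray, so missing values have to be filled or dropped before the conversion.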
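use std::collections::HashMap;
use ndarray::Array2;
use polars::prelude::*;

// Label-encode one string column: each distinct string gets the next integer code
fn label_encode(s: &Series) -> PolarsResult<Series> {
    let mut codes: HashMap<&str, u32> = HashMap::new();
    let encoded: Vec<Option<u32>> = s
        .utf8()? // assumes a Polars version with `utf8()`; newer releases call it `str()`
        .into_iter()
        .map(|opt| opt.map(|v| { let next = codes.len() as u32; *codes.entry(v).or_insert(next) }))
        .collect();
    Ok(Series::new(s.name(), encoded))
}

fn main() -> PolarsResult<()> {
    // Make a Polars DataFrame with string values
    let df = DataFrame::new(vec![
        Series::new("col1", &["a", "b", "c"]),
        Series::new("col2", &["x", "y", "z"]),
    ])?;
    // Convert the string columns to a numeric representation
    let columns = df.get_columns().iter().map(label_encode).collect::<PolarsResult<Vec<_>>>()?;
    let df_numeric = DataFrame::new(columns)?;
    // Convert the DataFrame to an ndarray (plain u32 values, no Option; needs the `ndarray` feature)
    let ndarray: Array2<u32> = df_numeric.to_ndarray::<UInt32Type>(Default::default())?;
    println!("{:?}", ndarray);
    Ok(())
}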
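If it helps to see the encoding step on its own, here is a minimal sketch of label encoding that uses only the standard library, with no Polars involved. The helper names label_encode and encode_row_major are made up for this illustration, and it assumes every column has the same length. Codes are assigned in order of first appearance within each column.

```rust
use std::collections::HashMap;

/// Label-encode one column: each distinct string gets the next unused
/// integer code, in order of first appearance.
fn label_encode(column: &[&str]) -> Vec<u32> {
    let mut codes: HashMap<&str, u32> = HashMap::new();
    column
        .iter()
        .map(|&value| {
            let next = codes.len() as u32;
            *codes.entry(value).or_insert(next)
        })
        .collect()
}

/// Encode every column, then lay the codes out row by row (row-major).
/// Assumes all columns have the same length; indexing panics otherwise.
fn encode_row_major(columns: &[Vec<&str>]) -> (usize, usize, Vec<u32>) {
    let n_cols = columns.len();
    let n_rows = columns.first().map_or(0, |c| c.len());
    let encoded: Vec<Vec<u32>> = columns.iter().map(|c| label_encode(c)).collect();
    let mut flat = Vec::with_capacity(n_rows * n_cols);
    for row in 0..n_rows {
        for col in &encoded {
            flat.push(col[row]);
        }
    }
    (n_rows, n_cols, flat)
}

fn main() {
    let columns = vec![vec!["a", "b", "a"], vec!["x", "y", "z"]];
    let (rows, cols, flat) = encode_row_major(&columns);
    println!("{rows} x {cols}: {flat:?}");
}
```

The flat row-major vector together with its shape can then be handed to ndarray, for example with Array2::from_shape_vec((rows, cols), flat), if you want to build the array without going through to_ndarray.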