I am trying to create a small machine learning application. I read data from a CSV file and convert it into a DMatrix from the nalgebra library. To split the dataset into training and test subsets, I would like to use the smartcore function train_test_split.
I am having problems when using this function with the DMatrix generated from the csv. Could you tell me why they happen and how I could solve them?
Here is the code:
use std::error::Error;
use std::io::BufReader;
use std::io::BufRead;
use std::fs::File;
use nalgebra::DMatrix;
use std::str::FromStr;
use smartcore::model_selection::train_test_split;

fn read_csv(input: &mut dyn BufRead) -> Result<DMatrix<f64>, Box<dyn Error>> {
    let mut samples = Vec::new();
    let mut rows = 0;
    for line in input.lines().skip(1){
        rows += 1;
        for data in line?.split_terminator(",") {
            let a = f64::from_str(data.trim());
            match a {
                Ok(value) => samples.push(value),
                Err(e) => println!("Error parsing data in row: {}", rows),
            }
        }
    }
    let cols = samples.len() / rows;
    Ok(DMatrix::from_row_slice(rows, cols, &samples[..]))
}

fn main() -> Result<(), Box<dyn Error>> {
    //Load CSV
    let file = File::open("dataset/heart.csv").unwrap();
    let data: DMatrix<f64> = read_csv(&mut BufReader::new(file)).unwrap();
    let x = data.columns(0, 13).into_owned();
    let y = data.column(13).into_owned();
    // ERROR
    let (x_train, x_test, y_train, y_test) = train_test_split(&x, &y.transpose(), 0.2, true);
    println!("{:?}", x_train);
    Ok(())
}
Here is the error I get back:
error[E0277]: the trait bound `nalgebra::Matrix<f64, Dyn, Dyn, VecStorage<f64, Dyn, Dyn>>: smartcore::linalg::Matrix<_>` is not satisfied
   --> src/main.rs:53:63
    |
53  |     let (x_train, x_test, y_train, y_test) = train_test_split(&x, &y.transpose(), 0.2, true);
    |                                              ---------------- ^^ the trait `smartcore::linalg::Matrix<_>` is not implemented for `nalgebra::Matrix<f64, Dyn, Dyn, VecStorage<f64, Dyn, Dyn>>`
    |                                              |
    |                                              required by a bound introduced by this call
    |
    = help: the following other types implement trait `smartcore::linalg::Matrix<T>`:
              DenseMatrix<T>
              nalgebra::base::matrix::Matrix<T, nalgebra::base::dimension::Dynamic, nalgebra::base::dimension::Dynamic, nalgebra::base::vec_storage::VecStorage<T, nalgebra::base::dimension::Dynamic, nalgebra::base::dimension::Dynamic>>
note: required by a bound in `train_test_split`
    |
133 | pub fn train_test_split<T: RealNumber, M: Matrix<T>>(
    |                                           ^^^^^^^^^ required by this bound in `train_test_split`

error[E0277]: the trait bound `nalgebra::Matrix<f64, Dyn, Dyn, VecStorage<f64, Dyn, Dyn>>: BaseMatrix<_>` is not satisfied
  --> src/main.rs:53:46
   |
53 |     let (x_train, x_test, y_train, y_test) = train_test_split(&x, &y.transpose(), 0.2, true);
   |                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `BaseMatrix<_>` is not implemented for `nalgebra::Matrix<f64, Dyn, Dyn, VecStorage<f64, Dyn, Dyn>>`
   |
   = help: the following other types implement trait `BaseMatrix<T>`:
             DenseMatrix<T>
             nalgebra::base::matrix::Matrix<T, nalgebra::base::dimension::Dynamic, nalgebra::base::dimension::Dynamic, nalgebra::base::vec_storage::VecStorage<T, nalgebra::base::dimension::Dynamic, nalgebra::base::dimension::Dynamic>>

For more information about this error, try `rustc --explain E0277`.
error: could not compile `logistic-regression` due to 2 previous errors
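The issue is that smartcore@0.2.0 depends on nalgebra@0.23.2, so it only implements its traits for the nalgebra types of that version. Your project uses a newer nalgebra, and Rust treats the same type coming from two versions of a crate as two unrelated types. You can see this in the error: your matrix is parameterized with Dyn, while the implementation listed in the help text uses Dynamic. Until there's an updated smartcore that depends on a more recent version of nalgebra, you'll have to downgrade to that same nalgebra version: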
[dependencies]
nalgebra = "0.23.2"
smartcore = "0.2.0"
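To illustrate why a version mismatch produces a "trait not implemented" error even though the type names look the same, here is a toy model that uses only the standard library. The modules nalgebra_old and nalgebra_new, the trait Splittable, and the functions count_rows and to_old are all hypothetical stand-ins invented for this sketch; they are not part of nalgebra or smartcore.

```rust
// Toy model: two modules stand in for two versions of the same crate.
// Every name below is hypothetical and only mimics the real situation.
pub mod nalgebra_old {
    #[derive(Debug, Clone, PartialEq)]
    pub struct DMatrix {
        pub rows: usize,
        pub cols: usize,
        pub data: Vec<f64>,
    }
}

pub mod nalgebra_new {
    #[derive(Debug, Clone, PartialEq)]
    pub struct DMatrix {
        pub rows: usize,
        pub cols: usize,
        pub data: Vec<f64>,
    }
}

// Stand-in for a trait that a downstream crate implements only for the
// matrix type of the version it was compiled against.
pub trait Splittable {
    fn n_rows(&self) -> usize;
}

impl Splittable for nalgebra_old::DMatrix {
    fn n_rows(&self) -> usize {
        self.rows
    }
}

// Stand-in for a generic function with a trait bound.
pub fn count_rows<M: Splittable>(m: &M) -> usize {
    m.n_rows()
}

// Manual bridge: rebuild the "old" type from the "new" type's raw data.
pub fn to_old(m: &nalgebra_new::DMatrix) -> nalgebra_old::DMatrix {
    nalgebra_old::DMatrix {
        rows: m.rows,
        cols: m.cols,
        data: m.data.clone(),
    }
}

fn main() {
    let new = nalgebra_new::DMatrix {
        rows: 2,
        cols: 2,
        data: vec![1.0, 2.0, 3.0, 4.0],
    };
    // count_rows(&new); // would not compile: `Splittable` is not implemented
    //                   // for `nalgebra_new::DMatrix` (error E0277)
    let old = to_old(&new);
    println!("{}", count_rows(&old)); // prints 2
}
```

The two DMatrix structs have identical fields, yet the compiler only accepts the one the trait was implemented for. Pinning both crates to the same nalgebra version, as in the Cargo.toml above, removes the second type entirely, so no conversion is needed.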