I have a very large file generated by other tools, but I only need a few of its columns. With Python pandas I can specify which columns to read, but I don't know how to do the same in Rust.
Thanks.
I would like Rust to do the equivalent of this pandas call:
data = pd.read_csv(file, sep='\t', header=None, usecols=[0,1,5])
I am assuming that you want to use rust-polars. This is how you could achieve the same thing in Rust.
I have also added comments to explain what each step does.
use polars::prelude::*;

fn main() {
    let path = "<PATH_TO_THE_FILE>";

    // Without a header row, polars names the columns "column_1", "column_2", etc.
    // The names are 1-based, so pandas index 5 corresponds to "column_6".
    let columns_to_select = ["column_1".into(), "column_2".into(), "column_6".into()];

    let df = CsvReadOptions::default()
        .with_has_header(false) // equivalent to `header=None` in pandas
        .map_parse_options(|parse_options| parse_options.with_separator(b'\t')) // custom separator, equivalent to `sep='\t'` in pandas
        .with_columns(Some(Arc::new(columns_to_select))) // select columns by name, equivalent to `usecols=[0, 1, 5]` in pandas
        .try_into_reader_with_file_path(Some(path.into())) // specify the file path
        .unwrap()
        .finish()
        .unwrap();

    println!("{:?}", df);
}
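Alternatively, use the with_projection method if you want to select the columns by index. The indices are 0-based, as in pandas. For example, .with_projection(Some(Arc::new(vec![0, 1]))) will select the first and second columns, and vec![0, 1, 5] would match usecols=[0,1,5] from the question.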
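If pulling in polars is more than you need, the same index-based selection can be sketched with only the standard library. This is a minimal illustration, not what polars does internally: `select_columns` is a hypothetical helper, it assumes plain tab-separated text with no quoted fields or embedded tabs, and it fills in an empty string when a row is too short to contain a requested index.

```rust
use std::io::{self, BufRead};

/// Keep only the fields at `indices` (0-based) from each tab-separated line.
/// Hypothetical helper for illustration; it does not handle quoting or escapes.
fn select_columns<R: BufRead>(reader: R, indices: &[usize]) -> io::Result<Vec<Vec<String>>> {
    let mut rows = Vec::new();
    for line in reader.lines() {
        let line = line?;
        let fields: Vec<&str> = line.split('\t').collect();
        // Pick the requested fields; a missing field becomes an empty string.
        let row = indices
            .iter()
            .map(|&i| fields.get(i).copied().unwrap_or("").to_string())
            .collect();
        rows.push(row);
    }
    Ok(rows)
}

fn main() -> io::Result<()> {
    // In-memory sample standing in for the real file. For a file on disk,
    // pass `std::io::BufReader::new(std::fs::File::open(path)?)` instead.
    let data = "a\tb\tc\td\te\tf\n1\t2\t3\t4\t5\t6\n";
    let rows = select_columns(data.as_bytes(), &[0, 1, 5])?;
    for row in &rows {
        println!("{}", row.join("\t"));
    }
    Ok(())
}
```

This version returns every selected field as a `String` and holds all rows in memory. For a very large file you may prefer to process each row inside the loop instead of collecting them, or to stay with polars, which also infers column types for you.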