What is the most efficient way to read a large file in chunks without loading the entire file in memory at once?

What is the most efficient general purpose way of reading "large" files (which may be text or binary), without going into unsafe territory? I was surprised how few relevant results there were when I did a web search for "rust read large file in chunks".

For example, one of my use cases is to calculate an MD5 checksum for a file using rust-crypto (the Md5 module allows you to add &[u8] chunks iteratively).
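To make that use case concrete, here is a minimal std-only sketch of the pattern: read in fixed-size chunks and feed each chunk to an accumulator, the same shape as calling the Md5 module's update method per chunk. The function name and the running byte sum are illustrative stand-ins (not rust-crypto API), and a Cursor stands in for an actual file:

```rust
use std::io::{self, Read};

// Reads from any `Read` source in fixed-size chunks, handing each valid
// slice to an accumulator -- the same shape as feeding &[u8] chunks to a
// digest. `sum_in_chunks` and the byte-sum "digest" are stand-ins.
fn sum_in_chunks(mut reader: impl Read, chunk_size: usize) -> io::Result<u64> {
    let mut buf = vec![0u8; chunk_size];
    let mut sum: u64 = 0;
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break; // EOF
        }
        // Only the first `n` bytes are valid; process exactly that slice.
        sum += buf[..n].iter().map(|&b| b as u64).sum::<u64>();
    }
    Ok(sum)
}

fn main() -> io::Result<()> {
    // A Cursor stands in for File::open("my.file")? here.
    let data = io::Cursor::new(vec![1u8; 1_000]);
    let total = sum_in_chunks(data, 128)?;
    println!("byte sum = {total}");
    Ok(())
}
```

With a real hasher you would replace the byte sum with a call that absorbs `&buf[..n]` on each iteration.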

Here is what I have, which seems to perform slightly better than some other methods like read_to_end:

use std::{
    fs::File,
    io::{self, BufRead, BufReader},
};

fn main() -> io::Result<()> {
    const CAP: usize = 1024 * 128;
    let file = File::open("my.file")?;
    let mut reader = BufReader::with_capacity(CAP, file);

    loop {
        let length = {
            let buffer = reader.fill_buf()?;
            // do stuff with buffer here
            buffer.len()
        };
        if length == 0 {
            break;
        }
        reader.consume(length);
    }

    Ok(())
}


  • I don't think you can write code more efficient than that. fill_buf on a BufReader over a File is basically just a straight call to read(2).

    That said, BufReader isn't really a useful abstraction when you use it like that; it would probably be less awkward to just call file.read(&mut buf) directly.
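    For reference, the direct-read variant that comment suggests might look something like this (a sketch under assumptions: the function name is made up, and the chunk size matches the question's CAP):

    ```rust
    use std::io::{self, Read};

    const CAP: usize = 1024 * 128;

    // Read any source in CAP-sized chunks by calling `read` directly,
    // with no BufReader. Returns the total number of bytes seen.
    // `process_in_chunks` is an illustrative name, not a std API.
    fn process_in_chunks(mut source: impl Read) -> io::Result<u64> {
        let mut buf = vec![0u8; CAP];
        let mut total = 0u64;
        loop {
            match source.read(&mut buf)? {
                0 => break, // EOF
                n => {
                    // do stuff with &buf[..n] here
                    total += n as u64;
                }
            }
        }
        Ok(total)
    }

    fn main() -> io::Result<()> {
        // A Cursor stands in for an opened File.
        let total = process_in_chunks(io::Cursor::new(vec![0u8; 300_000]))?;
        println!("processed {total} bytes");
        Ok(())
    }
    ```

    This does one less layer of copying bookkeeping than the fill_buf/consume dance, at the cost of managing your own buffer.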