performance rust casting

Unexpected low latency when casting


This is not a problem, but it puzzles me a little. I have this code in Rust:

fn without_cast() -> i32 {
    let a = 10;
    return a;
}

fn with_cast() -> i32 {
    let a = 10.0;
    return a as i32;
}

fn main() {
    let total_time = std::time::Instant::now();
    for _ in 0..1000000 {
        without_cast();
    }
    println!("Total time taken without cast: {:?}", total_time.elapsed());

    let total_time = std::time::Instant::now();
    for _ in 0..1000000 {
        with_cast();
    }
    println!("Total time taken with cast: {:?}", total_time.elapsed());
}

I thought without_cast would be faster than with_cast, but when I run this program I get these results:

Total time taken without cast: 92ns
Total time taken with cast: 32ns

I was expecting the results to be the other way around. Does anyone know why I am getting these results? I ran this with cargo run --release; without the --release flag, both functions take the same time.


Solution

  • First of all, as other users have already said in the comments, keep in mind that the Rust compiler is very powerful and optimizes many things, especially with the --release flag. Timings of 92 ns and 32 ns for a million iterations strongly suggest that both loops were optimized away entirely, so the difference you see is most likely noise rather than the cost of a cast. The most reliable way to check is to look at the final binary, but that can take a while and is not easy in practice.

    Second, I would recommend the built-in benchmarks: https://doc.rust-lang.org/cargo/commands/cargo-bench.html (the #[bench] attribute requires the Nightly toolchain). This is good practice.
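
    If you want to stay on the stable toolchain, a rough sketch of the original timing loop can be written with std::hint::black_box, which asks the compiler to treat a value as opaque. Two assumptions here: I changed the function signatures to take the value as a parameter (the original functions take none), and black_box is only a best-effort hint, so treat the numbers as indicative rather than exact.

    ```rust
    use std::hint::black_box;
    use std::time::Instant;

    // Same bodies as in the question, but the value is passed in
    // so that it can be hidden from the optimizer by the caller.
    fn without_cast(a: i32) -> i32 {
        a
    }

    fn with_cast(a: f64) -> i32 {
        // `as` truncates toward zero when converting float to int.
        a as i32
    }

    fn main() {
        const ITERATIONS: u32 = 1_000_000;

        let start = Instant::now();
        for _ in 0..ITERATIONS {
            // Inner black_box hides the input, outer one marks the result as used.
            black_box(without_cast(black_box(10)));
        }
        println!("Total time taken without cast: {:?}", start.elapsed());

        let start = Instant::now();
        for _ in 0..ITERATIONS {
            black_box(with_cast(black_box(10.0)));
        }
        println!("Total time taken with cast: {:?}", start.elapsed());
    }
    ```

    Even with this change, a single scalar cast is so cheap that loop overhead and timer resolution can dominate the measurement, so a proper benchmark harness is still the better tool.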

    If you are still curious about the casting performance, I prepared a brief example:

    #![feature(test)]
    
    extern crate test;
    
    #[cfg(test)]
    mod tests {
        use test::Bencher;
    
        #[bench]
        fn bench_without_cast(bencher: &mut Bencher) {
            let b = vec![10i32; 1000];
            bencher.iter(|| {
                let a = vec![10i32; 1000];
                assert_eq!(a, b);
            });
        }
    
        #[bench]
        fn bench_with_cast(bencher: &mut Bencher) {
            let b = vec![10i32; 1000];
            bencher.iter(|| {
                let a = vec![10.0f32; 1000];
                let a = a.into_iter().map(|x| x as i32).collect::<Vec<i32>>();
                assert_eq!(a, b);
            });
        }
    }
    

    Instead of a single a, I work with a vector and cast all of its elements. Since the vector is allocated on the heap, the compiler cannot apply as many optimizations, so we can try to measure the casting performance "honestly" (neglecting the creation of the new vector and some other side effects), as it would behave for heap-allocated data. For statically allocated data it is hard to judge, because the compiler applies many optimizations in unpredictable places.

    The results on my machine are as follows:

    running 2 tests
    test tests::bench_with_cast    ... bench:         720.88 ns/iter (+/- 5.27)
    test tests::bench_without_cast ... bench:          69.03 ns/iter (+/- 0.23)
    
    test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured; 0 filtered out; finished in 0.64s
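
    Note that both benchmarks above also time the vector creation and the assert_eq! comparison. As a possible refinement, here is a minimal sketch for stable Rust that allocates the buffers once, outside the timed loops, and compares a plain copy with a casting copy. The helper names copy_into and cast_into and the LEN and ROUNDS constants are my own choices for illustration, and std::hint::black_box is only a best-effort barrier, so the output should be read as a rough indication.

    ```rust
    use std::hint::black_box;
    use std::time::Instant;

    /// Baseline: copies `src` into `dst` without any conversion.
    fn copy_into(src: &[i32], dst: &mut [i32]) {
        dst.copy_from_slice(src);
    }

    /// Converts every element of `src` to i32 with `as` and stores it in `dst`.
    fn cast_into(src: &[f32], dst: &mut [i32]) {
        for (d, s) in dst.iter_mut().zip(src) {
            *d = *s as i32;
        }
    }

    fn main() {
        const LEN: usize = 1000;
        const ROUNDS: u32 = 100_000;

        // All buffers are allocated once, before any timing starts.
        let ints = vec![10i32; LEN];
        let floats = vec![10.0f32; LEN];
        let mut out = vec![0i32; LEN];

        let start = Instant::now();
        for _ in 0..ROUNDS {
            copy_into(black_box(&ints[..]), &mut out);
            black_box(&out); // keep the written data observable
        }
        println!("without cast: {:?} per round", start.elapsed() / ROUNDS);

        let start = Instant::now();
        for _ in 0..ROUNDS {
            cast_into(black_box(&floats[..]), &mut out);
            black_box(&out);
        }
        println!("with cast:    {:?} per round", start.elapsed() / ROUNDS);
    }
    ```

    Run it with cargo run --release. The remaining difference between the two loops is closer to the cost of the conversion itself, although the compiler may vectorize both loops differently.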