rust

Should I always choose the smallest possible integer type?


Rust has many integer types, such as i32 and u8, and I am not sure which type I should choose when writing performant code. For example, which code performs better, A or B?

// A
let x: u8 = 1;
let must_be_i32: i32 = x as i32 + 1;
// B
let x: i32 = 1;
let must_be_i32: i32 = x + 1;

I always write code like A to use less memory, but writing type conversions so many times is tedious, and I feel that what I am doing is meaningless...

So my question is when to use the signed/unsigned and i8/i16...i128 types.


Solution

  • No, at least not for the reasons you seem to be describing.

    You should select your integer type based on the domain of values it represents, not on the value of a particular instance. An obvious one is the domain of lengths, capacities, and indexes: usize. A timestamp? Often u64. Bytes? Clearly u8. Windows' cursor and window positions? They happen to be i32. Blockchain balances? u256 (not native to Rust). If you're doing lots of numerical operations, consider the range of values those operations must handle. i32 is a decent starting point, but you can go higher if required.

    You will likely uncover the domain naturally by leaving the integer type unannotated and letting the compiler infer it. The only time you should need an as cast is where those domains intersect.
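
    As a rough sketch of that idea (the function name mean_byte is made up for illustration, not taken from any library): the bytes stay u8, the running sum is left unannotated and gets inferred, and a cast appears only where the length domain (usize) meets the sum.

```rust
/// Hypothetical helper: mean of a byte slice, or None if it is empty.
pub fn mean_byte(bytes: &[u8]) -> Option<u8> {
    if bytes.is_empty() {
        return None;
    }
    // No annotation: inferred as u64 from the `+=` below.
    let mut sum = 0;
    for &b in bytes {
        sum += u64::from(b); // lossless widening, no `as` needed
    }
    // Domains meet here: a length (usize) divides a sum (u64).
    let mean = sum / bytes.len() as u64;
    // The mean of u8 values is at most 255, so this cast cannot truncate.
    Some(mean as u8)
}
```

    Calling mean_byte(&[10, 20, 30]) should give Some(20). Only the two places where different domains touch need a conversion; everything else falls out of inference.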

    This guidance has mostly not been framed around performance, and the reason is that there is no point optimizing the integer size of local variables. The compiler has the most visibility into the values and operations within a function, and thus will make its own choices about how to implement them.

    For a more concrete justification, this "integer size optimized" function:

    pub fn f(data: &[i32; 1024]) {
        let offset: u8 = 5;
        for i in 0u8..100u8 {
            let value = data[i as usize] + offset as i32;
            std::hint::black_box(value);
        }
    }
    

    compiles to the exact same instructions as the natural version:

    pub fn f(data: &[i32; 1024]) {
        let offset = 5;
        for i in 0..100 {
            let value = data[i] + offset;
            std::hint::black_box(value);
        }
    }
    

    Choosing u8 for these variables did not save any memory. Note: black_box was used to keep the loop from being optimized away completely.

    Now, are there cases where local integer size could affect performance? Sure, it's possible that in a function more complicated than the one shown above, smaller integers could reduce the stack frame size or change the emitted instructions so that the binary is smaller. But your program's performance is much more likely to be bottlenecked by cache pressure from your in-memory data structures, or by some other limiting factor, than by the size of local integer variables.
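
    To illustrate where integer width does show up in memory, here is a minimal sketch comparing two element types for a large buffer. The names PixelWide, PixelNarrow, and buffer_bytes are hypothetical, and the sizes in the comments are what current compilers produce for these layouts.

```rust
use std::mem::size_of;

/// Hypothetical pixel with wider-than-needed channels: 16 bytes per element.
pub struct PixelWide {
    pub r: u32,
    pub g: u32,
    pub b: u32,
    pub a: u32,
}

/// Same data, with each channel in its natural domain (0..=255): 4 bytes per element.
pub struct PixelNarrow {
    pub r: u8,
    pub g: u8,
    pub b: u8,
    pub a: u8,
}

/// Bytes occupied by a contiguous buffer of `len` elements of type `T`.
pub fn buffer_bytes<T>(len: usize) -> usize {
    len * size_of::<T>()
}
```

    With these definitions, buffer_bytes::<PixelWide>(1_000_000) comes to 16,000,000 bytes versus 4,000,000 for PixelNarrow. That difference lives in the data structure, which is where the field type (again chosen by domain) can matter for cache behaviour, not in local variables.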

    Related: Am I casting integer types too much?