rust ownership

Is a shared reference in Rust cheaper than transferring ownership?


As far as I understand, transferring ownership in Rust is a more expensive operation than passing a value by shared or exclusive reference.

As far as I understand from "What is Ownership?" in the Rust book, when passing ownership we copy the structure containing the pointer, length and capacity. When passing, for example, a shared reference, we are essentially passing just a 64-bit number (depending on the architecture), because the reference is an abstraction over a pointer, and a pointer is a number with additional information.

Of course, sometimes transferring ownership is the only way to convey meaning, but the question still arose.

I'd like to know how right I am, if at all.


Solution

  • As you already pointed out, it's important to differentiate between concepts and what actually happens in the machine. These two are obviously related, but it's helpful not to get hung up on one or the other. In my personal experience, people tend to want to think about Rust's ownership model in terms of what happens in the machine, especially when it comes to "optimizations".

    Passing ownership can be a more expensive operation than passing a reference. However, the compiler is free to avoid expensive operations as long as it adheres to the concept. For example:

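    // Borrows the elements; the caller keeps ownership of the Vec.
    fn takes_a_slice(value: &[i32]) {
        println!("borrowed {} elements", value.len());
    }
    
    // Takes ownership; the Vec is dropped when this function returns.
    fn takes_a_vec(value: Vec<i32>) {
        println!("owned {} elements", value.len());
    }
    
    fn main() {
        let v = vec![1, 2, 3];
    
        takes_a_slice(&v); // v is only borrowed and stays usable
    
        takes_a_vec(v); // v is moved; using it after this line is a compile error
    }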
    

    In the example above, takes_a_slice() is passed a shared reference, which is about as cheap an object as one can have (for a slice, a pointer plus a length; see below). takes_a_vec() is passed ownership, which conceptually requires moving the Vec into takes_a_vec(), basically a pointer and two usize values, which can be thought of as "more expensive". However, that doesn't mean that takes_a_vec() is actually more expensive in a real machine: the compiler is free to apply any "as-if" optimization it wants, especially avoiding useless copies and inlining takes_a_vec() into main(). When that happens, takes_a_vec() can access its argument in exactly the memory location where main() put it in the first place, so nothing actually gets moved in the machine, while we still adhere to the concept: it is invalid to call takes_a_slice(&v) after takes_a_vec() has been called, because v has conceptually moved away; it doesn't matter that it didn't get copied around in the actual machine. Likewise, the compiler is free to reuse the stack space previously occupied by v once takes_a_vec(v) has been called, because we know that ownership of v was moved out of main(); whatever happens to value inside takes_a_vec(), once we return to main(), the place where v used to be stored must have been invalidated.
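
    To make the "pointer and two usize values" point concrete, here is a minimal sketch. The helper buffer_addr_after_move() is a hypothetical name introduced only for this illustration. It compares the sizes of the handles involved and checks that moving a Vec leaves its heap buffer where it was; only the small handle is conceptually copied.

    ```rust
    use std::mem::size_of;

    // Hypothetical helper: takes ownership of `v` and reports the address
    // of its heap buffer as seen from inside the callee.
    fn buffer_addr_after_move(v: Vec<i32>) -> usize {
        v.as_ptr() as usize
    }

    fn main() {
        // A Vec handle is a (pointer, capacity, length) triple: three words.
        println!("Vec<i32>:  {} bytes", size_of::<Vec<i32>>());
        // A reference to the Vec itself is a thin pointer: one word.
        println!("&Vec<i32>: {} bytes", size_of::<&Vec<i32>>());
        // A slice reference carries a pointer and a length: two words.
        println!("&[i32]:    {} bytes", size_of::<&[i32]>());

        let v = vec![1, 2, 3];
        let before = v.as_ptr() as usize;
        // Moving `v` transfers the handle; the elements on the heap stay put.
        let after = buffer_addr_after_move(v);
        assert_eq!(before, after);
    }
    ```

    Note that &[i32] is two words wide, so even at the conceptual level the difference from moving a Vec is a single word. Whether any of these words are actually copied is up to the optimizer.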

    Another example, on the other side of the spectrum: passing a bool by value or via a &bool reference is almost guaranteed to make no difference. ABI requirements, cache-line widths, register pressure and a whole lot of other reasons will invalidate any gain that could possibly arise from passing "just a single byte" vs. passing "an 8-byte wide pointer to a single byte". However, it does make a huge difference conceptually whether a callee gets a bool or a &bool, because a bool is an independent copy while a &bool guarantees that the value can't change. For simple examples, the compiler will generate exactly the same code, though.
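
    The conceptual difference between bool and &bool can be sketched as follows. The functions negate_by_value() and negate_by_ref() are hypothetical names used only for this illustration, and the sketch makes no claim about the machine code a particular compiler emits.

    ```rust
    use std::mem::size_of;

    // Hypothetical helper: receives its own independent copy of the flag.
    fn negate_by_value(flag: bool) -> bool {
        !flag
    }

    // Hypothetical helper: reads the caller's flag through a shared reference.
    fn negate_by_ref(flag: &bool) -> bool {
        !*flag
    }

    fn main() {
        // One byte for the value, one pointer-sized word for the reference.
        println!("bool:  {} byte(s)", size_of::<bool>());
        println!("&bool: {} byte(s)", size_of::<&bool>());

        let mut flag = true;
        let copy = flag; // bool is Copy, so `flag` remains usable
        flag = false; // changing `flag` does not affect `copy`
        assert!(copy);

        // Both calling conventions compute the same kind of result.
        assert!(!negate_by_value(copy));
        assert!(negate_by_ref(&flag));
    }
    ```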

    Specifically about passing references: yes, you are correct. A shared/exclusive reference is an abstraction over a pointer that conveys "more" meaning than just a raw pointer; all references are pointers, but not all pointers are references. In fact, even raw pointers (*const, *mut) have more meaning than just a naked address in memory, because they carry provenance. In the Rust Abstract Machine (which isn't completely specified as of now), whenever we use a reference, the Abstract Machine can use the knowledge that references always point to a fully initialized value, and the knowledge that the pointer beneath the reference cannot possibly point outside its original allocation. In the concrete machine (like x86-64), the entire concept boils down to addresses, which are plain integers.
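
    A small sketch of the "references are pointers with extra guarantees" idea, assuming a thin reference such as &i32. The helper address_of() is a hypothetical name introduced for this illustration. One observable consequence of the extra guarantees is that a reference can never be null, which lets Option<&i32> fit in the same space as &i32.

    ```rust
    use std::mem::size_of;

    // Hypothetical helper: exposes the address behind a shared reference
    // as a plain integer.
    fn address_of(r: &i32) -> usize {
        r as *const i32 as usize
    }

    fn main() {
        let x = 42;
        let r: &i32 = &x;
        let p: *const i32 = r; // a reference coerces to a raw pointer

        // Same address, whether viewed as a reference or as a raw pointer.
        assert_eq!(address_of(r), p as usize);

        // In the concrete machine, both are one pointer-sized word.
        assert_eq!(size_of::<&i32>(), size_of::<*const i32>());
        assert_eq!(size_of::<&i32>(), size_of::<usize>());

        // References are never null, so None can reuse the null bit pattern.
        assert_eq!(size_of::<Option<&i32>>(), size_of::<&i32>());
    }
    ```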