
Ownership tracking in Rust: Difference between Box<T> (heap) and T (stack)


While experimenting with Rust, I found that the compiler tracks a move out of a field of a struct on the stack very precisely (it knows exactly which field has been moved). However, when I put one part of the structure into a Box (i.e. onto the heap), the compiler no longer tracks field-level moves for anything behind the dereference of the box. It assumes that the whole structure "inside the box" has been moved. Let's first see an example where everything is on the stack:

struct OuterContainer {
    inner: InnerContainer
}

struct InnerContainer {
    val_a: ValContainer,
    val_b: ValContainer
}

struct ValContainer {
    i: i32
}


fn main() {
    // Note that the whole structure lives on the stack.
    let structure = OuterContainer {
        inner: InnerContainer {
            val_a: ValContainer { i: 42 },
            val_b: ValContainer { i: 100 }
        }
    };

    // Move just one field (val_a) of the inner container.
    let move_me = structure.inner.val_a;

    // We can still borrow the other field (val_b).
    let borrow_me = &structure.inner.val_b;
}

And now the same example but with one minor change: We put the InnerContainer into a box (Box<InnerContainer>).

struct OuterContainer {
    inner: Box<InnerContainer>
}

struct InnerContainer {
    val_a: ValContainer,
    val_b: ValContainer
}

struct ValContainer {
    i: i32
}


fn main() {
    // Note that only the outer structure (holding the Box pointer) lives on the stack.
    let structure = OuterContainer {
        inner: Box::new(InnerContainer {
            val_a: ValContainer { i: 42 },
            val_b: ValContainer { i: 100 }
        })
    };

    // Move just one field (val_a) of the inner container.
    // Note that now, the inner container lives on the heap.
    let move_me = structure.inner.val_a;

    // We can no longer borrow the other field (val_b).
    let borrow_me = &structure.inner.val_b; // error: "value used after move"
}

I suspect that it has something to do with the nature of the stack vs. the nature of the heap, where the former is static (per stack frame at least) and the latter is dynamic. Maybe the compiler needs to play it safe for some reason I cannot articulate or identify well enough.


Solution

  • In the abstract, a struct on the stack is kind of just a bunch of variables under a common name. The compiler knows this, and can break a structure into a set of otherwise independent stack variables. This lets it track the movement of each field independently.

    It can't do that with a Box, or any other kind of custom allocation, because the compiler doesn't control Boxes. Box is, for the most part, just some code in the standard library, not an intrinsic part of the language. Box has no way of reasoning about different parts of itself suddenly becoming invalid. When it comes time to destroy a Box, its Drop implementation only knows to destroy everything.

    To put it another way: on the stack, the compiler is in full control, and can thus do fancy things like breaking structures up and moving them piecemeal. As soon as custom allocation enters the picture, all bets are off, and the compiler has to back off and stop trying to be clever.
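
    Following that reasoning, one way around the error is to move the boxed value back onto the stack first, so that the compiler can track its fields individually again. The snippet below is a minimal sketch of that idea, not the only possible fix. It reuses the struct definitions from the question; the helper name take_a_and_read_b is hypothetical and exists only for illustration.

```rust
struct ValContainer {
    i: i32,
}

struct InnerContainer {
    val_a: ValContainer,
    val_b: ValContainer,
}

struct OuterContainer {
    inner: Box<InnerContainer>,
}

// Hypothetical helper (not from the original post): moves val_a out of the
// boxed container and still reads val_b afterwards.
fn take_a_and_read_b(structure: OuterContainer) -> (ValContainer, i32) {
    // Dereferencing the Box by value moves the whole InnerContainer out of
    // the heap allocation and into a local (stack) variable.
    let inner: InnerContainer = *structure.inner;

    // From here on, `inner` is a plain stack value, so field-level move
    // tracking applies exactly as in the first example of the question.
    let move_me = inner.val_a;
    let borrow_me = &inner.val_b;

    (move_me, borrow_me.i)
}

fn main() {
    let structure = OuterContainer {
        inner: Box::new(InnerContainer {
            val_a: ValContainer { i: 42 },
            val_b: ValContainer { i: 100 },
        }),
    };

    let (moved, b) = take_a_and_read_b(structure);
    println!("moved val_a = {}, val_b = {}", moved.i, b);
}
```

    The trade-off is that the box is consumed: after `*structure.inner`, the heap allocation is gone and the data lives in `inner`. If both fields are needed as owned values, a destructuring pattern such as `let InnerContainer { val_a, val_b } = *structure.inner;` expresses the same move in one step.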