Stack and heap misunderstanding in Swift


I've always known that reference type variables are stored in the heap while value type variables are stored in the stack. Recently, I found a picture that says that ints, doubles, strings, etc. are value types, while functions and closures are reference types.

Now I'm really confused. So where are ints, doubles, strings, etc. kept when they are defined inside a class, aka a reference type? At the same time, where are functions and closures kept when defined inside a struct, aka a value type?


Solution

  • I've always known that reference type variables are stored in the heap while value type variables are stored in the stack.

    This is only partially true in Swift. In general, Swift makes no guarantees about where objects and values are stored, except that:

    1. Reference types have a stable location in memory, so that all references to the same object point to exactly the same place, and
    2. Value types are not guaranteed to have a stable location in memory, and can be copied arbitrarily as the compiler sees fit.

    Technically, this means that objects can be stored on the stack if the compiler can prove that an object is created and destroyed within the same stack frame with no escaping references to it. In practice, though, you can assume that all objects are allocated on the heap.
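
    The two guarantees above are observable from code even though the storage location is not. The following is a minimal sketch; Box and Pair are hypothetical names made up for the illustration:

```swift
// Hypothetical reference type: every reference points at the same object.
final class Box {
    var value: Int
    init(value: Int) { self.value = value }
}

// Hypothetical value type: assignment produces an independent copy.
struct Pair {
    var value: Int
}

let box1 = Box(value: 1)
let box2 = box1            // copies the reference, not the object
box2.value = 42            // visible through box1 as well

let pair1 = Pair(value: 1)
var pair2 = pair1          // copies the value itself
pair2.value = 42           // pair1 is unaffected

print(box1 === box2, box1.value, pair1.value, pair2.value)
```

    The identity operator === only exists for class instances, which fits the first guarantee: an object has one stable location that all references share, while a struct value has no identity to compare.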

    For value types, the story is a little more complicated:

    So where are ints, doubles, strings, etc. kept when they are defined inside a class, aka a reference type?

    This is an excellent question that gets at the heart of what a value type is. One way to think of a value type's storage is that it is inline, wherever it needs to be. Imagine a Point struct:

    struct Point {
        var x: Double
        var y: Double
    }
    

    Ignoring for a second the fact that Point itself is a struct, where are x and y stored relative to Point? Inline, wherever Point goes:

    ┌───────────┐
    │   Point   │
    ├─────┬─────┤
    │  x  │  y  │
    └─────┴─────┘
    

    When you need to store a Point, the compiler ensures that you have enough space to store both x and y, usually one immediately following the other. If a Point is stored on the stack, then x and y are stored on the stack, one after the other; if Point is stored on the heap, then x and y live on the heap as part of Point. Wherever Swift places a Point, it always ensures you have enough space, and when you assign to x and y, they are written to that space. It doesn't terribly matter where that is.
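
    You can inspect this layout with the standard library's MemoryLayout type. This is a small sketch rather than a statement about what the compiler must do; the constant names are only for illustration:

```swift
struct Point {
    var x: Double
    var y: Double
}

// A Point occupies exactly the space of its two Doubles.
let pointSize = MemoryLayout<Point>.size
let doubleSize = MemoryLayout<Double>.size

// Byte offsets of each stored property from the start of the value.
let xOffset = MemoryLayout<Point>.offset(of: \Point.x)
let yOffset = MemoryLayout<Point>.offset(of: \Point.y)

print("Point:", pointSize, "bytes; Double:", doubleSize, "bytes")
print("x offset:", xOffset ?? -1, "y offset:", yOffset ?? -1)
```

    These numbers describe the shape of a Point, not where it lives. The same layout applies whether a given Point ends up on the stack, in a register, or inside a heap-allocated object.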

    And what about when Point is part of another object? For example:

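    class Location {
        var name: String
        var point: Point
        init(name: String, point: Point) { // classes get no memberwise initializer
            self.name = name
            self.point = point
        }
    }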
    

    Then Point is also laid out inline wherever it is stored, and its values are laid out inline as well:

    ┌──────────────────────┐
    │       Location       │
    ├──────────┬───────────┤
    │          │   Point   │
    │   name   ├─────┬─────┤
    │          │  x  │  y  │
    └──────────┴─────┴─────┘
    

    In this case, when you create a Location object, the compiler ensures that there's enough space to store a String and two Doubles, and lays them out one after another. Where that is, again, doesn't matter, but in this case, it's all on the heap (because Location is a reference type, which happens to contain values).
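
    A short sketch of what that means in practice, assuming the Point and Location types above (the initializer and the variable names are added for the illustration):

```swift
struct Point {
    var x: Double
    var y: Double
}

final class Location {
    var name: String
    var point: Point
    init(name: String, point: Point) {
        self.name = name
        self.point = point
    }
}

let home = Location(name: "Home", point: Point(x: 1, y: 2))
let alias = home             // second reference to the same object

// Writes into the Point stored inline in the one shared Location object.
alias.point.x = 10

// Reading the property copies the two Doubles out into a new Point.
var detached = home.point
detached.x = 99              // does not touch the Point inside home

print(home.point.x, detached.x)
```

    The Point inside a Location is shared only because the Location that contains it is shared. Once copied out, it behaves like any other value.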


    As for the other way around, object storage has two components:

    1. The variable you use to access the object, and
    2. The actual storage for the object

    Let's say that we changed Point from being a struct to being a class. Whereas before, Location stored the contents of Point directly, it now stores only a reference to the Point's actual storage in memory:

    ┌──────────────────────┐      ┌───────────┐
    │       Location       │ ┌───▶│   Point   │
    ├──────────┬───────────┤ │    ├─────┬─────┤
    │   name   │   point ──┼─┘    │  x  │  y  │
    └──────────┴───────────┘      └─────┴─────┘
    

    Before, when Swift laid out space to create a Location, it was storing one String and two Doubles; now, it stores one String and one pointer to a Point. Unlike in languages like C or C++, you don't need to be aware that Location.point is now a pointer, and it doesn't change how you access the object; but under the hood, the size and "shape" of Location have changed.
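
    The change in shape can be made visible with MemoryLayout. The sketch below rests on some assumptions. MemoryLayout applied to a class type reports the size of the reference, not of the instance, so two hypothetical structs (InlineFields and ReferenceFields) stand in for the stored properties of the two versions of Location. PointObject is also a made-up name for the class version of Point:

```swift
struct Point {
    var x: Double
    var y: Double
}

// Hypothetical class version of Point.
final class PointObject {
    var x: Double
    var y: Double
    init(x: Double, y: Double) {
        self.x = x
        self.y = y
    }
}

// Stand-ins for Location's stored properties in each version.
struct InlineFields {
    var name: String
    var point: Point          // two Doubles stored inline
}

struct ReferenceFields {
    var name: String
    var point: PointObject    // a single pointer-sized reference
}

let pointerSize = MemoryLayout<UnsafeRawPointer>.size
let referenceSize = MemoryLayout<PointObject>.size
let inlineFieldsSize = MemoryLayout<InlineFields>.size
let referenceFieldsSize = MemoryLayout<ReferenceFields>.size

print("reference:", referenceSize, "pointer:", pointerSize)
print("inline fields:", inlineFieldsSize, "reference fields:", referenceFieldsSize)
```

    On typical 64-bit platforms the version holding a reference comes out smaller, because one pointer replaces two inline Doubles. The Doubles still exist, just in a separate allocation.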

    The same goes for storing all other reference types, including closures. A variable holding a closure is largely just a pointer to some metadata for the closure, and a way to execute the closure's code (though the specifics of this are out of scope for this answer):

    ┌───────────────────────────────┐     ┌───────────┐
    │           MyStruct            │     │  closure  │
    ├─────────┬─────────┬───────────┤ ┌──▶│  storage  │
    │  prop1  │  prop2  │  closure ─┼─┘   │  + code   │
    └─────────┴─────────┴───────────┘     └───────────┘
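
    One visible consequence is that copying a struct copies the reference to the closure's storage, not the captured state behind it. A minimal sketch, where Counter and makeCounter are hypothetical names chosen for the example:

```swift
// Hypothetical struct with one value-type property and one closure property.
struct Counter {
    var label: String
    var next: () -> Int
}

func makeCounter(label: String) -> Counter {
    var count = 0
    // The closure captures count, which then lives in the closure's own storage.
    return Counter(label: label, next: {
        count += 1
        return count
    })
}

let original = makeCounter(label: "original")
var copy = original          // copies the String and the closure reference
copy.label = "copy"          // independent: label is a value

let first = original.next()  // increments the shared captured count
let second = copy.next()     // sees that increment, because the storage is shared

print(original.label, copy.label, first, second)
```

    The label property is duplicated by the copy, while both structs keep pointing at the same closure storage, matching the diagram above.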