I first started going down this rabbit hole after learning that VLAs (variable-length arrays) are not supported in standard C++. As I understand it, this is because a variable-length array does not have a size that is constant at compile time, and the C++ compiler needs that size in order to compute the size of the stack frame that the function is going to allocate at runtime. I understand that this desire to know the frame size at compile time is why VLAs are prohibited (among other reasons).
However, why is there a desire to know such a thing (the size of a stack frame to be allocated) at compile time? Why can't all of the memory needed for the stack be determined and then allocated at runtime? I've heard it explained that determining the size of the stack frame at compile time can be useful for avoiding stack overflows, since you know exactly how much memory will be needed for a given function call ahead of time. If we were to allow a VLA on the stack whose size was determined by user input at runtime, the user could input a giant number like 100000000000 and then we would get a stack overflow. Prohibiting VLAs saves us from doing something like that (or at least that is my understanding).
That's how it was explained to me, but I still don't understand why that means everything has to be determined at compile time. For example, alloca can be used to allocate on the stack at runtime by simply moving the stack pointer. Why can't this be done internally for each new local variable in a function at runtime, so we can have VLAs, instead of allocating the whole stack frame at once? I feel like the answer is supposed to be "well obviously so you don't end up trying to allocate more memory than you have and cause a stack overflow!", but isn't this a risk no matter what technique you employ? I could just allocate my variable-length array on the heap at runtime, but what if I try to allocate more memory than I have available? In that case, I run into the same issue of biting off more memory than I can chew due to user input, and calculating the memory usage for the program at compile time can do nothing to save me.
So, is the reason this is done at compile time solely for the sake of efficiency? If it is, what makes it more efficient? Is it just the fact that we're calculating how much memory needs to be allocated for the function call ahead of time rather than at runtime, saving us microseconds each time the function is called?
In order to generate instructions that access local variables, the compiler needs to know the offset of each local variable in the stack at compile time, and of course that offset cannot change during execution.
Since the offset of every local variable must be fixed, and since local variables are stored in sequential locations in the stack, the size of every local variable must be fixed.
That's why variable-length arrays (and variable-length anything for that matter) are difficult to implement on the stack in low-level languages that aim to be highly efficient.
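To make the difference concrete, here is a minimal sketch in C99, which (unlike C++) does allow VLAs. The function names are made up for illustration, and it assumes a compiler that supports VLAs (they are optional since C11, though GCC and Clang provide them). For the fixed array the size is a constant the compiler already knows; for the VLA it can only be computed once the function is running.

```c
#include <stddef.h>

/* Fixed-size local array: its size is a compile-time constant,
   so the compiler can reserve the space as part of the frame. */
size_t fixed_bytes(void)
{
    int scratch[16];
    return sizeof scratch;   /* constant: 16 * sizeof(int) */
}

/* C99 VLA: the size depends on the argument, so it is only known
   when the function runs. n must be greater than zero. */
size_t vla_bytes(size_t n)
{
    int scratch[n];
    return sizeof scratch;   /* evaluated at run time: n * sizeof(int) */
}
```

Calling vla_bytes(3) and vla_bytes(300) gives different results from the same compiled code, which is exactly what a fixed frame layout cannot express.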
The function alloca() can be used to allocate on the stack at runtime, and that may actually be how variable-length arrays are implemented by those compilers that go above and beyond the standard and offer variable-length arrays. (I do not really know, I am hypothesizing here.)
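For reference, a rough sketch of what using alloca() looks like. This assumes a platform that ships the non-standard <alloca.h> header (glibc, musl and macOS do); the function name and the size cap are hypothetical and only there for illustration. It is not meant to show how any particular compiler implements VLAs.

```c
#include <alloca.h>  /* non-standard header; assumed to be available */
#include <stddef.h>

enum { MAX_SCRATCH = 1024 };  /* hypothetical cap so a huge n cannot exhaust the stack */

/* Sums 0..n-1 through a scratch buffer carved out of this function's
   stack frame at run time. Returns -1 if n exceeds the cap. */
long sum_with_alloca(size_t n)
{
    if (n > MAX_SCRATCH)
        return -1;

    /* alloca only hands back an address, so the function needs a
       pointer variable (here buf) to remember where the block lives. */
    int *buf = alloca(n * sizeof *buf);

    long total = 0;
    for (size_t i = 0; i < n; i++)
        buf[i] = (int)i;
    for (size_t i = 0; i < n; i++)
        total += buf[i];

    return total;  /* the block is released automatically on return */
}
```

Note that alloca() itself has no way of reporting failure, which is why the sketch checks the size up front.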
However:
- alloca() returns a pointer, which means that we now have to have an extra hidden local variable to store that pointer.
- There is also the sizeof() operator, which must work in a very different way for variable-length arrays than for anything else.
When emitting the code of a function, the compiler generates (among other things) instructions that access local variables.
An instruction accesses a local variable using the offset of that local variable in the stack. (The offset of the variable from the stack pointer or from the base pointer if a base pointer is used.)
For example:
mov ax, [bp-8]
Here, the register ax is being loaded with the contents of the local variable at offset 8 below the base pointer (bp). The base pointer points at the location which was the top of the stack when the function was entered. This instruction makes use of a special addressing mode for accessing memory at a fixed offset from a certain register. The instruction requires that 8 to be a constant, baked into the instruction; it cannot be anything else. (If that 8 were a variable, it would immensely complicate accessing local variables, possibly requiring multiple instructions for each access, with a corresponding performance penalty, and on top of that the compiler would have to be considerably more complicated.)
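The same idea can be mimicked in C with a toy model. This is only an illustration, not what a compiler actually emits: the "frame" is a plain byte buffer, bp is a pointer just past its end, and all names (OFFSET_X, load_fixed, load_variable) are made up. The first function corresponds to the constant 8 in [bp-8]; the second shows the extra arithmetic that would be needed on every access if something of run-time size sat between bp and the variable.

```c
#include <stddef.h>
#include <string.h>

/* Toy model: locals live at negative offsets from bp. */
enum { OFFSET_X = 8 };  /* known at compile time, like the 8 in [bp-8] */

/* Fixed-offset access: the address is simply bp minus a constant. */
int load_fixed(const unsigned char *bp)
{
    int value;
    memcpy(&value, bp - OFFSET_X, sizeof value);
    return value;
}

/* If a variable-length object of vla_bytes bytes sat between bp and
   the variable, the offset would have to be computed at run time
   before every access. */
int load_variable(const unsigned char *bp, size_t vla_bytes)
{
    int value;
    memcpy(&value, bp - vla_bytes - OFFSET_X, sizeof value);
    return value;
}
```

In the first function the whole address computation folds into one constant; in the second, the subtraction depends on a value that only exists while the program runs.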