What affects the placement of variables in memory?


I am playing with a buffer overflow in C. I have the following code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int foo(void*, void*);        // Calculates the distance (in bytes) between two addresses in memory

int main(int argc, char** argv) {
   int a = 15;
   int b = 16;
   int c = 90;

   char buffer[4] = "";       // Start as an empty string so the [BEFORE] print is well-defined
   
   /* Memory layout ("string" in the labels below refers to buffer) */
   printf("[LAYOUT]\n");
   printf("foo(&a, &b) is %d\n", foo(&a, &b));
   printf("foo(&a, &c) is %d\n", foo(&a, &c));
   printf("foo(&a, &string) is %d\n\n", foo(&a, buffer));

   /* Memory content before copying into the buffer */
   printf("[BEFORE]\n");
   printf("a is at %p and is %d (0x%08x)\n", (void*)&a, a, a);
   printf("b is at %p and is %d (0x%08x)\n", (void*)&b, b, b);
   printf("c is at %p and is %d (0x%08x)\n", (void*)&c, c, c);
   printf("string is at %p and is %s\n\n", (void*)buffer, buffer);

   /* Deliberate overflow: writes 10 bytes (9 'a' plus NUL) into a 4-byte buffer */
   strcpy(buffer, "aaaaaaaaa");

   /* Memory content after copying into the buffer */
   printf("[AFTER]\n");
   printf("a is at %p and is %d (0x%08x)\n", (void*)&a, a, a);
   printf("b is at %p and is %d (0x%08x)\n", (void*)&b, b, b);
   printf("c is at %p and is %d (0x%08x)\n", (void*)&c, c, c);
   printf("string is at %p and is %s\n", (void*)buffer, buffer);

   return EXIT_SUCCESS;
}

int foo(void* addr_1, void* addr_2) {
   /* Cast to char* so the difference is in bytes; arithmetic on void* is only a GNU extension */
   return (int)((char*)addr_1 - (char*)addr_2);
}

Compiled with optimization turned off, using gcc main.c -o main -O0 -g -fno-stack-protector -D_FORTIFY_SOURCE=0, the program prints the following on my machine:

[LAYOUT]
foo(&a, &b) is 4
foo(&a, &c) is 8
foo(&a, &string) is 12

[BEFORE]
a is at 0x7ffee13d5b68 and is 16 (0x00000010)
b is at 0x7ffee13d5b64 and is 15 (0x0000000f)
c is at 0x7ffee13d5b60 and is 90 (0x0000005a)
string is at 0x7ffee13d5b5c and is 

[AFTER]
a is at 0x7ffee13d5b68 and is 16 (0x00000010)
b is at 0x7ffee13d5b64 and is 97 (0x00000061)
c is at 0x7ffee13d5b60 and is 1633771873 (0x61616161)
string is at 0x7ffee13d5b5c and is aaaaaaaaa

Evidently, the buffer sits at the lowest address, below the integer variables. I picture the memory after the copy as:

0x5c 0x5d 0x5e 0x5f 0x60 0x61 0x62 0x63 0x64 0x65
0x61 0x61 0x61 0x61 0x61 0x61 0x61 0x61 0x61 0x00

It completely overwrites c (all four bytes) and the low-order byte of b; the terminating NUL lands on b's second byte, which was already zero (little-endian machine).
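The byte arithmetic above can be checked without invoking undefined behaviour by modelling the stack region as a plain byte array. The following is only a sketch: the names store_le32, load_le32 and simulate_overflow are hypothetical, and the offsets (buffer at 0, c at 4, b at 8) are an assumption taken from the -O0 output above, not something the compiler guarantees.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value least-significant byte first (little-endian). */
static void store_le32(unsigned char *p, uint32_t v) {
    for (int i = 0; i < 4; i++)
        p[i] = (unsigned char)(v >> (8 * i));
}

/* Read a 32-bit value back from little-endian bytes. */
static uint32_t load_le32(const unsigned char *p) {
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)
        v |= (uint32_t)p[i] << (8 * i);
    return v;
}

/* Assumed model of the -O0 layout: buffer at offset 0, c at 4, b at 8.
   Copies the 10-byte string (9 'a' plus NUL) over the start of the
   region and reports what c and b hold afterwards. */
void simulate_overflow(uint32_t *c_after, uint32_t *b_after) {
    unsigned char mem[12] = {0};
    const char payload[] = "aaaaaaaaa";       /* sizeof payload == 10 */

    store_le32(mem + 4, 90);                  /* c */
    store_le32(mem + 8, 15);                  /* b, as shown in the output */

    assert(sizeof payload <= sizeof mem);     /* stay inside the model */
    memcpy(mem, payload, sizeof payload);     /* the bytes strcpy would write */

    *c_after = load_le32(mem + 4);
    *b_after = load_le32(mem + 8);
}
```

Calling simulate_overflow(&c, &b) leaves c equal to 0x61616161 and b equal to 0x00000061, matching the [AFTER] values printed by the real program.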

Compiling the same program with optimization turned on (-O1, for example) produces this output:

[LAYOUT]
foo(&a, &b) is -4
foo(&a, &c) is -8
foo(&c, &string) is 12
foo(&a, &string) is 4

[BEFORE]
a is at 0x7ffee056db3c and is 16 (0x00000010)
b is at 0x7ffee056db40 and is 15 (0x0000000f)
c is at 0x7ffee056db44 and is 90 (0x0000005a)
string is at 0x7ffee056db38 and is 

[AFTER]
a is at 0x7ffee056db3c and is 1633771873 (0x61616161)
b is at 0x7ffee056db40 and is 97 (0x00000061)
c is at 0x7ffee056db44 and is 90 (0x0000005a)
string is at 0x7ffee056db38 and is aaaaaaaaa

It seems the integer variables are now placed in memory in the reverse order. To keep the overflow from clobbering the integers, I can declare the buffer first and turn optimization off again (gcc main.c -o main -O0 -g -fno-stack-protector -D_FORTIFY_SOURCE=0), like so:

int main(int argc, char** argv) {
   char buffer[4];

   int a = 15;
   int b = 16;
   int c = 90;
   ...
}

which places the buffer at a higher address, above the integer variables (keeping in mind that the stack grows toward lower addresses).

The questions are:

  1. Does the optimization flag (-O1 in this case) affect the order of variables in memory?
  2. With optimization turned off, are variables placed in memory in the reverse of the order in which they are defined in the C source?

Solution

  • The compiler places variables in memory in whatever order is most convenient; the C standard says nothing about the relative placement of separate variables. (That differs from the members of a struct, which must be placed in declaration order, although each member may be followed by unspecified padding.)

    The compiler's decision about variable placement is likely to vary based on optimisation settings, and other compilation options. At some optimisation settings, it may avoid allocating any memory for a variable whose address is never directly used. In such cases, your decision to use a variable's address (even just to print it) might affect the placement of other variables, something you might want to think about.

    When choosing a placement order, a compiler will typically take into account its own inferences about usage patterns, trying to optimise locality of reference, cachability, and other artefacts of the target machine's memory architecture. It is not likely to take into account the convenience of programmers trying to write buffer overflow exploits.
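To illustrate the struct remark, here is a minimal sketch that groups the question's variables into one struct so that their relative order no longer depends on compiler options. The struct name locals, the member order and the helper members_in_declaration_order are assumptions made for the example, not part of the original program. Writing past the end of buffer is still undefined behaviour; the sketch only shows that the order of the members is fixed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical grouping of the question's locals into a single object. */
struct locals {
    char buffer[4];
    int  c;
    int  b;
    int  a;
};

/* The first member always sits at offset 0, and later members have
   increasing offsets; only the amount of padding is unspecified. */
static_assert(offsetof(struct locals, buffer) == 0,
              "first member starts the struct");
static_assert(offsetof(struct locals, c) >= sizeof(char[4]),
              "c starts at or after the end of buffer");

/* Returns 1 when c, b and a follow buffer in declaration order. */
int members_in_declaration_order(void) {
    return offsetof(struct locals, buffer) < offsetof(struct locals, c)
        && offsetof(struct locals, c)      < offsetof(struct locals, b)
        && offsetof(struct locals, b)      < offsetof(struct locals, a);
}
```

Declaring a single struct locals variable in main and accessing its members should give the same relative layout at -O0 and -O1, although the compiler remains free to place the struct itself wherever it likes.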