c, pointers, scope, c-strings, storage-duration

C pointers: Function returns a pointer defined in its body


I'm learning C using the book C Programming: A Modern Approach, and I have some doubts about pointers and about referring to out-of-scope variables. I've put together three examples to illustrate them.

  1. First, we have this:

    char *f() {
        char p[] = "Hi, I'm a string";
        return p;
    }
    

    It is my understanding that this code is problematic because p is an array of char local to the function f(). When I return p, the array name is converted to a pointer to its first element, but since the array only exists while the function is executing, I end up returning a pointer to an object that is no longer valid (a dangling pointer?). As a result, if I try to use the returned pointer outside the function, I get a segmentation fault.

    Is this correct?

  2. Next I have this:

    char *g() {
        char *p = "Hi, I'm a string";
        return p;
    }
    

    This is very close to the previous case, but it seems to be OK: I can access the string from outside the function, and I don't get a warning from the compiler (as I did in the previous case). I don't understand why it works, though. p is declared as a char pointer, so I'd assume that, when it's initialized, it points to the first char in the string literal, the same as when it is an array. Is that not the case, or is it actually undefined behavior that happens to work out in my specific context?

  3. Finally, I have this:

    char *h() {
        char *p = malloc(17 * sizeof(char));
        strcpy(p, "Hi, I'm a string");
        return p;
    }
    

    Is this somehow different from the previous example? I assume it's not; I'm just allocating the memory manually. This also seems to let me access the entire string from outside the function, but I have the same doubts as with the previous example.

I tested the three functions this way:

int main(int argc, char *argv[]) {
   // This causes a segmentation fault
   printf("%s\n", f());
   // These two work ok
   printf("%s\n", g());
   printf("%s\n", h());
}

Can you help me better understand what's going on here? Thanks in advance!


Solution

  • As you correctly pointed out, the first function returns an invalid pointer: the array declared inside the function has automatic storage duration, so its lifetime ends when the function returns. Dereferencing such a pointer results in undefined behavior; a segmentation fault is one possible outcome, but it is not guaranteed.

    Note that initializing the array with a string literal copies the characters of the literal, including its terminating zero character '\0', into the memory allocated for the array.
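
    The copy described above can be observed with a small sketch. The function names array_size and first_char_after_edit are hypothetical, introduced only for illustration. Both functions return plain values rather than pointers into the local array, so nothing dangles.

```c
#include <stddef.h>

/* Size in bytes of an array initialized from the literal:
   16 visible characters plus the terminating '\0'. */
size_t array_size(void) {
    char p[] = "Hi, I'm a string";
    return sizeof p;
}

/* The array is a private, writable copy of the literal, so changing it
   is allowed and does not touch the string literal itself. */
char first_char_after_edit(void) {
    char p[] = "Hi, I'm a string";
    p[0] = 'h';
    return p[0];   /* returns a char by value, not a pointer into p */
}
```

    Here array_size() yields 17, which is the same count passed to malloc in the third function.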

    The second function returns a pointer to the first character of a string literal. String literals have static storage duration, so the literal is still alive after the function returns, and the returned pointer is valid.
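
    The same storage-duration reasoning suggests one possible way to make the first function return a valid pointer: give the array static storage duration. This is only a sketch and was not part of the question; the name f_static is hypothetical.

```c
/* Hypothetical variant of f: the array has static storage duration,
   so it still exists after the function returns. */
char *f_static(void) {
    static char p[] = "Hi, I'm a string";   /* initialized once, before program startup */
    return p;                               /* valid for the whole program run */
}
```

    Unlike a string literal, this array may be modified. The trade-off is that every caller shares the same array, so a change made through one returned pointer is visible to all later callers.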

    Bear in mind that although string literals in C have non-constant character array types, you may not modify a string literal; any attempt to do so results in undefined behavior. So it would be better to declare the second function the following way:

    const char *g( void ) {
        const char *p = "Hi, I'm a string";
        return p;
    }
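
    As a rough illustration of what the const-qualified version buys you, the sketch below uses the returned pointer after g has returned. The helper greeting_length is a hypothetical name, not something from the question.

```c
#include <stddef.h>
#include <string.h>

const char *g(void) {
    const char *p = "Hi, I'm a string";
    return p;
}

/* Uses the pointer after g has returned. This is valid because the
   string literal has static storage duration. */
size_t greeting_length(void) {
    const char *s = g();
    /* s[0] = 'h';   would be rejected by the compiler: s points to const char */
    return strlen(s);
}
```

    With the const qualifier, an accidental write through the pointer becomes a compile-time error instead of undefined behavior at run time.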
    

    The third function allocates memory dynamically. That memory is not freed until the function free is called, so the returned pointer is valid.

    To free the allocated memory once you are done with it, you could write in main:

    char *p = h();
    
    if ( p != NULL ) printf("%s\n", p);
    
    free( p );
    

    The function h itself should check the result of malloc before copying, so it should be defined the following way:

    char *h( void ) {
        char *p = malloc(17 * sizeof(char));
        if ( p != NULL ) strcpy(p, "Hi, I'm a string");
        return p;
    }
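
    The hard-coded 17 above has to be kept in sync with the string by hand. One possible variant, shown only as a sketch, lets the compiler compute the size; the names h_sized and text are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of h: the buffer size is derived from the
   string itself instead of being written as a magic number. */
char *h_sized(void) {
    static const char text[] = "Hi, I'm a string";
    char *p = malloc(sizeof text);                 /* sizeof counts the '\0' too */
    if (p != NULL) memcpy(p, text, sizeof text);   /* copy only if allocation succeeded */
    return p;                                      /* the caller must call free on it */
}
```

    The caller uses it exactly like h: check the result against NULL, use the string, and pass the pointer to free when finished.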