ccstring

Cannot modify C string


Consider the following code.

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char * test = "abcdefghijklmnopqrstuvwxyz";
    test[5] = 'x';
    printf("%s\n", test);
    return EXIT_SUCCESS;
}

I expected this to print abcdexghijklmnopqrstuvwxyz. However, it just terminates without printing anything.

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char * test = "abcdefghijklmnopqrstuvwxyz";
    printf("%s\n", test);
    return EXIT_SUCCESS;
}

This, however, works just fine. Did I misunderstand how C strings are supposed to be manipulated? In case it matters, I'm running Mac OS X 10.6 and compiling a 32-bit binary.


Solution

  • This answer is good, but not quite complete.

    char * test = "abcdefghijklmnopqrstuvwxyz";
    

    A string literal refers to an anonymous array object of type char[N] with static storage duration (meaning it exists for the entire execution of the program), where N is the length of the string plus one for the terminating '\0'. This object is not const, but any attempt to modify it has undefined behavior. (An implementation can make string literals writable if it chooses, but most modern compilers do not.)

    The declaration above creates such an anonymous object of type char[27], and uses the address of that object's first element to initialize test. Thus an assignment like test[5] = 'x' attempts to modify the array, and has undefined behavior; typically it will crash your program. (The initialization uses the address because the literal is an expression of array type, which is implicitly converted in most contexts to a pointer to the array's first element.)
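
    As a small illustration of the two points above, the following sketch checks the size of the anonymous array and reads through a pointer to it. The function names literal_size and sixth_char are made up for this example; reading is well defined, only writing is not.

```c
#include <stddef.h>

/* Size in bytes of the anonymous array behind the literal:
   26 letters plus the terminating '\0'. sizeof is one of the
   contexts where the array is not converted to a pointer. */
size_t literal_size(void) {
    return sizeof "abcdefghijklmnopqrstuvwxyz";
}

/* Reading through a pointer to the literal is fine.
   Index 5 is the sixth character, 'f'. */
char sixth_char(void) {
    const char *test = "abcdefghijklmnopqrstuvwxyz";
    return test[5];
}
```

    Here literal_size() yields 27 and sixth_char() yields 'f'.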

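    Note that in C++, string literals are actually const (their type is const char[N]), and the above declaration is ill-formed as of C++11; earlier C++ standards allowed it only as a deprecated conversion. In either C or C++, it's best to declare test as a pointer to const char: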

    const char *test = "abcdefghijklmnopqrstuvwxyz";
    

    so the compiler will warn you if you attempt to modify the array via test.
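
    A minimal sketch of how the const-qualified declaration is typically used: the pointer is fine for anything that only reads the string. The helper names count_char and demo_const are hypothetical, chosen for this example only.

```c
#include <stddef.h>

/* Counts how often c occurs in s. The const qualifier documents
   that this function never writes through s. */
size_t count_char(const char *s, char c) {
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (*s == c) {
            n++;
        }
    }
    return n;
}

size_t demo_const(void) {
    const char *test = "abcdefghijklmnopqrstuvwxyz";
    /* test[5] = 'x';
       Uncommenting the line above is a constraint violation:
       the compiler must issue a diagnostic instead of letting
       the bug surface at run time. */
    return count_char(test, 'f');
}
```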

    (C string literals are not const for historical reasons. Before the 1989 ANSI C standard, the const keyword did not exist. Requiring it to be used in declarations like yours would have made for safer code, but it would have required existing code to be modified, something the ANSI committee tried to avoid. You should pretend that string literals are const, even though they aren't. If you happen to be using gcc, the -Wwrite-strings option will cause the compiler to treat string literals as const -- which makes gcc non-conforming.)

    If you want to be able to modify the string that test refers to, you can define it like this:

    char test[] = "abcdefghijklmnopqrstuvwxyz";
    

    The compiler looks at the initializer to determine how big test needs to be. In this case, test will be of type char[27]. The string literal still refers to an anonymous mostly-read-only array object, but its value is copied into test. (A string literal in an initializer used to initialize an array object is one of the contexts in which an array does not "decay" to a pointer; the others are when it's the operand of unary & or sizeof.) Since there are no further references to the anonymous array, the compiler may optimize it away.
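
    Applied to the code in the question, the array form could look roughly like this. The function name modified_alphabet and the out parameter are assumptions made so the result can be inspected; the essential part is the declaration of test as an array.

```c
#include <string.h>

/* Writes the modified alphabet into out, which must have room
   for at least 27 bytes (26 letters plus '\0'). */
void modified_alphabet(char *out) {
    /* test is an array of 27 chars initialized with a copy
       of the literal's contents, so it is writable. */
    char test[] = "abcdefghijklmnopqrstuvwxyz";

    test[5] = 'x';                  /* well defined: modifies the copy */
    memcpy(out, test, sizeof test); /* sizeof test includes the '\0' */
}
```

    After the call, the buffer holds abcdexghijklmnopqrstuvwxyz, which is what the question intended to print.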

    In this case, test itself is an array containing the 26 characters you specified, plus the '\0' terminator. That array's lifetime depends on where test is declared, which may or may not matter. For example, if you do this:

    char *func(void) {
        char test[] = "abcdefghijklmnopqrstuvwxyz";
        return test; /* BAD IDEA */
    }
    

    the caller will receive a pointer to something that no longer exists. If you need to refer to the array outside the scope in which test is defined, you can define it as static, or you can allocate it using malloc:

    char *test = malloc(27);
    if (test == NULL) {
        /* error handling */
    }
    strcpy(test, "abcdefghijklmnopqrstuvwxyz");
    

    so the array will continue to exist until you call free(). The strdup() function does this for you; it has long been defined by POSIX, but ISO C only added it in C23.
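
    Both alternatives can be sketched as functions that return a pointer which stays valid after the call. The names copy_string and static_alphabet are hypothetical; copy_string is a hand-written equivalent of what strdup() does, shown only to make the allocation explicit.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated, writable copy of s, or NULL if the
   allocation fails. The caller is responsible for calling free(). */
char *copy_string(const char *s) {
    size_t n = strlen(s) + 1;   /* +1 for the terminating '\0' */
    char *p = malloc(n);
    if (p == NULL) {
        return NULL;
    }
    memcpy(p, s, n);
    return p;
}

/* static alternative: the array has static storage duration, so
   the returned pointer remains valid. Note that every caller
   shares the same buffer. */
char *static_alphabet(void) {
    static char test[] = "abcdefghijklmnopqrstuvwxyz";
    return test;
}
```

    The malloc version gives each caller its own copy at the cost of having to free it; the static version needs no cleanup, but a modification made by one caller is visible to all others.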

    Note carefully that test may be either a pointer or an array depending on how you declare it. If you pass test to a string function, or to any function that takes a char*, that doesn't matter, but something like sizeof test will behave very differently depending on whether test is a pointer or an array.
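
    The sizeof difference can be made concrete with a short sketch. The function names size_as_array and size_as_pointer are invented for this example.

```c
#include <stddef.h>

size_t size_as_array(void) {
    char test[] = "abcdefghijklmnopqrstuvwxyz";
    return sizeof test;   /* size of the whole array: 27 bytes */
}

size_t size_as_pointer(void) {
    const char *test = "abcdefghijklmnopqrstuvwxyz";
    return sizeof test;   /* size of the pointer itself, commonly 4 or 8 */
}
```

    The first function always yields 27, while the second yields sizeof(char *), which depends on the platform and says nothing about the length of the string.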

    The comp.lang.c FAQ is excellent. Section 8 covers characters and strings, and question 8.5 points to question 1.32, which addresses your specific question. Section 6 covers the often confusing relationship between arrays and pointers.