c string string-literals

Initializing an array of char: is the string used to initialize the array also stored separately in memory, in addition to the array itself?


In the book Understanding and Using C Pointers by Richard Reese, on page 110, the author states:

"An array of char can be initialized using the initialization operator. In the following example, a header array is initialized to the character contained in a string literal:

char header[] = "Media Player";

At a later point it states:

"...The initialization will copy these characters to the array terminated by the NUL character, as illustrated in Figure 5-2, assuming the declaration is located in the main function."

Figure 5-2 (Initializing an array of char) on page 111 shows the header array located on the stack with its memory address, but it also shows the string that was used to initialize the array located in the String Literal Pool at a different memory address.

I find this at odds with what I have read in K.N. King's C Programming: A Modern Approach, 2nd Ed., where on page 281 an array is initialized as follows:

char date1[8] = "June 14";

Later on page 282 it states:

Although "June 14" appears to be a string literal, it's not. Instead, C views it as an abbreviation for an array initializer.

Based on the latter text, and my limited understanding, I think Understanding and Using C Pointers is incorrect in stating that "Media Player" is a string literal when it is used to initialize the array. The book does not appear to be using the term 'string literal' loosely as a synonym for string, because it then goes on to show (incorrectly?) that "Media Player" is located in the String Literal Pool with its own memory address.

Is my understanding correct?


Solution

  • There are two levels on which we can consider what a C program does. The C standard specifies the semantics of a C program using a model of an abstract machine. When a C implementation translates and executes a program, it can implement the program in any way that produces the same observable behavior, which is how the program accesses volatile objects, writes data into files, and interacts with input/output devices. This allows compilers to optimize programs.

    At the abstract level, every string literal causes the creation of an array containing the characters of the string literal, per C 2018 6.4.5 paragraph 6:

    In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence…

    This array that is created and initialized is different from a named array the program might also be initializing. For example, in char header[] = "Media Player";, there is an unnamed array containing the characters of “Media Player” and a null character, and there is the named array header which is initialized by copying the unnamed array into the named array.
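    As a rough illustration of the abstract-level picture, the sketch below contrasts a named array initialized from a string literal with a pointer to a string literal. The helper names arrays_are_independent and header_size are hypothetical, introduced only for this example. It shows what a program can observe (the named array is a separate, modifiable copy of 13 bytes); it cannot show whether a given compiler actually keeps a separate unnamed array in memory for the initializer.

    ```c
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical helper: returns 1 if modifying one array initialized
       from "Media Player" leaves another such array untouched. */
    int arrays_are_independent(void)
    {
        char header[] = "Media Player";  /* named array, holds its own copy */
        char other[]  = "Media Player";  /* second named array, another copy */

        header[0] = 'm';                 /* allowed: header is an ordinary char array */
        return strcmp(header, "media Player") == 0
            && strcmp(other, "Media Player") == 0;
    }

    /* Hypothetical helper: size of the named array, including the
       terminating null character. */
    size_t header_size(void)
    {
        char header[] = "Media Player";
        return sizeof header;            /* 12 characters + 1 null = 13 */
    }

    int main(void)
    {
        char header[] = "Media Player";       /* automatic storage duration */
        const char *literal = "Media Player"; /* points at an unnamed static array */

        printf("sizeof header = %zu\n", header_size());
        printf("independent copies: %d\n", arrays_are_independent());
        /* header is an automatic object and the literal's array is static,
           so the two can never be the same object. */
        printf("distinct objects: %d\n",
               (const void *)header != (const void *)literal);
        return 0;
    }
    ```

    Writing through literal instead of header would be undefined behavior, since the unnamed array created for a string literal must not be modified; that restriction does not apply to header, which is only initialized from such an array.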

    In the actual C implementation, compilers will generally optimize array initializations in various ways depending on context:

    Footnote

    1 We might almost consider this to be identical to the abstract semantics: The object module contains the array created for the string literal, and the array in memory is initialized with it. Thus, there are indeed two arrays, one unnamed array in the object module and a named array in memory. However, in the abstract model, the array created for the string literal is in the memory of the abstract computer. It would, hypothetically, have an address (even though we never access its address). The data in the object module is not in the program’s memory this way, so it is not actually identical to the abstract semantics.