According to the C Standard, the signature of printf() is:
int printf(const char * restrict format, ...);
As I understand, the meaning of restrict is that format will be the only reference to the pointed-to data within the lifetime of the pointer. This enables optimizations for reasons I do not fully grasp. But does this mean that undefined behavior is invoked if I reuse memory for the format string as an argument, even though, as far as I can tell, the format string is not required to be a string literal?
static const char str[] = "%sHello\n";
printf(str, str + 2); // Hello\nHello\n or UB?
I know the implementation might reuse the memory for identical or identically-ending string literals:
"foo" + 1 == "oo"; // Might be true
Does that mean that the following:
printf("%sHello\n", "Hello\n");
might behave in a nonsensical manner if an implementation makes both string literals share memory, thus violating the restrict constraint?
As I understand, the meaning of restrict is that format will be the only reference to the pointed-to data within the lifetime of the pointer.
Not quite. It means that if the format string [1] is modified by any means (through format or otherwise), then printf must access it only through expressions based on format (and therefore the caller must not pass another argument that would result in printf accessing the format string through that argument), per C 2018 6.7.3.1. If nothing in the format string is going to be modified during the printf call, it does not matter what other pointers to it there are.
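As a rough illustration of that last point, here is a minimal sketch of the call from the question, using snprintf (whose format parameter is also restrict-qualified) so the result can be inspected. The helper name format_overlapping is hypothetical. The format string and the argument overlap, but both are only read, and the output buffer is a separate object, so as far as I can tell nothing in 6.7.3.1 is violated:

```c
#include <stdio.h>
#include <stddef.h>

/* Same array as in the question: the argument str + 2 points into it. */
static const char str[] = "%sHello\n";

/* Hypothetical helper: formats into `out` with a format string and an
   argument that share storage. Neither is modified during the call;
   only the distinct buffer `out` is written. Returns what snprintf
   returns (the number of characters the full output requires). */
int format_overlapping(char *out, size_t cap)
{
    return snprintf(out, cap, str, str + 2);
}
```

With a sufficiently large buffer, this should produce Hello\nHello\n (12 characters), which is the first of the two outcomes the question asks about.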
I think the only way this can matter is if you use the n conversion specifier, which says the corresponding argument is a pointer to a signed integer into which is written the number of characters written to the output stream so far by this call to printf. So, if format were not restrict-qualified, you could write these two printf calls:
int n;
memcpy(&n, "x%n", 4);
printf((char *) &n, &n); // Would store 1 in `n`, since “x” had been written.
const char String[] = "Hello, world.\n%n";
int *p = malloc(sizeof String);
memcpy(p, String, sizeof String);
printf((char *) p, p); // Would store int `14` at `p`.
The former would be legal because any object can be read through a character type, which printf presumably does in effect. The latter is legal because the effective type of dynamic memory is malleable, so writing to p as an int after it has been used as a character string is defined.
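The effective-type point can be seen in isolation, without printf involved. This is only a sketch under my reading of the effective type rules (C 2018 6.5), and the function name reuse_allocation_as_int is made up for illustration:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: uses one allocation first as a character string,
   then as an int. Returns the int that was stored, or -1 on failure. */
int reuse_allocation_as_int(void)
{
    const char text[] = "Hello, world.\n%n";
    void *mem = malloc(sizeof text);   /* 17 bytes; assumed >= sizeof(int) */
    if (mem == NULL)
        return -1;

    memcpy(mem, text, sizeof text);    /* the memory now holds characters */
    char first = *(char *) mem;        /* reading as char is always allowed */

    int *ip = mem;                     /* malloc memory is suitably aligned */
    *ip = 14;                          /* the store gives it effective type int */
    int value = *ip;                   /* so reading it back as int is defined */

    free(mem);
    return first == 'H' ? value : -1;
}
```

The same sequence on a declared char array would not be allowed, since a declared object keeps its declared type; it is the allocated storage that makes the int store defined.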
Obviously, it would be problematic if the %n caused a write to memory that printf was possibly yet to use for the format string. Again, this is in the context where we assume the string is not qualified with restrict. Given that it is, the behavior of using %n in this way would not be defined.
(n can also be used with modifiers, as in %hhn to write a char instead of an int, but this does not affect the above analysis.)
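For contrast, here is a sketch of %n and %hhn used the ordinary way, where the object being written is separate from the format string, so the restrict qualification is not an issue. The function names are hypothetical, and I am assuming a C library that supports %n in this position (some restrict or disable it for security reasons):

```c
#include <stdio.h>

/* Hypothetical helper: returns the count stored by %n after the
   14 characters of "Hello, world.\n" have been produced. */
int count_before_n(void)
{
    char out[32];
    int n = -1;                        /* distinct from the format string */
    snprintf(out, sizeof out, "Hello, world.\n%n", &n);
    return n;
}

/* Same idea with the hh modifier, which stores into a signed char. */
int count_before_hhn(void)
{
    char out[32];
    signed char c = -1;
    snprintf(out, sizeof out, "abc%hhn", &c);
    return c;
}
```

On a conforming implementation I would expect these to return 14 and 3 respectively, matching the value in the malloc example above.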
[1] Technically, any object based on format, which is essentially all the elements of the array object that format points to (which includes earlier elements in the array if format points into the middle of an array).