c string char special-characters

Digit after null terminator in C?


TLDR: stuff like

char x[] = "some output \06 MORE OUTPUT";
puts(x);

gives unexpected results.

So I have this code:

#include <stdio.h>

int main(void)
{
    char x[] = "some output \0 MORE OUTPUT";
    puts(x);

    return 0;
}

And it prints: "some output", which is expected, because \0 means the end of the string. But if I add a digit just after the \0, the result changes. So the code

#include <stdio.h>

int main(void)
{
    char x[] = "some output \06 MORE OUTPUT";
    puts(x);

    return 0;
}

prints: some output ♠ MORE OUTPUT

So for some reason, if I add a digit after \0, then \0 is no longer recognized as the end of the string, and instead it prints this ♠ symbol.

If I just replace \0 with \6, the result is the same. If I do it with some other digits, the result is the same, except the symbol is different.

Why is that? What is this \6 thing? I can't find anything about it online. How does it work?


Solution

  • \0 has no special meaning beyond being a character with the value zero. By convention, we write null terminators in that form since it makes the syntax resemble other escape sequences like \n.

    But \<digits> is a so-called octal escape sequence: a backslash followed by one to three octal digits (0-7) inserts the character whose code is that octal number. \0 is just the simplest case of this. \06 (the same as \6) is ACK in ASCII, a non-printable control character, which some consoles render as a glyph such as ♠. \101 will print an 'A', and so on.
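
    As a small sketch of the point above, the program below prints the numeric values of a few octal escapes. The array name samples is made up for this illustration, and the 'A' output assumes an ASCII execution character set.

    ```c
    #include <stdio.h>

    /* Each element is an octal escape: a backslash plus one to three octal digits.
       The name "samples" is only for this illustration. */
    static const char samples[] = { '\6', '\06', '\006', '\101', '\0' };

    int main(void)
    {
        /* Prints the values 6 6 6 65 0: leading zeros do not change the value. */
        for (size_t i = 0; i < sizeof samples; i++)
            printf("%d ", samples[i]);
        printf("\n");

        /* Octal 101 is decimal 65, which is 'A' in ASCII. */
        printf("%c\n", samples[3]);
        return 0;
    }
    ```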

    Where an escape sequence ends is defined by the language. An octal escape sequence consists of one to three octal digits, so the compiler stops at the first character that is not an octal digit or after the third digit, whichever comes first. Hex escape sequences (\x) are different: they have no upper limit on the number of digits and keep consuming hex digits for as long as they follow.

    So \08 will for example result in a null terminator followed by the character '8', because the digit 8 cannot be part of an octal number. And \101666 will print A666 since no more than 3 octal digits are consumed.
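
    One way to check where each escape ends is to look at sizeof and the individual bytes. This is only a sketch; the array names are hypothetical and chosen for the example.

    ```c
    #include <stdio.h>
    #include <string.h>

    /* '8' is not an octal digit, so the escape is just \0: bytes 0, '8', 0. */
    static const char nul_then_8[] = "\08";

    /* At most three octal digits are consumed: 'A', '6', '6', '6', 0. */
    static const char a_then_666[] = "\101666";

    /* \06 is one character with value 6, not a terminator plus '6'. */
    static const char ack_between[] = "x\06y";

    int main(void)
    {
        printf("%zu %zu\n", sizeof nul_then_8, strlen(nul_then_8)); /* 3 0 */
        printf("%zu %s\n", sizeof a_then_666, a_then_666);          /* 5 A666 */
        printf("%zu %d\n", sizeof ack_between, ack_between[1]);     /* 4 6 */
        return 0;
    }
    ```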

    To insert the character '6' after a null terminator or other octal escape sequence, the easiest way is to end the string literal and start a new one:

    char x[] = "some output \0" "6 MORE OUTPUT"; // string literals will get concatenated into one
    puts(x);
    char* secret_message = x + 13;
    puts(secret_message);
    

    Output:

    some output 
    6 MORE OUTPUT
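
    The same literal-splitting trick matters even more for hex escapes, since they have no digit limit. A minimal sketch, with the array name hex_split invented for the example and an ASCII execution character set assumed:

    ```c
    #include <stdio.h>

    /* Written as "\x41BC", the escape would swallow B and C as hex digits and
       name the value 0x41BC, which does not fit in a char. Ending the literal
       after \x41 stops the escape there. */
    static const char hex_split[] = "\x41" "BC";

    int main(void)
    {
        puts(hex_split);                   /* ABC */
        printf("%zu\n", sizeof hex_split); /* 4: 'A', 'B', 'C', 0 */
        return 0;
    }
    ```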