c, null-character

What is the difference between (char)0 and '\0' in C?


What is the difference between using (char)0 and '\0' to denote the terminating null character in a character array?


Solution

  • The backslash notation in a character literal allows you to specify the numeric value of a character instead of using the character itself. So '\1'[*] means "the character whose numeric value is 1", '\2' means "the character whose numeric value is 2", etc. Almost. Due to a quirk of C, character literals actually have type int, and indeed int is used to handle characters in other contexts too, such as the return value of fgetc. So '\1' means "the numeric value as an int, of the character whose numeric value is 1", and so on.
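
    A minimal sketch of both points (the fgetc loop simply echoes stdin; in C++ the first assertion would fail, because character literals have type char there):

        #include <assert.h>
        #include <stdio.h>

        int main(void)
        {
            /* In C (unlike C++), a character constant such as '\1' has type int */
            assert(sizeof '\1' == sizeof(int));
            assert('\1' == 1);

            /* fgetc also works in terms of int, so that every possible
               character value plus EOF fits in its return type */
            int c;
            while ((c = fgetc(stdin)) != EOF)
                putchar(c);

            return 0;
        }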

    Since characters are numeric values in C, "the character whose numeric value is 1" actually is (char)1, and the extra decoration in '\1' has no effect on the compiler - '\1' has the same type and value as 1 in C. So the backslash notation is more needed in string literals than it is in character literals, for inserting non-printable characters that don't have their own escape code.
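
    For instance, an escape is the only way to drop a non-printable character such as NUL into the middle of a string literal (buf is just an illustrative name):

        #include <assert.h>
        #include <string.h>

        int main(void)
        {
            char buf[] = "ab\0cd";     /* the escape embeds a NUL mid-array */

            assert(sizeof buf == 6);   /* 'a' 'b' NUL 'c' 'd' NUL       */
            assert(strlen(buf) == 2);  /* strlen stops at the first NUL */

            return 0;
        }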

    Personally, I prefer to write 0 when I mean 0, and let implicit conversions do their thing. Some people find that very difficult to understand. When working with those people, it's helpful to write '\0' when you mean a character with value 0, that is, in cases where you expect the 0 to be implicitly converted to char. Similarly, it can help to write NULL when you mean a null pointer constant, 0.0 when you mean a double with value 0, and so on.
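
    For example, each pair below is identical as far as the compiler is concerned; the variable names are only illustrative, and the only real difference is what the spelling suggests to a reader:

        #include <stddef.h>   /* for NULL */

        char   terminator  = 0;     /* same value and effect as ...         */
        char   terminator2 = '\0';  /* ... this more suggestive spelling    */

        char  *ptr  = 0;            /* both are null pointer constants      */
        char  *ptr2 = NULL;

        double scale  = 0;          /* the int 0 converts implicitly to 0.0 */
        double scale2 = 0.0;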

    Whether it makes any difference to the compiler, and whether it needs a cast, depends on context. Since '\0' has exactly the same type and value as 0, it needs to be cast to char in exactly the same circumstances. So '\0' and (char)0 differ in type; for exactly equivalent expressions, compare either (char)'\0' with (char)0, or '\0' with 0. NULL has implementation-defined type -- sometimes it needs to be cast to a pointer type, since it may have integer type. 0.0 has type double, so is certainly different from 0. Still, float f = 1.0; is identical to float f = 1; and to float f = 1.0f;, whereas 1.0 / i, where i is an int, usually has a different value from 1 / i.
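
    A small demonstration of those last two sentences, with i = 4 chosen arbitrarily:

        #include <stdio.h>

        int main(void)
        {
            int i = 4;

            float f  = 1.0;   /* f, f2 and f3 all end up holding  */
            float f2 = 1;     /* exactly the same value; only the */
            float f3 = 1.0f;  /* source text differs              */

            printf("%f %f %f\n", f, f2, f3);  /* 1.000000, three times             */
            printf("%f\n", 1.0 / i);          /* 0.250000: floating-point division */
            printf("%d\n", 1 / i);            /* 0: integer division truncates     */

            return 0;
        }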

    So, any general rule about whether to use '\0' or 0 is purely for the convenience of readers of your code - it's all the same to the compiler. Pick whichever you (and your colleagues) prefer the look of, or perhaps define a macro ASCII_NUL.
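
    If you do go the macro route, it might look something like this sketch (ASCII_NUL as suggested above, everything else purely illustrative):

        #include <stdio.h>

        #define ASCII_NUL '\0'   /* purely for readers; identical to 0 to the compiler */

        int main(void)
        {
            char name[8] = "example";

            name[3] = ASCII_NUL;   /* terminate the string early, in place */
            printf("%s\n", name);  /* prints "exa" */

            return 0;
        }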

    [*] or '\01' - since the backslash introduces an octal number, not decimal, it's sometimes wise to make that a bit more obvious by ensuring it starts with a 0. It makes no difference for 0, 1, 2, of course. I say "sometimes" because the backslash can be followed by at most 3 octal digits, so you can't write \0101 instead of \101 to remind the reader that it's an octal value. It's all quite awkward, and it leads to even more decoration: \x41 for a capital A, and you could therefore write '\x0' for 0 if you want.
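
    A few assertions tying the octal and hex notation together; the comparisons against 'A' assume an ASCII execution character set:

        #include <assert.h>

        int main(void)
        {
            assert('\101' == 'A');    /* octal 101 == decimal 65 == 'A' in ASCII */
            assert('\x41' == 'A');    /* hex 41 == decimal 65                    */
            assert('\x0'  == 0);      /* hex spelling of the null character      */
            assert('\0'   == '\00');  /* one, two or three octal digits all work */

            /* In a string literal, \0101 is not a padded \101: the escape grabs
               at most three octal digits, so it parses as '\010' followed by '1' */
            assert(sizeof "\101"  == 2);   /* 'A' plus the terminating NUL     */
            assert(sizeof "\0101" == 3);   /* '\010', '1', then the terminator */

            return 0;
        }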