Relatively new C programmer here. I am reviewing the following code for a tutorial for a side project I am working on to practice C. The point of the abuf struct is to create a string that can be appended to. Here is the code:
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>

    typedef struct abuf {
        char* str;
        unsigned int size;
    } abuf;

    void abAppend(abuf *ab, const char *s, int len) {
        char *new = realloc(ab->str, ab->size + len);
        if (new == NULL) return;
        memcpy(&new[ab->size], s, len);
        ab->str = new;
        ab->size += len;
    }

    int main(void) {
        abuf ab = {
            NULL,
            0
        };
        char *s = "Hello";
        abAppend(&ab, s, 5);
        abAppend(&ab, ", world", 7);
        return 0;
    }
Everything compiles and my tests (redacted for simplicity) show that the string "Hello" is stored in ab's str pointer, and then "Hello, world" after the second call to abAppend. However, something about this code confuses me. On the initial call to abAppend, the str pointer is null, so realloc, according to its man page, should behave like malloc and allocate 5 bytes of space to store the string. But the string "Hello" also contains the terminating null byte, \0. This should be the sixth and final byte of the string, if I understand this correctly. Isn't this null byte lost if we store "Hello\0" in a malloc-ed container large enough only to store "Hello"?
On the second call to abAppend, we concatenate ", world" to str. The realloc will enlarge str to 12 bytes, but the 13th byte, \0, is not accounted for. And yet, everything works, and if I test for the null byte with a loop like for (int i = 0; ab.str[i] != '\0'; i++), the loop works fine and increments i 12 times (0 through 11), and stops, meaning it encountered the null byte on the 13th iteration. What I don't get is why it encounters the null byte if we don't allocate space for it.
I tried to break this code by doing weird combinations of strings, to no avail. I also tried to allocate an extra byte in each call to abAppend and changed the function a little to account for the extra space, and it performed the exact same as this version. How the null byte gets processed is eluding me.
How does realloc treat null bytes in strings?
The behavior of realloc is not affected by the contents of the memory it manages.
But the string "Hello" also contains the terminating null byte, \0. This should be the sixth and final byte of the string,…
The characters are copied with memcpy(&new[ab->size], s, len);, where len is 5. memcpy copies characters without regard to whether there is a terminating null byte. Given a length of 5, it copies 5 bytes. It does not append a terminating null character to those.
The realloc will enlarge str to 12 bytes, but the 13th byte, \0, is not accounted for.
On the second call to abAppend, 7 more bytes are copied with memcpy, after the first 5 bytes. memcpy is given a length of 7 and copies only 7 bytes.
… it encountered the null byte on the 13th iteration.
When you tested ab.str[12], you went beyond what the C standard defines: ab.str[12] is outside the allocated memory, so the behavior of reading it is undefined. It is possible it contained a null byte solely because nothing else in your process had used that memory for another purpose, and that is why your loop stopped. If you attempted this in the middle of a larger program that had done previous work, that byte might have contained a different value, and your test might have gone awry in a variety of ways.
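If you want str to be usable as a C string after every append, one option is to allocate one extra byte and write the terminator yourself. The following is only a sketch under some assumptions: the name abAppendTerminated, the size_t fields, and the int return value are my own choices, not the tutorial's code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Variant of the question's struct; size_t is an assumption of this sketch. */
typedef struct abuf {
    char *str;
    size_t size;   /* bytes of text, not counting the terminator */
} abuf;

/* Hypothetical variant of abAppend that keeps str null-terminated.
   Returns 0 on success, -1 if realloc fails (the buffer is unchanged). */
int abAppendTerminated(abuf *ab, const char *s, size_t len) {
    assert(ab != NULL && s != NULL);
    char *p = realloc(ab->str, ab->size + len + 1);  /* +1 for '\0' */
    if (p == NULL) return -1;
    memcpy(p + ab->size, s, len);  /* copies exactly len bytes */
    ab->size += len;
    p[ab->size] = '\0';            /* terminator lives in the extra byte */
    ab->str = p;
    return 0;
}
```

With this version, after appending "Hello" (5) and ", world" (7), ab.size is 12 and ab.str[12] is a '\0' inside the 13-byte allocation, so strlen or a loop that stops at '\0' stays within bounds.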