I have a piece of C code that is supposed to read an ASCII file and return a string, which is then parsed and compiled into bytecode. When I try to get the size of the file using ftell, it returns 33 rather than 29 (the number of bytes in the file, including newlines).
The file:
push #25
nop
push #44
add
hlt
The code:
uint8_t* read_ascii_file(uint8_t* path) {
FILE* file = fopen(path, "r");
if (file == NULL)
return NULL;
fseek(file, 0, SEEK_END);
uint32_t size = ftell(file);
fseek(file, 0, SEEK_SET);
uint8_t* buffer = (uint8_t*)malloc(sizeof(uint8_t) * size);
if (buffer == NULL)
return NULL;
fread(buffer, sizeof(uint8_t), size, file);
buffer[size] = '\0';
fclose(file);
return buffer;
}
Four more bytes are added for seemingly no reason. My parser complains that "hlt====" is not a valid instruction ("====" are the 4 extra bytes).
Either ftell or fseek is misbehaving here, and I don't know whether it is a bug.
I'm on Windows, using Visual Studio.
You're not allocating enough space.
You allocate sizeof(uint8_t) * size bytes for the file contents, but that isn't enough to hold the null byte you later add to terminate the string. So you write past the bounds of the allocated memory, triggering undefined behavior.
Add 1 to leave space for the null terminator.
uint8_t* buffer = (uint8_t*)malloc(sizeof(uint8_t) * size + 1);
Also, these lines are problematic:
fread(buffer, sizeof(uint8_t), size, file);
buffer[size] = '\0';
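Because of newline translation on Windows, you'll end up reading fewer bytes than the file actually contains. The file was opened in text mode ("r"), so each "\r\n" pair on disk is converted to a single "\n" as it is read. ftell reports the 33 bytes on disk, while fread delivers only 29, one fewer for each of the four line breaks. This means the terminating null byte is in the wrong place, and the 4 bytes before it are uninitialized.
Save the return value of fread, which tells you how many bytes you actually read, then use that value to write the null byte.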
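size_t rval = fread(buffer, sizeof(uint8_t), size, file);
/* rval <= size and the buffer holds size + 1 bytes, so this is always in bounds;
   writing it unconditionally also terminates the string when nothing was read */
buffer[rval] = '\0';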
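Putting both fixes together, the whole function might look something like the sketch below. It is only an illustration, not the asker's exact code: the name read_text_file is hypothetical, the path parameter is changed to const char* (which is what fopen expects), and the value from ftell is treated as an upper bound on the number of bytes fread can return. That last assumption holds for text-mode streams on Windows, but it is not guaranteed by the C standard for every platform.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not the asker's original function): reads a whole
   text file into a NUL-terminated heap buffer. Returns NULL on failure.
   The caller owns the buffer and must free() it. */
uint8_t *read_text_file(const char *path) {
    FILE *file = fopen(path, "r");
    if (file == NULL)
        return NULL;

    /* On-disk size; assumed here to be an upper bound on what fread returns. */
    if (fseek(file, 0, SEEK_END) != 0) {
        fclose(file);
        return NULL;
    }
    long size = ftell(file);
    if (size < 0 || fseek(file, 0, SEEK_SET) != 0) {
        fclose(file);
        return NULL;
    }

    /* One extra byte for the terminator. */
    uint8_t *buffer = malloc((size_t)size + 1);
    if (buffer == NULL) {
        fclose(file);   /* don't leak the stream on allocation failure */
        return NULL;
    }

    /* In text mode, fread may return fewer bytes than size because
       "\r\n" is translated to "\n"; terminate at the real length. */
    size_t nread = fread(buffer, 1, (size_t)size, file);
    buffer[nread] = '\0';

    fclose(file);
    return buffer;
}
```

With the file from the question, the returned string should then end in "hlt" with no stray bytes after it. A different approach would be to open the file with "rb", in which case ftell and fread agree on the size, but the buffer then contains the "\r" characters and the parser has to cope with them.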