I am working on a program where a thread, which is spun off a main thread, parses email logs and organizes each entry by the QID, an ID unique to each message.
The program was working find until I reorganized it so that instead of loading the entire log file first into memory, I begin parsing each line as I read in the log file. The program will SIGABRT in the code snippet shown below (slightly simplified but demonstrates the same problem).
gdb will claim that it's either (depending on the run) realloc/free invalid size or realloc/free corrupt memory, while valgrind consistently claims a non-null terminated array causing a run off and corrupting the heap (which I believe is more likely the case but cannot find the problem). (Let me know if you would like me to share the raw output of valgrind or gdb).
void *parseMessageAndMatch(void *params) {
// get params passed in from parent thread
parse_params *rp = (parse_params *)params;
// len0: length of log dump for current QID
// len1: length of log dump after current buffer is appended
int len0, len1;
// tmp char pointer for log dump
char *tmp;
// 1. increase/allocate memory for log dump and append line buffer
// get length of current log dump (0 if NULL)
// dump is of type char* and contains the log lines pertaining to a certain QID
len0 = (!msgs.msgs[rp->index].dump) ? 0 : strlen(msgs.msgs[rp->index].dump);
// get length of log dump after current line buffer is appended
// rp->len contains the strlen of the line buffer
len1 = len0 + rp->len + 1;
// allocate space for tmp log dump
// lock becaus msgs is a global variable and there is more than one thread
pthread_mutex_lock(rp->mutex);
if ((tmp = realloc(msgs.msgs[rp->index].dump, sizeof(*tmp)*len1)) == NULL) {
printf("Cannot allocate memory for log dump.\n");
exit(-1);
}
// update pointer
msgs.msgs[rp->index].dump = tmp;
// set tmp to null
tmp = NULL;
// if original buffer was empty, zero it out for safety
if (len0 == 0)
bzero(msgs.msgs[rp->index].dump, sizeof(char)*len1);
// rp->line is malloc'ed then bzero'ed and strcpy'ed to contain the log line read from file
strcat(msgs.msgs[rp->index].dump, rp->line);
msgs.msgs[rp->index].dump[len1-1] = '\0'; // add null terminator
pthread_mutex_unlock(rp->mutex);
// more parsing code is here but was commented out during my debug process
// so I have omitted it from this code snippet
// free line buffer
free(rp->line);
// set to null to prevent double free
rp->line = NULL;
// free params
free(rp);
// prevent double free
rp = NULL;
}
made correction in answer below.
msgs.msgs[rp->index].dump[len1] = '\0'; // add null terminator
after
realloc(msgs.msgs[rp->index].dump, sizeof(*tmp)*len1)
is an out-of-bounds write. The last valid index is len1 - 1
.
With the code still aborting after that has been fixed, it might be that
// rp->line is malloc'ed then bzero'ed and strcpy'ed to contain the log line read from file
strcat(msgs.msgs[rp->index].dump, rp->line);
the strcpy
to rp->line
wrote outside the allocated buffer. The only other candidate I can see in the function is
msgs.msgs[rp->index].dump
If that is not 0-terminated, the strlen
can run off the allocated buffer, and the strcat
afterward can again write outside the allocated memory. The latter can be avoided by setting
msgs.msgs[rp->index].dump[len0] = 0;
after the realloc and before the strcat
.
Setting rp
to NULL
after freeing
free(rp);
// prevent double free
rp = NULL;
is pointless, since rp
is a local variable, and the function returns immediately after. The passed-in pointer params
still holds the old address (strictly, its value becomes indeterminate, but in practice, the bits won't change) in the caller. That opens the possibility that the caller believes the pointer still points to valid memory.