To learn more of C, I'm trying to recreate basic data structures. Here's a minimal example of my attempt at an array, which compiles and runs but has a problem detected by valgrind:
#include <stdlib.h>
#include <stdio.h>
typedef void * vp_t;
typedef struct {
int len;
vp_t *start;
} arr_t;
arr_t * array_new(int len) {
arr_t *arr = malloc(sizeof(arr_t));
arr->start = malloc(len * sizeof(vp_t));
arr->len = len;
return arr;
}
void array_set(arr_t *arr, int i, vp_t vp) {
vp_t *dest = arr->start + i * sizeof(vp_t);
*dest = vp;
}
int array_get(arr_t *arr, int i) {
int *p = *(arr->start + i * sizeof(vp_t));
return *p;
}
void array_delete(arr_t *arr) {
free(arr->start);
free(arr);
}
int main() {
int x=0, y=1, z=2;
arr_t *arr = array_new(3);
array_set(arr, 0, &x);
array_set(arr, 1, &y);
array_set(arr, 2, &z);
for (int i = 0; i < 3; ++i) printf("%i ", array_get(arr, i));
putchar('\n');
array_delete(arr);
return 0;
}
The program outputs 1 2 3
as expected. However, valgrind is detecting a problem the second and third time I call the array_set function. Running valgrind against the example code here, I get:
==91933== Invalid write of size 8
==91933== at 0x109244: array_set (min.c:22)
==91933== by 0x109312: main (min.c:39)
==91933== Address 0x4a990d0 is 32 bytes before an unallocated block of size 4,194,032 in arena "client"
==91933==
==91933==
==91933== Process terminating with default action of signal 11 (SIGSEGV)
==91933== Access not within mapped region at address 0x2003A98F4C
==91933== at 0x109244: array_set (min.c:22)
==91933== by 0x109327: main (min.c:40)
min.c:22
refers to *dest = vp
in the array_set function. min.c:39
refers to array_set(arr, 1, &y)
. Valgrind does not complain about line 38, array_set(arr, 0, &x)
.
I've been poking around with gdb, but I've not figured it out yet. Thanks for taking a look.
The is wrong
vp_t *dest = arr->start + i * sizeof(vp_t);
In C, when you do pointer arithmetic, (i.e. add a number to a pointer), the compiler will take care of multiplying the number by the size of the objects pointed to. For example, if you have
int64_t a[50];
int *b = a;
int *c = &(a[21]);
b + 8
points to a[8]
not a[1]
- the compiler knows b
points to objects of size 8 bytes and multiplies the number added on to b
by 8. Similarly c - b
will be 21, not 168, because the compiler knows to divide the difference in addresses by the object size.
In my example, b + 8 * sizeof(int64_t)
would be the same as b + 64
and the compiler would then multiply that 64 by sizeof(int64_t)
to get the number of bytes to add to b
. This would clearly be outside of the array, which is what Valgrind is detecting in your case.
Another way to look at this is that a[i]
and *(a + i)
are functionally identical in C. You never see people writing a[i * (sizeof *a)]
do you? You don't need to do the multiplying for pointer arithmetic either.
The reason why you do need the sizeof
in malloc()
is because malloc()
can't infer the type of the objects that will be contained in the block.
There is another problem with your code. Your array stores pointers to the objects you want to store. This is presumably because you want to be able to store objects of any type. But if you do this, you have to be careful to make sure the objects don't disappear leaving you with dangling pointers.
For example, x, y and z are all of automatic scope. Their storage will disappear when the function exits. This isn't a problem with your code because exiting the function is the same as exiting the program. However, if you have something like this:
int populateArray(arr_t *array)
{
int x = 1, y = 2, z = 3;
array_set(arr, 0, &x);
array_set(arr, 1, &y);
array_set(arr, 2, &z);
}
int main()
{
arr_t *arr = array_new(3);
populateArray(arr);
// At this point your array contains 3 dangling pointers.
// Valgrind will complain if you try to access any of them.
}
that's broken.