The GCC manual says:
-fsanitize=bounds-strict
This option enables strict instrumentation of array bounds. Most out of bounds accesses are detected, including flexible array members and flexible array member-like arrays. Initializers of variables with static storage are not instrumented.
However, here, sum can access its argument out-of-bounds, despite the use of -fsanitize=address,undefined,bounds-strict (when offset is chosen to reach into the wrong array):
#include <stdio.h>
#include <stdlib.h>

#define n 1000
typedef double array[n];

double sum(array a, long offset)
{
    double acc = 0;
    for(int i = 0; i < n; ++i)
        acc += a[i + offset];
    return acc;
}

int main()
{
    array a;
    array b;
    for(int i = 0; i < n; ++i) a[i] = 1;
    for(int i = 0; i < n; ++i) b[i] = 2;
    long offset = b - a;
    printf("%ld\n", offset);
    printf("%f\n", sum(a, offset));
}
Compiling this with GCC 10.2 gives me:
$ gcc -pedantic -Wall -std=c17 -fsanitize=address,undefined,bounds-strict memory_errors_two_arrays.c && ./a.out
1032
2000.000000
So the code dereferenced a double[1000] with 2031 and didn't even blink.
Does -fsanitize=bounds-strict check anything that -fsanitize=address does not?
Does -fsanitize=bounds-strict check anything that -fsanitize=address does not?
Yes.
-fsanitize=bounds-strict is about checking array bounds, not the bounds of general objects. It uses compile-time information about array lengths, drawn largely from the types of the lvalues used for access, to generate instrumentation. That is orthogonal to the allocation size of dynamically-allocated objects, and the kind of information used is unavailable from many of the common idioms for creating and accessing dynamically allocated objects, such as the one in the original version of this question.
Consider this, for example:
#include <stdio.h>

int main(void) {
    int a[2][256] = {0};
    printf("%d\n", a[0][256]);
}
If I compile with -fsanitize=address and run the result, it just prints "0".
If I compile with -fsanitize=bounds-strict and run it, I get this report:
santest.c:7:24: runtime error: index 256 out of bounds for type 'int [256]'
(and it also prints "0").
It is also worth noting that although the diagnostics emitted by -fsanitize=bounds-strict instrumentation are styled as error messages, they do not represent fatal errors. The objective is for such informational messages to be in addition to the normal behavior of the program, not to replace or subvert it. In particular, the bounds instrumentation does not by default abort the program when an out-of-bounds access is detected, though some other mechanism might cause such a failure in some cases (AddressSanitizer, for one, stops at the first error it reports by default).
After the original version of this answer was posted, the question was modified to present a different situation. With respect to the version that is current as I write this, it is relevant that it is relatively easy to hide or moot the type information on which -fsanitize's bounds-strict mode relies. The example code currently presented in the question does this by interposing a function call interface between a declared array and the array access.
In this context, I observe that in the scope of
#define n 1000
typedef double array[n];
this function declaration ...
double sum(array a, long offset)
is 100% equivalent to
double sum(double *a, long offset)
The typedef notwithstanding, it does not convey any information about how many more elements follow *a in the array, if any, of which it is a member. This function must in fact accept pointers into arrays of arbitrary length, so not only is there no information for bounds-strict to use for generating instrumentation, it would be incorrect for it to be instrumented to assume a particular array length.
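One way to check the parameter adjustment for yourself, as a minimal sketch: the helper names `param_is_pointer` and `local_is_array` are made up for illustration and are not part of the question's code. Taking the address of an `array` parameter yields a `double **`, whereas taking the address of a genuine `array` object yields a `double (*)[1000]`, and C11 `_Generic` can tell the two apart at compile time.

```c
#include <assert.h>
#include <stddef.h>

#define n 1000
typedef double array[n];

/* Hypothetical helper: returns 1 if a parameter declared as `array a`
   has been adjusted to `double *`, so that &a has type `double **`. */
int param_is_pointer(array a)
{
    assert(a != NULL);
    return _Generic(&a, double **: 1, default: 0);
}

/* Hypothetical helper: returns 1 if a local object declared as `array`
   keeps its full array type, so that &local has type `double (*)[n]`. */
int local_is_array(void)
{
    array local = {0};
    (void)local;
    return _Generic(&local, double (*)[n]: 1, default: 0);
}
```

Both helpers should return 1 on a conforming C11 compiler: the length is part of the type of the local object, but it is gone from the type of the parameter.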
Contrast with this variation:
#include <stdio.h>
#include <stdlib.h>

double access_flat(double *p, int i) {
    return p[i];
}

double access_dim200(double (*p)[200], int i) {
    return (*p)[i];
}

double access_dim100(double (*p)[100], int i) {
    return (*p)[i];
}

int main(void) {
    double *p = calloc(200, sizeof *p);
    double q[2][100] = {0};

    printf("p, flat: %lf\n", access_flat(p, 100));
    printf("p, dim100: %lf\n", access_dim100((double (*)[100]) p, 100));
    printf("p, dim200: %lf\n\n", access_dim200((double (*)[200]) p, 100));

    printf("q, flat: %lf\n", access_flat((double *) q, 100));
    printf("q, dim100: %lf\n", access_dim100(q, 100));
    printf("q, dim200: %lf\n", access_dim200((double (*)[200]) q, 100));
}
When compiled with -fsanitize=bounds-strict, the output of this program is:
p, flat: 0.000000
santest.c:14:16: runtime error: index 100 out of bounds for type 'double [100]'
p, dim100: 0.000000
p, dim200: 0.000000
q, flat: 0.000000
q, dim100: 0.000000
q, dim200: 0.000000
This shows several things, among them that
* bounds-strict is indeed using the type of the lvalue used for access to generate instrumentation, and
* -fsanitize in my GCC (v8.5.0) treats automatically allocated (and statically allocated; not shown) arrays differently than dynamically allocated ones, and
* -fsanitize=bounds-strict in my GCC is buggy for automatically allocated (and statically allocated; not shown) arrays, failing to report some array-bounds overruns.
Bugs notwithstanding, again yes, bounds-strict mode does check some things that address mode does not. For me, bounds-strict also requires an additional library, libubsan, that address mode by itself does not. The two modes have overlapping areas of application, but their design is different, and each detects some issues that the other does not.
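Following the access_dim100 case, here is a sketch of how the question's `sum` might be rewritten so that the array length stays visible in the type of the lvalue used for access: pass a pointer to the whole array and index through `(*a)`. The name `sum_typed` is made up for illustration. Whether a given GCC actually reports an overrun through it is something to test with your own version, given the inconsistencies I saw with mine.

```c
#include <assert.h>
#include <stddef.h>

#define n 1000
typedef double array[n];

/* Hypothetical variant of the question's sum(): the parameter is a
   pointer to an entire array, so (*a) has type double[1000] and the
   index expression is applied to an lvalue with a known bound. */
double sum_typed(array *a, long offset)
{
    assert(a != NULL);
    double acc = 0;
    for (int i = 0; i < n; ++i)
        acc += (*a)[i + offset];
    return acc;
}
```

A call site passes the address of the array, as in `sum_typed(&a, offset)`. With a nonzero `offset` the loop indexes past the declared bound of `double[1000]`, which is the kind of access bounds-strict is designed to instrument.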