cclangclang-static-analyzer

Clang Analyser hints for tracking leaks


I'm working on a small library where it is convenient for me to have slice objects as we know them from Go or Rust; these are essentially a pointer and a length, and I can use these to pass sub-chunks of arrays around.

struct slice {
    size_t len;
    char *buf;
};

They are supposed to be passed by value, and they make a number of algorithms a lot easier to manage.

Of course, sometimes I also need to allocate the memory for the buffers they are slices off, and in a few different attempts at designs for that, I keep running into problems with clang's static analyzer. It is really hard to get it to track that I am allocating new memory without also assuming that I haven't initialized said memory.

I figured out that I could tell the analyzer that I have allocated memory with __attribute__((ownership_returns(malloc)) (but I have found no documentation for this, so I don't really know what I am doing there).

With code such as this:

struct slice *alloc_slice(size_t len) __attribute__((ownership_returns(malloc)));

void foo(void)
{
    struct slice *x = alloc_slice(100);
    if (x->len > 10) { // I promise you, this isn't uninitialised
        printf("foo!\n");
    }
    // Leak here
}

the analyzer will generally know that I have allocated memory for x, but it will assume that it is uninitialised, so it considers looking at x->len an error.

Is there any way to tell clang that I have allocated and initialised memory? It can handle it with calloc, so it should be able to. But is it something I can also do with my own functions?

I figured out that the analyzer could also see that I am using malloc if it can see the function in the compilation unit, for example by inlining it. So that was another attempt. The initialisation isn't that complicated, so I could live with that. However, the damned thing doesn't consider it a leak if I do this:

struct slice_buf {
    struct slice slice;
    char data[];
};

struct slice *alloc_slice_2(size_t len)
{
    struct slice_buf *buf = malloc(offsetof(struct slice_buf, data) + len);
    buf->slice.len = len;
    buf->slice.buf = buf->data;
    return (struct slice *)buf;
}

void bar(void)
{
    struct slice *x = alloc_slice_2(100);
    if (x->len > 10) { // I promise you, this isn't uninitialised
        printf("foo!\n");
    }
    // Leak here, but not detected
}

The problem, as far as I have managed to track it down, is the assignment buf->slice.buf = buf->data. The analyzer is a little too conservative here, and assumes that I am not leaking buf if there is a pointer inside buf that points to something inside buf. This is sensible, I have used a pointer into a member of a structure to manage the memory of the full structure in the past, and I wouldn't like that to be considered an error. It is just annoying here. If I don't assign to buf->slice.buf, then the analyzer detects a leak

struct slice *alloc_slice_3(size_t len)
{
    struct slice_buf *buf = malloc(offsetof(struct slice_buf, data) + len);
    buf->slice.len = len;
    // I need this, but the analyzer gets confused: buf->slice.buf = buf->data;
    return (struct slice *)buf;
}

void baz(void)
{
    struct slice *x = alloc_slice_3(100);
    if (x->len > 10) { // I promise you, this isn't uninitialised
        printf("foo!\n");
    }
    // Leak here, and now detected
}

but then, of course, the slice is also useless.

Is there any way that I can just annotate a function to say that what it returns must be freed, but trust me, it is initialised memory?

If you want to play with the code I have listed here, you can get it at Godbolt: https://godbolt.org/z/31Gh9qWMj

Update

I thought I would try another thing, but that makes the situation a little worse. My thinking was that if I could hide the initialisation from the analyser, maybe it would accept that the memory was initialised, and with a little luck it would still consider the pointer as something that must freed. Well, it halfway worked.

struct slice_buf {
    struct slice slice;
    char data[];
};
// Hiding the initialisation from the compiler...
void init_buf(struct slice_buf *buf, size_t len);
struct slice *alloc_slice_4(size_t len)
{
    struct slice_buf *buf = malloc(offsetof(struct slice_buf, data) + len);
    init_buf(buf, len);
    return (struct slice *)buf;
}

void qux(void)
{
    struct slice *x = alloc_slice_4(100);
    if (x->len > 10) { // This is fine 
        printf("foo!\n");
    }
    // Now the leak goes unnoticed once again
}

https://godbolt.org/z/3EWxGeYnY

It accepts that the memory is initialised, but assumes that the pointer is freed in the function I call. To be fair, a conservative analyser must assume this, as I might have called a destructor function, but it isn't the case here.

It is getting a little frustrating that the analyser only checks for leaks until you look at the pointer the wrong way, and I am pretty sure that I am doing something wrong here. It cannot possibly be the case that the checker so easily assumes that the memory is taken care of. But I have no idea whatsoever on how to annotate my functions to help the analyser do the right thing.

Update Update

Then I ran into another set of attributes that looked like they were designed for exactly this purpose: acquire_handle, use_handle and release_handle. Unfortunately, as far as I can see they are completely ignored. The compiler knows about them (at least -Wunknown-attributes doesn't complain), but the static analyser doesn't care.

[[clang::acquire_handle("slice")]]
struct slice *alloc_slice(size_t len);
[[clang::use_handle("slice")]]
void use_slice(struct slice *slice);
[[clang::release_handle("slice")]]
void free_slice(struct slice *slice);

void foo(void)
{
    struct slice *x = alloc_slice(100);
    use_slice(x);
    // Leak here -- not noticed
}

void bar(void)
{
    struct slice *x = alloc_slice(100);
    use_slice(x);
    // Freed twice, not noticed
    free_slice(x);
    free_slice(x);
}

(see https://godbolt.org/z/azx9ahEqj)

Since these attributes are so obviously useful for static analysis, I expect that I am simply running the analyser wrong, but I have no clue as to what I am supposed to be doing to get it to work.


Solution

  • Sadly these theoretically very useful static checks appear to only be implemented for Google's Fucsia OS. So you're not "holding it wrong". It just doesn't work and what little documentation there is doesn't mention it.