c linker clang clang-static-analyzer scan-build

Can scan-build or Clang static analyzer discover problems at link time?


While revisiting some code I've written, I noticed that the build commands in the test scripts did not invoke the scan-build command correctly. A revised version is ready, but I have some questions about the capabilities of scan-build and the Clang static analyzer.

Can the analyzer discover errors at link time? If so, how?

For example, within a single source file, it's easy to discover memory allocation errors (leak, double-free, use-after-free, etc.), but can it still discover such errors when the allocation is done through interface functions implemented in another translation unit?

I've written two files to test whether it can do that, but apparently it cannot.

/* memlib.c */
#include <stdlib.h>
void *foo_alloc(int len) { return malloc(len * 4); }
void foo_dealloc(void *foo) { free(foo); }

/* mem-main.c */
void *foo_alloc(int len);
void foo_dealloc(void *foo);

int main(void)
{
    int *p;

    p = foo_alloc(2);
    p[1] = 32;
    p = foo_alloc(1);   /* leak: the first allocation is lost */
    p[0] = 54;
    foo_dealloc(p);
    p[0] = 47;          /* use-after-free */
    foo_dealloc(p);     /* double free */

    return 0;
}

The compilation command:

scan-build sh -c '$CC "$@"' foo -o mem-main mem-main.c memlib.c

I'm using the scan-build from PyPI, but I think that's pretty much irrelevant as it's just a program driver.

As a side note, I'm open to tool recommendations where link-time analysis can be performed.


Solution

  • Clang has experimental support for analyzing across translation units. See the Clang documentation for Cross Translation Unit (CTU) Analysis. However, it's currently (2022-05-23) a fairly messy proposition, as explained in the linked document. A summary of the basic steps is:

    1. Use clang++ -emit-ast to create .ast files for each translation unit (TU).
    2. Use clang-extdef-mapping to make a list of definitions in each TU.
    3. Use sed (!) to make ad-hoc fixes to the definition list files, specifically, changing ".cpp" to ".cpp.ast" and changing file paths to be relative.
    4. Run the analysis like this:
    $ clang++ --analyze \
        -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true \
        -Xclang -analyzer-config -Xclang ctu-dir=. \
        -Xclang -analyzer-output=plist-multi-file \
        main.cpp
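
As a rough sketch only, the four steps above might look like this for the two C files from the question. Several things here are assumptions rather than anything the documentation states for this case: the tool names (some distributions install them with a version suffix, hence the CLANG and EXTDEF variables), the use of `--` to run clang-extdef-mapping without a compilation database, the ".c" to ".c.ast" rewrite mirroring the documented ".cpp" one, and the helper names fix_defmap and run_ctu, which are made up for illustration. Whether the analyzer then reports the leak, use-after-free, and double free will depend on the Clang version.

```shell
#!/bin/sh
# Hedged sketch of the CTU steps for memlib.c / mem-main.c from the question.
set -eu

# Assumed tool names; override if yours carry a version suffix.
CLANG=${CLANG:-clang}
EXTDEF=${EXTDEF:-clang-extdef-mapping}

# Step 3 as a filter (hypothetical helper): point each entry at the .ast
# file and strip the directory given in $1 so the path becomes relative.
fix_defmap() {
    sed -e 's/\.c$/.c.ast/' -e "s|$1/||"
}

# Steps 1, 2, and 4 (hypothetical helper).
run_ctu() {
    dir=$(pwd -P)

    # 1. Serialize the AST of the TU that holds the definitions.
    "$CLANG" -emit-ast -o memlib.c.ast memlib.c

    # 2 + 3. List its external definitions, then fix up the entries.
    #        "--" is assumed to stand in for a compilation database.
    "$EXTDEF" memlib.c -- | fix_defmap "$dir" > externalDefMap.txt

    # 4. Analyze the caller with the experimental CTU option enabled.
    "$CLANG" --analyze \
        -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true \
        -Xclang -analyzer-config -Xclang ctu-dir=. \
        -Xclang -analyzer-output=plist-multi-file \
        mem-main.c
}

# Run only when the tools and both source files are present.
if command -v "$CLANG" >/dev/null 2>&1 &&
   command -v "$EXTDEF" >/dev/null 2>&1 &&
   [ -f memlib.c ] && [ -f mem-main.c ]; then
    run_ctu
else
    echo "clang, clang-extdef-mapping, or the source files are missing; skipping" >&2
fi
```

With more translation units, steps 1 to 3 would be repeated for each file that provides definitions, appending to the same externalDefMap.txt.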
    

    Does it work? The presentation Using the Clang Static Analyzer by Vince Bridgers at an LLVM meetup in 2020, slide 25, shows that cross-translation-unit analysis approximately doubled the number of findings across five code bases. Some findings are lost as well, but that will be a mix of lost false positives (good) and lost true positives (bad), and that presentation doesn't further elaborate. (My guess, though, is the majority are lost FPs.)

    Regarding tool recommendations, one of the main ways that commercial static analysis tools differ from the open source tools is more accurate inter-procedural and cross-translation-unit analysis. If this is of particular interest, you may want to look into available commercial tools. (Disclosure: I formerly worked for a commercial static analysis vendor and have related ongoing financial interests.)