c fuzzing

How do I fuzz test an API as a whole, rather than a function that takes file input?


I'm learning my way around fuzz testing C applications. As I understand it, most of the time when fuzzing, one has a C function that takes/reads files. The fuzzer is given a valid sample file, mutates it randomly or with coverage heuristics, and executes the function with this new input.

What I want to fuzz now is not a single function that takes file input, but a few functions that together make up an API. For example:

int setState(int state);
int run(void); // crashes when previous set state was == 123

The idea is to test the API as a whole and detect whether misuse, such as calling functions in the wrong order (here: calling setState(123) followed by run()), crashes something somewhere.
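To make the example reproducible, a stand-in implementation of the two functions could look like the sketch below. It is hypothetical: the global g_state and the use of abort() to model the crash are assumptions chosen for illustration, not the real API.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the API under test (an assumption for illustration). */
static int g_state = 0;

/* Stores the requested state; always reports success (0). */
int setState(int state) {
    g_state = state;
    return 0;
}

/* Models the bug: aborts if the previously set state was 123. */
int run(void) {
    if (g_state == 123) {
        abort();
    }
    return 0;
}
```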

How could one do such a thing? I'm searching for fuzzing frameworks (does not have to be C), general concepts and examples.

I tried to use libFuzzer from LLVM and "consumed" its fuzzer-data byte by byte. I read a single byte to determine what function to call, then read a parameter if needed, and finally call the function. Then I repeat until no more fuzzer-input-data is left. It looked something like this:

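#include <stddef.h>
#include <stdint.h>
#include <string.h>

int setState(int state);
int run(void);

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    size_t pos = 0;
    while (pos < size) {             // until no fuzzer data is left
        switch (data[pos++]) {       // one byte selects the function to call
        case 0: {
            int state;
            if (size - pos < sizeof state)
                return 0;            // not enough bytes left for the parameter
            memcpy(&state, data + pos, sizeof state);
            pos += sizeof state;
            setState(state);
            break;
        }
        case 1:
            run();
            break;
        default:                     // unused selector values are skipped
            break;
        }
    }
    return 0;
}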

One source I found that mentions this style of fuzzing:

[...] randomly select functions from your Public API and call them in random order with random parameters. (Source: code-intelligence)

This does not seem like a good or efficient use of a fuzzer built around file input. libFuzzer does find the bug within a few seconds, but I suspect that once the API grows by several more functions, finding such a bug will take much longer.


Solution

  • To answer my own question:

    Yes, that's how API fuzzing can be done. To consume the data byte by byte, the FuzzedDataProvider class that ships with libFuzzer (C++, #include <fuzzer/FuzzedDataProvider.h>) can be used. The drawback is that crash dumps and the fuzzer corpus won't be human-readable.

    For a more readable fuzzer, it helps to implement a structure-aware custom data mutator for libFuzzer.

    I used the premade data mutator libprotobuf-mutator (C++) to fuzz the example API. It generates valid input data based on a protocol buffer definition rather than just (semi-)random bytes. It does make the fuzzing somewhat slower, though: the bug in the contrived example API was found after ~2 min, compared to ~30 s with the basic byte-consuming setup. I still expect it to scale much better for larger (real) APIs.
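For comparison, a hand-written structure-aware mutator in plain C might look like the sketch below. This is not the libprotobuf-mutator setup described above; it only illustrates the idea of mutating whole API calls instead of individual bytes. LLVMFuzzerCustomMutator is the hook libFuzzer looks for; everything else (Call, MAX_CALLS, nextRand, parseCalls, writeCalls, the 0..255 argument range, and the encoding of selector byte 0 followed by a native int for setState and selector byte 1 for run) is an assumption made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_CALLS 64   /* arbitrary cap for this sketch */

typedef struct { uint8_t op; int arg; } Call;   /* op 0: setState(arg), op 1: run() */

/* Tiny LCG so the sketch does not depend on libFuzzer's own mutation helpers. */
static unsigned int nextRand(unsigned int *s) {
    *s = *s * 1103515245u + 12345u;
    return (*s >> 16) & 0x7fffu;
}

/* Decode the byte stream into calls, dropping unknown selectors and truncated tails. */
static size_t parseCalls(const uint8_t *data, size_t size, Call *calls) {
    size_t pos = 0, n = 0;
    while (pos < size && n < MAX_CALLS) {
        uint8_t op = data[pos++];
        if (op == 0) {
            if (size - pos < sizeof(int)) break;
            memcpy(&calls[n].arg, data + pos, sizeof(int));
            pos += sizeof(int);
        } else if (op == 1) {
            calls[n].arg = 0;
        } else {
            continue;
        }
        calls[n++].op = op;
    }
    return n;
}

/* Encode calls back into bytes, stopping before maxSize would be exceeded. */
static size_t writeCalls(const Call *calls, size_t n, uint8_t *data, size_t maxSize) {
    size_t pos = 0;
    for (size_t i = 0; i < n; i++) {
        size_t need = 1 + (calls[i].op == 0 ? sizeof(int) : 0);
        if (maxSize - pos < need) break;
        data[pos++] = calls[i].op;
        if (calls[i].op == 0) {
            memcpy(data + pos, &calls[i].arg, sizeof(int));
            pos += sizeof(int);
        }
    }
    return pos;
}

/* libFuzzer hook: mutate whole calls instead of individual bytes. */
size_t LLVMFuzzerCustomMutator(uint8_t *data, size_t size, size_t maxSize,
                               unsigned int seed) {
    Call calls[MAX_CALLS + 1];                   /* +1 leaves room for one append */
    size_t n = parseCalls(data, size, calls);
    unsigned int choice = nextRand(&seed) % 3;

    if (n == 0 || choice == 0) {                 /* append a new random call */
        calls[n].op = (uint8_t)(nextRand(&seed) % 2);
        calls[n].arg = (int)(nextRand(&seed) % 256);
        n++;
    } else if (choice == 1 && n > 1) {           /* delete one call, keep at least one */
        size_t i = nextRand(&seed) % n;
        memmove(&calls[i], &calls[i + 1], (n - i - 1) * sizeof(Call));
        n--;
    } else {                                     /* change one call's argument */
        size_t i = nextRand(&seed) % n;
        calls[i].arg = (int)(nextRand(&seed) % 256);
    }

    size_t out = writeCalls(calls, n, data, maxSize);
    if (out == 0 && maxSize > 0) {               /* avoid handing back an empty input */
        data[0] = 1;                             /* fall back to a single run() */
        out = 1;
    }
    return out;
}
```

The mutator is compiled into the same binary as LLVMFuzzerTestOneInput, and the harness has to decode the same byte layout. The corpus stays binary with this approach, so it helps with mutation quality more than with readability.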