linux linux-kernel call-graph control-flow-graph

Constructing a complete control flow graph for the Linux kernel


Are there any tools that can build the control flow graph for an entire Linux kernel binary? For example, consider a Linux kernel compiled for the x86 architecture (the vmlinux file). Is it possible to determine all execution paths, including those that go through indirect calls, using a combination of static and dynamic analysis? Are there any tools suitable for this?


Solution

  • I assume you mean analyzing the source code used to produce the Linux binaries.

    Hope you are prepared for a lot of work. There are reasons you can't get this off the shelf.

    You need two kinds of tools:

    1. Machinery to construct control flow graphs for individual C source files, which works for the real dialects of C used by the Linux kernel.

    2. Something that can construct a global call graph including indirect calls; if you don't handle indirect calls well, your call graph is either ridiculously overconnected (the famous "scribble" diagram) or ridiculously underconnected (most functions won't be reachable).
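
    To make the overconnected/underconnected trade-off in item 2 concrete, here is a minimal sketch on a made-up five-function program. All function names, signatures, and the three policy names are hypothetical illustrations, not taken from the kernel or from any real tool. It compares three ways of resolving an indirect call: ignoring it, linking it to every address-taken function, and linking it only to address-taken functions whose signature matches the pointer type.

```python
# Hypothetical toy program. Each function records its own signature,
# its direct callees, and the signatures of pointers it calls through.
FUNCS = {
    "main_entry": {"sig": "void()", "direct": ["do_read"], "indirect": []},
    "do_read": {"sig": "int(file)", "direct": [], "indirect": ["int(file)"]},
    "disk_read": {"sig": "int(file)", "direct": [], "indirect": []},
    "pipe_read": {"sig": "int(file)", "direct": [], "indirect": []},
    "timer_tick": {"sig": "void(timer)", "direct": [], "indirect": []},
}

# Functions whose address is stored somewhere (assumed known for this sketch).
ADDRESS_TAKEN = {"disk_read", "pipe_read", "timer_tick"}


def build_call_graph(funcs, address_taken, policy):
    """Return {caller: set(callees)} under one indirect-call policy.

    policy is "ignore" (drop indirect calls), "any" (every address-taken
    function is a possible target), or "signature" (only address-taken
    functions whose signature matches the pointer type).
    """
    graph = {name: set(info["direct"]) for name, info in funcs.items()}
    if policy == "ignore":
        return graph
    for name, info in funcs.items():
        for ptr_sig in info["indirect"]:
            for target in address_taken:
                if policy == "any" or funcs[target]["sig"] == ptr_sig:
                    graph[name].add(target)
    return graph


def reachable(graph, root):
    """Depth-first search: the set of functions reachable from root."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


for policy in ("ignore", "any", "signature"):
    graph = build_call_graph(FUNCS, ADDRESS_TAKEN, policy)
    print(policy, sorted(reachable(graph, "main_entry")))
```

    On this toy input, "ignore" reaches only two of the five functions, "any" reaches all five (including a timer callback that a read path could never call), and "signature" reaches four. Signature matching is only one possible refinement; a real points-to analysis would be needed to do better.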

    For [1],

    For [2],

    Once you have these pieces, you can consider what you might do with the result. I can tell you that a flow graph for a million-line system will cover a football field at 1-inch resolution; you'll need serious computing power to traverse or analyze such a graph.
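
    As a rough back-of-the-envelope reading of the football-field picture, the sketch below counts how many one-inch cells fit on a field. The field dimensions (an American football field with end zones, 360 ft by 160 ft) and the idea of one graph node per square inch are assumptions made here for illustration, not figures from the answer.

```python
# Assumed dimensions: American football field including end zones.
FIELD_LENGTH_FT = 360
FIELD_WIDTH_FT = 160
INCHES_PER_FOOT = 12

# Assumed system size from the text: a million lines of source.
SOURCE_LINES = 1_000_000


def one_inch_cells(length_ft, width_ft):
    """Number of 1 in x 1 in cells covering a rectangle given in feet."""
    return (length_ft * INCHES_PER_FOOT) * (width_ft * INCHES_PER_FOOT)


cells = one_inch_cells(FIELD_LENGTH_FT, FIELD_WIDTH_FT)
print("one-inch cells on the field:", cells)
print("cells per source line:", cells / SOURCE_LINES)
```

    Under these assumptions the field holds about 8.3 million cells, so the picture corresponds to a handful of graph nodes per source line.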

    If your intent was to analyze the Linux binaries directly (in which case you'd want to process the linker modules), building the control flow graph is not nearly as bad a problem, because you don't have to deal with what amounts to most of a compiler. Now you just have to worry about the entire Intel instruction set. But if you model the machine instructions accurately, your CFG is likely to be 10x the size of one for the source code, and a whole lot less helpful in tracing any issues back to the source.
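
    For the binary route, the core of CFG construction is splitting a decoded instruction stream into basic blocks. The sketch below uses the textbook "leaders" method on a tiny made-up instruction list; the opcode names, the (op, target) tuple format, and the UNKNOWN marker are hypothetical stand-ins, since decoding real x86 is the hard part and is not attempted here. Indirect jumps get an UNKNOWN successor, which is exactly where static analysis alone runs out.

```python
UNKNOWN = "?"  # successor that cannot be resolved statically
BRANCH_OPS = {"jmp", "jcc", "jmp_indirect", "ret"}

# Hypothetical decoded program: the list index is the address,
# each entry is (opcode, branch target or None).
PROGRAM = [
    ("mov", None),           # 0
    ("jcc", 4),              # 1: conditional jump to 4
    ("add", None),           # 2
    ("jmp", 5),              # 3: unconditional jump to 5
    ("sub", None),           # 4
    ("jmp_indirect", None),  # 5: target held in a register
    ("ret", None),           # 6
]


def build_cfg(instrs):
    """Return {block_start: set(successor block starts)}."""
    if not instrs:
        return {}
    n = len(instrs)
    # Leaders: the first instruction, every branch target, and every
    # instruction that follows a branch.
    leaders = {0}
    for i, (op, target) in enumerate(instrs):
        if op in BRANCH_OPS:
            if target is not None:
                leaders.add(target)
            if i + 1 < n:
                leaders.add(i + 1)
    starts = sorted(leaders)
    edges = {}
    for k, start in enumerate(starts):
        end = starts[k + 1] if k + 1 < len(starts) else n  # exclusive
        op, target = instrs[end - 1]  # last instruction of the block
        succs = set()
        if op == "jmp":
            succs.add(target)
        elif op == "jcc":
            succs.add(target)
            if end < n:
                succs.add(end)  # fall-through edge
        elif op == "jmp_indirect":
            succs.add(UNKNOWN)
        elif op != "ret" and end < n:
            succs.add(end)  # plain fall-through into the next block
        edges[start] = succs
    return edges


for start, succs in sorted(build_cfg(PROGRAM).items()):
    print(start, "->", sorted(succs, key=str))
```

    Even in this seven-instruction toy there are five blocks, which gives some feel for why an instruction-level CFG grows so much larger than a source-level one. Resolving the UNKNOWN edges is where dynamic tracing or a pointer analysis would have to come in.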