
Machine Code based Control Flow Graph in LLVM


LLVM can readily produce Control Flow Graphs (CFGs) for its intermediate representation (IR). High-level, source-code-based CFGs are also available with little effort. I want CFGs at the level of Machine Code. Is there any way to get them?

I did a little digging. In LLVM's back-end code generation phase there is a stage called SSA-based Machine Code Optimizations, but there is not much information about it. My guess is that LLVM generates an SSA-based machine code at some intermediate stage. If such a stage exists, the code at that stage can be split into Basic Blocks, and a CFG can be built from those Basic Blocks. Can anybody point me to the source file in the LLVM source tree (possibly in lib\CodeGen) that I should look at for this? Or to the class that would let me walk through the SSA-based Machine Code and its Basic Blocks? I would appreciate any pointer.


Solution

  • I figured it out.

    You need to write a MachineFunctionPass for some target, in the lib\Target\<target architecture> folder.

    Then, inside runOnMachineFunction(MachineFunction &MF), you can view the CFG by calling MF.viewCFG(). This works in debug mode; getting the CFG in Release mode as well takes some tweaking inside viewCFG.

    You can access each MachineBasicBlock and MachineInstr by iterating over MF. For example:

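    // Inside runOnMachineFunction(MachineFunction &MF):
    int i = 0;
    // Iterating over a MachineFunction yields its MachineBasicBlocks.
    for (auto &MBB : MF) {
        errs() << "Basic Block: " << i++ << "\n\n";
        // Iterating over a MachineBasicBlock yields its MachineInstrs.
        for (auto &MI : MBB) {
            MI.print(errs(), true, false);
            errs() << "\n";
        }
    }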
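
The loop above prints the blocks but not the edges between them. If you want the graph as a file rather than through viewCFG(), one option is to write it out as Graphviz DOT yourself. Below is a minimal, LLVM-free sketch of that idea. ToyBlock and toDot are hypothetical names invented for this illustration and are not LLVM classes or functions. In a real pass, the block list would come from iterating over MF and the edges from MachineBasicBlock::successors(). The sketch also assumes the instruction text contains no double quotes, since it does no escaping.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine basic block; NOT an LLVM class.
struct ToyBlock {
    std::string Name;                 // block label, e.g. "entry"
    std::vector<std::string> Instrs;  // already-printed instruction text
    std::vector<std::size_t> Succs;   // indices of successor blocks
};

// Render the blocks and their successor edges as a Graphviz DOT digraph.
std::string toDot(const std::vector<ToyBlock> &Blocks) {
    std::ostringstream OS;
    OS << "digraph CFG {\n";

    // One box-shaped node per block, labelled with its name and instructions.
    // "\l" is DOT's left-justified line break.
    for (std::size_t I = 0; I < Blocks.size(); ++I) {
        OS << "  bb" << I << " [shape=box,label=\"" << Blocks[I].Name << ":";
        for (const std::string &Instr : Blocks[I].Instrs)
            OS << "\\l  " << Instr;
        OS << "\\l\"];\n";
    }

    // One directed edge per (block, successor) pair.
    for (std::size_t I = 0; I < Blocks.size(); ++I) {
        for (std::size_t S : Blocks[I].Succs) {
            assert(S < Blocks.size() && "successor index out of range");
            OS << "  bb" << I << " -> bb" << S << ";\n";
        }
    }

    OS << "}\n";
    return OS.str();
}
```

Writing the returned string to a file such as cfg.dot and running `dot -Tpng cfg.dot -o cfg.png` should render the graph. The nodes are emitted before the edges so that every node keeps its label even if it has no successors.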