How can I tell if two source files produce functionally identical code?


I'm using uncrustify to format a directory full of C and C++ code. I need to ensure that uncrustify won't change the resulting code. I can't diff the object files or binaries, because the object files contain a timestamp and so will never be identical. I can't check the source files one by one because I'd be here for years.

The project uses make for the build process so I was wondering if there is some way to output something there that could be checked.

I've searched SO and Google to no avail, so my apologies if this is a duplicate.

EDIT: I'm using gcc/g++ and compiling for 32 bit.


Solution

  • One possibility would be to compile them with Clang and get the output as LLVM IR. If memory serves, the command-line arguments for this are -S -emit-llvm.
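A minimal shell sketch of that Clang comparison follows. The helper names `emit_ir`, `strip_ir_header` and `same_ir` are made up for illustration, and the sketch assumes clang++ is installed. The generated .ll file records the source file name in its `; ModuleID` and `source_filename` lines, so those two lines are dropped before comparing; anything else that differs is treated as a real difference.

```shell
#!/bin/sh
# Hypothetical helpers: compare the LLVM IR of an original and a reformatted file.

# emit_ir SRC OUT: write textual LLVM IR for SRC to OUT.
emit_ir() {
    clang++ -S -emit-llvm "$1" -o "$2"
}

# strip_ir_header FILE: print FILE without the lines that name the source file.
strip_ir_header() {
    grep -v -e '^; ModuleID = ' -e '^source_filename = ' "$1"
}

# same_ir A.ll B.ll: succeed only if the two IR files match once the
# file-name lines are removed.
same_ir() {
    tmp_a=$(mktemp) || return 2
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 2; }
    strip_ir_header "$1" > "$tmp_a"
    strip_ir_header "$2" > "$tmp_b"
    cmp -s "$tmp_a" "$tmp_b"
    status=$?
    rm -f "$tmp_a" "$tmp_b"
    return $status
}
```

With these defined, running `emit_ir` on the original and on the reformatted copy and then calling `same_ir` on the two .ll files gives an exit status that a script can check.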

    To do the same with gcc/g++, you can use one of its flags to generate a file containing its intermediate representation at some stage of compilation. Early stages will still show differences from changes in white space and such, but a quick test indicates that by the SSA stage, such non-operational changes have disappeared from the IR.

    g++ -c -fdump-tree-ssa foo.cpp
    

    In addition to the normal object file, this will produce a file named something like foo.cpp.018t.ssa (the pass number in the name varies between GCC versions) that represents the semantic actions in your source file.

    As noted above, this is based only on a quick test; I haven't tested it extensively, so it's possible that at this stage some non-operational changes will still produce different output files (though I kind of doubt it). If necessary, you can use -fdump-tree-all to get output from all stages of compilation [1]. As a simple rule of thumb, I'd expect later stages to be more immune to changes in formatting and such, so if the ssa stage doesn't work, my next choice would probably be the optimized stage, which is one of the last stages. The files produced are numbered in order of the stage that produced each one, so when you dump all stages, it's obvious which come from early stages and which from later ones.


    1. Note that this produces quite a few files, many of them quite large. The first time you do this, you probably want to do it on a single source file in a directory by itself to keep from drowning in files, so to speak. Also, don't be surprised when compilation this way takes quite a bit longer than normal.
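
The GCC comparison can be scripted along the same lines. This is a minimal shell sketch, not a tested workflow. The helper names `dump_tree` and `same_dump` are made up for illustration. It assumes the original and reformatted copies of a file keep the same base name, that compiling from inside an output directory makes the dump land in that directory, and that CXXFLAGS holds whatever flags the real build uses (for example -m32 for a 32-bit target). Relative -I paths in CXXFLAGS would need adjusting, since the compiler runs from a different directory.

```shell
#!/bin/sh
# Hypothetical helpers: compare GCC tree dumps before and after reformatting.

# dump_tree SRC OUTDIR [STAGE]
# Compile SRC from inside OUTDIR so the object file and dump are written there.
# STAGE defaults to "ssa"; "optimized" is the later stage suggested as a fallback.
dump_tree() {
    src_dir=$(cd "$(dirname "$1")" && pwd) || return 1
    src="$src_dir/$(basename "$1")"
    mkdir -p "$2" || return 1
    # CXXFLAGS is left unquoted on purpose so it splits into separate flags.
    ( cd "$2" && g++ $CXXFLAGS -c "-fdump-tree-${3:-ssa}" "$src" )
}

# same_dump DIR_A DIR_B BASENAME [STAGE]
# Succeed only if every matching dump in DIR_A has a byte-identical
# file of the same name in DIR_B. Fails if no dump is found at all.
same_dump() {
    for a in "$1/$3".*."${4:-ssa}"; do
        cmp -s "$a" "$2/${a##*/}" || return 1
    done
}
```

For example, with the original tree in orig/ and the uncrustify output in formatted/ (hypothetical directory names), `dump_tree orig/foo.cpp out/orig` followed by `dump_tree formatted/foo.cpp out/formatted` and `same_dump out/orig out/formatted foo.cpp` succeeds only when the dumps match. Code that uses `__LINE__`, including `assert`, can legitimately differ if reformatting moves a statement to another line.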