If the loops are of different types, I can easily tell them apart by name. But if there are several loops of the same type (say, five while loops), how can I identify which basic block in the LLVM IR corresponds to which loop in the source code?
Manually this is easy, since the source code and the LLVM IR can be read side by side in the same order, but I am looking for a way to do it programmatically.
For example, I have the following C source code:
int main()
{
    int count = 1;
    while (count <= 4)
    {
        count++;
    }
    while (count > 4)
    {
        count--;
    }
    return 0;
}
When I run the command clang -S -emit-llvm fileName.c, I get a file fileName.ll with the following content:
; ModuleID = 'abc.c'
source_filename = "abc.c"
target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-windows-msvc19.0.23026"
; Function Attrs: noinline nounwind uwtable
define i32 @main() #0 {
entry:
%retval = alloca i32, align 4
%count = alloca i32, align 4
store i32 0, i32* %retval, align 4
store i32 1, i32* %count, align 4
br label %while.cond
while.cond: ; preds = %while.body, %entry
%0 = load i32, i32* %count, align 4
%cmp = icmp sle i32 %0, 4
br i1 %cmp, label %while.body, label %while.end
while.body: ; preds = %while.cond
%1 = load i32, i32* %count, align 4
%inc = add nsw i32 %1, 1
store i32 %inc, i32* %count, align 4
br label %while.cond
while.end: ; preds = %while.cond
br label %while.cond1
while.cond1: ; preds = %while.body3, %while.end
%2 = load i32, i32* %count, align 4
%cmp2 = icmp sgt i32 %2, 4
br i1 %cmp2, label %while.body3, label %while.end4
while.body3: ; preds = %while.cond1
%3 = load i32, i32* %count, align 4
%dec = add nsw i32 %3, -1
store i32 %dec, i32* %count, align 4
br label %while.cond1
while.end4: ; preds = %while.cond1
ret i32 0
}
attributes #0 = { noinline nounwind uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.module.flags = !{!0}
!llvm.ident = !{!1}
!0 = !{i32 1, !"PIC Level", i32 2}
!1 = !{!"clang version 4.0.0 (tags/RELEASE_400/final)"}
The IR now contains two loop condition blocks, while.cond and while.cond1. How can I identify which basic block belongs to which while loop in the source code?
Before I attempt to answer, note that depending on the selected optimization level, or on the passes manually selected with opt, that information might be missing or less accurate (e.g. because of inlining, cloning, etc.).
The way to associate low-level representations with source code is debugging information (e.g. in the DWARF format). To produce it, pass the -g command-line flag during compilation (e.g. clang -g -S -emit-llvm fileName.c).
For LLVM IR, the Loop API has relevant calls such as getStartLoc. So you could do something like this (e.g. inside the runOnFunction method of an llvm::FunctionPass):
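// Requires AU.addRequired<llvm::LoopInfoWrapperPass>() in getAnalysisUsage().
auto &LI = getAnalysis<llvm::LoopInfoWrapperPass>().getLoopInfo();
// Seed the work list with the top-level loops, then visit subloops too.
llvm::SmallVector<llvm::Loop *, 8> workList(LI.begin(), LI.end());
while (!workList.empty()) {
    llvm::Loop *L = workList.pop_back_val();
    workList.append(L->begin(), L->end());
    llvm::DebugLoc loc = L->getStartLoc();
    if (!loc)
        continue; // no debug info for this loop (was the file compiled with -g?)
    unsigned line = loc.getLine();
    auto *scope = llvm::cast<llvm::DIScope>(loc.getScope());
    llvm::StringRef filename = scope->getFilename();
    // do stuff here, e.g. pair L->getHeader()->getName() with filename and line
}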
Moreover, for a BasicBlock you can use the debug-related methods of Instruction (e.g. getDebugLoc) and combine them with calls to other Loop methods such as getHeader.
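The same block-to-line association can also be read from the textual IR itself, without writing a pass. The following is a minimal sketch, not an LLVM API: blockLines is a hypothetical helper that uses only the C++ standard library. It assumes the .ll file was produced with -g, so that instructions carry a !dbg !N attachment and each referenced node is printed on one line as !N = !DILocation(line: ...). It also assumes block labels are unique in the input (e.g. a single function) and that blocks have textual labels as in the listing above.

```cpp
#include <map>
#include <regex>
#include <sstream>
#include <string>

// Hypothetical helper: maps each labelled basic block in textual LLVM IR to
// the source line of the first instruction in that block that has a !dbg
// attachment. Blocks without any !dbg attachment are left out of the result.
std::map<std::string, unsigned> blockLines(const std::string &ir) {
  static const std::regex locRe(R"(^!(\d+) = !DILocation\(line: (\d+))");
  static const std::regex labelRe(R"(^([A-Za-z0-9._$-]+):)");
  static const std::regex dbgRe(R"(!dbg !(\d+))");

  std::map<std::string, unsigned> lineOfNode;   // metadata id -> source line
  std::map<std::string, std::string> firstDbg;  // block label -> metadata id
  std::string current, text;
  std::istringstream in(ir);
  std::smatch m;

  while (std::getline(in, text)) {
    if (std::regex_search(text, m, locRe)) {
      // Metadata definitions come after the function bodies, so they are
      // collected here and resolved once the whole input has been read.
      lineOfNode[m[1].str()] = static_cast<unsigned>(std::stoul(m[2].str()));
    } else if (text.rfind("define ", 0) == 0) {
      current.clear();  // new function: no labelled block seen yet
    } else if (std::regex_search(text, m, labelRe)) {
      current = m[1].str();
    } else if (!current.empty() && !firstDbg.count(current) &&
               std::regex_search(text, m, dbgRe)) {
      firstDbg[current] = m[1].str();
    }
  }

  std::map<std::string, unsigned> result;
  for (const auto &entry : firstDbg) {
    auto it = lineOfNode.find(entry.second);
    if (it != lineOfNode.end())
      result[entry.first] = it->second;
  }
  return result;
}
```

Calling blockLines on the contents of the .ll file and looking up while.cond and while.cond1 in the returned map gives the source line recorded for each loop condition block. Compiler-generated instructions can carry line 0, so a more robust version would skip those and take the first non-zero line.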
Also, note that there is a getLoopID method that returns an internal unique ID for each loop (stored as metadata), but that ID is not always present and it is subject to the potential elisions mentioned at the start. If you need to manipulate it, look at how the LLVM source uses the setLoopID method (e.g. in lib/Transforms/Scalar/LoopRotation.cpp).