I'm writing a clang static analyzer checker for a pair of functions that save the passed argument value and return it:
void set(const int& value);
const int& get();
The real set implementation saves the passed value to an internal variable of type int; get returns a reference to this variable.
This is my implementation:
#include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
#include "clang/StaticAnalyzer/Core/Checker.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
using namespace clang;
using namespace ento;
namespace {
class checkerTest : public Checker<eval::Call> {
  bool handleSet(CheckerContext &C, const CallEvent &Call) const;
  bool handleGet(CheckerContext &C, const CallEvent &Call) const;

public:
  bool evalCall(const CallEvent &Call, CheckerContext &C) const;
  using FnHandler = bool (checkerTest::*)(CheckerContext &, const CallEvent &Call) const;
  CallDescriptionMap<FnHandler> Functions = {
      {{{"set"}, 1}, &checkerTest::handleSet},
      {{{"get"}, 0}, &checkerTest::handleGet},
  };
};
} // namespace
SVal g_value;
bool checkerTest::handleSet(CheckerContext &C, const CallEvent &Call) const {
  SVal location = Call.getArgSVal(0);
  QualType LoadTy = Call.getArgExpr(0)->getType();
  ProgramStateRef State = C.getState();
  SVal Value = State->getSVal(location.castAs<Loc>(), LoadTy);
  C.addTransition(State);
  g_value = Value;
  return true;
}
bool checkerTest::handleGet(CheckerContext &C, const CallEvent &Call) const {
  ProgramStateRef State = C.getState();
  State = State->BindExpr(Call.getOriginExpr(), C.getLocationContext(), g_value);
  C.addTransition(State);
  return true;
}
bool checkerTest::evalCall(const CallEvent &Call, CheckerContext &C) const {
  const FnHandler *Handler = Functions.lookup(Call);
  if (Handler) {
    return (this->**Handler)(C, Call);
  }
  return false;
}
void ento::registercheckerTest(CheckerManager &mgr) {
  mgr.registerChecker<checkerTest>();
}
bool ento::shouldRegistercheckerTest(const CheckerManager &mgr) {
  if (mgr.getLangOpts().CPlusPlus)
    return true;
  return false;
}
The set handler reads the argument value and saves it to a global variable; the get handler binds this value to the call expression.
I know that the right way to store g_value is in a special map, but for testing purposes I use a global variable.
On this simple test:
void set(const int& value);
const int& get();
int main()
{
  set(0);
  int res = get();
  return res;
}
clang crashes due to an assertion in evalLoad:
Assertion `!isa<NonLoc>(location) && "location cannot be a NonLoc."' failed.
I think that the engine expects the return value to be a reference, location of a symbol, and g_value is the value itself.
How to get the location of the returned value and return it?
I used clang-16.
The assertion is triggered because the line:
State = State->BindExpr(Call.getOriginExpr(), C.getLocationContext(), g_value);
binds an integer value (zero) to the call expression get(), whose result is a reference. ExprEngine sees that the call to get() is subject to lvalue-to-rvalue conversion and therefore expects that, if the expression is bound, it is bound to a semantic value that represents a location (not an integer), since lvalues are represented by locations.
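To make the value-versus-location distinction concrete, here is a toy model in plain C++. It is only an illustration and does not use the analyzer's real classes: `ToyLoc`, `ToyInt`, `ToyVal`, `ToyStore`, and `load` are hypothetical names invented for this sketch. `load` plays the role of the lvalue-to-rvalue conversion and refuses anything that is not a location, which is roughly the condition the real assertion checks.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <variant>

// Toy stand-ins for the analyzer's value kinds (hypothetical, not Clang API).
struct ToyLoc { int id; };             // names a memory location
struct ToyInt { std::int64_t value; }; // a plain integer, i.e. a "NonLoc"
using ToyVal = std::variant<ToyLoc, ToyInt>;

// Toy store: maps location ids to the integer stored there.
using ToyStore = std::map<int, std::int64_t>;

// Models the lvalue-to-rvalue conversion: only a location can be loaded from.
// Returns std::nullopt where the real engine would hit its assertion.
std::optional<std::int64_t> load(const ToyStore &store, const ToyVal &v) {
  const ToyLoc *loc = std::get_if<ToyLoc>(&v);
  if (!loc)
    return std::nullopt; // a NonLoc was supplied where a location is required
  auto it = store.find(loc->id);
  if (it == store.end())
    return std::nullopt; // nothing is bound at this location
  return it->second;
}
```

In this model, binding `ToyInt{0}` to the call expression corresponds to the failing code, while binding a `ToyLoc` whose store entry holds 0 corresponds to the fix.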
How to get the location of the returned value and return it?
The location of the returned value should be a MemRegionVal referring to a SymbolicRegion. This way it names an abstract location representing whatever the returned reference refers to.
The symbol, represented by a SymExpr, could be a fresh ("conjured") one for each call site, and that is the simplest approach, but it means that the analysis will treat each call as returning a reference to a different memory location, which might not be what you want.
Nevertheless, for the simple approach, code to do it is:
// The call expression whose value is being modeled.
const Expr *expr = Call.getOriginExpr();
// Get the type, preserving reference-ness.
QualType resultType = Call.getResultType();
// Conjure a symbolic memory region to represent what the reference
// points to.
SVal getReferentVal =
    C.getSValBuilder().getConjuredHeapSymbolVal(
        expr,
        C.getLocationContext(),
        resultType,
        C.blockCount());
ProgramStateRef State = C.getState();
// Bind the call expression to the fresh location.
State = State->BindExpr(expr, C.getLocationContext(), getReferentVal);
// Bind the location to the previously saved value.
State = State->bindLoc(getReferentVal.castAs<Loc>(), g_value, C.getLocationContext());
// Update the abstract state with those new bindings.
C.addTransition(State);
This uses getConjuredHeapSymbolVal to obtain a fresh symbolic memory region that (as it happens) is assumed to be somewhere on the heap. I'm using the four-argument overload since the three-argument form would fail an assertion related to the type of the expression. The getResultType() call ensures resultType is a ReferenceType, whereas the reference-ness is lost when calling expr->getType().
Then, we bind the call expression to the conjured location, and in turn bind the location to the value that was previously stashed in g_value. Consequently, it can be retrieved from State by calling getSVal on the expression, and then again on the returned location.
To improve the accuracy, you probably want to create a fresh location
only if the expression is not already bound to one, since that will
model the case where multiple calls to get()
return the same thing.
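The "conjure only once" idea can be sketched with a toy model in plain C++. This is not analyzer code: `ToyState` and `referentOfGet` are hypothetical names, and the mutable struct is a simplification, since the analyzer's ProgramState is immutable and a real checker would keep this information in checker-specific program state rather than in a plain object.

```cpp
#include <cassert>
#include <optional>

// Hypothetical toy state (not Clang API): remembers which location get()
// refers to once one has been handed out.
struct ToyState {
  std::optional<int> referent; // location id previously handed out, if any
  int nextFreshId = 0;         // counter used to "conjure" new location ids
};

// Returns the location that get() refers to, conjuring one only on first use.
int referentOfGet(ToyState &state) {
  if (!state.referent)
    state.referent = state.nextFreshId++; // first call: create a fresh location
  assert(state.referent.has_value());     // every later call reuses it
  return *state.referent;
}
```

With this shape, two consecutive calls to `referentOfGet` on the same state return the same location id, which models successive calls to get() returning a reference to the same object.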