I'm writing a clang static analyzer checker for this class:
class A {
  int member_;

public:
  void set(const int& value);
  const int& get();
};
The real set implementation saves the passed value to an internal variable of type int; get returns a reference to this variable.
This is my implementation:
#include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
#include "clang/StaticAnalyzer/Core/Checker.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
using namespace clang;
using namespace ento;
namespace {
class checkerObjTest : public Checker<eval::Call> {
  bool handleSet(CheckerContext &C, const CallEvent &Call) const;

public:
  bool evalCall(const CallEvent &Call, CheckerContext &C) const;
  using FnHandler = bool (checkerObjTest::*)(CheckerContext &,
                                             const CallEvent &Call) const;
  CallDescriptionMap<FnHandler> Functions = {
      {{{"set"}, 1}, &checkerObjTest::handleSet},
  };
};
} // namespace
bool checkerObjTest::handleSet(CheckerContext &C, const CallEvent &Call) const {
  const CallExpr *CE = dyn_cast_or_null<CallExpr>(Call.getOriginExpr());
  if (!CE)
    return false;
  const CXXInstanceCall *InstCall = dyn_cast<CXXInstanceCall>(&Call);
  if (!InstCall)
    return false;
  // Conversion to CXXThisExpr returns null in my example, conversion to
  // Expr returns a valid pointer.
  // const CXXThisExpr *TE =
  //     dyn_cast_or_null<CXXThisExpr>(InstCall->getCXXThisExpr());
  const Expr *TE = dyn_cast_or_null<Expr>(InstCall->getCXXThisExpr());
  if (!TE)
    return false;
  unsigned Count = C.blockCount();
  SValBuilder &svalBuilder = C.getSValBuilder();
  const LocationContext *LCtx = C.getPredecessor()->getLocationContext();
  // Create memory region for object data
  // Both of these calls trigger a Loc::isLocType(type) assertion.
  DefinedSVal innerDataVal =
      svalBuilder.getConjuredHeapSymbolVal(TE, LCtx, Count).castAs<DefinedSVal>();
  // DefinedSVal innerDataVal =
  //     svalBuilder.getConjuredHeapSymbolVal(CE, LCtx, Count).castAs<DefinedSVal>();
  return true;
}
bool checkerObjTest::evalCall(const CallEvent &Call, CheckerContext &C) const {
  const FnHandler *Handler = Functions.lookup(Call);
  if (Handler) {
    return (this->**Handler)(C, Call);
  }
  return false;
}
void ento::registercheckerObjTest(CheckerManager &mgr) {
  mgr.registerChecker<checkerObjTest>();
}
bool ento::shouldRegistercheckerObjTest(const CheckerManager &mgr) {
  if (mgr.getLangOpts().CPlusPlus)
    return true;
  return false;
}
On this simple test
class A {
  int member_;

public:
  void set(const int& value);
  const int& get();
};
int main()
{
  A a;
  int data = 0;
  a.set(data);
  int res = a.get();
  return res;
}
clang crashes due to an assert in SValBuilder::getConjuredHeapSymbolVal:
Assertion `Loc::isLocType(type)' failed.
It looks like the engine expects the first argument to getConjuredHeapSymbolVal to be a location. I tried the expressions for the call and for this; both trigger the assertion. I expected that at least this would represent a location, but it does not.
What location should I use in this example, or is it possible to create a heap symbol without passing a location?
I tried to model object members as a heap region with elements at given offsets, similar to how malloc and array accesses to allocated memory work. Maybe this approach is wrong for modeling the internal state of objects, and there is a better one?
I used clang-16.
The reason for the failed assertion:
Assertion `Loc::isLocType(type)' failed.
is that, in the call:
DefinedSVal innerDataVal =
svalBuilder.getConjuredHeapSymbolVal(TE, LCtx, Count).castAs<DefinedSVal>();
the argument TE is the expression a, of type class A, but
getConjuredHeapSymbolVal insists that it be given an argument of
pointer or reference type, since it returns a semantic value denoting
a storage location to which such a pointer or reference refers.
This is perhaps surprising because TE is set on this line:
const Expr *TE = dyn_cast_or_null<Expr>(InstCall->getCXXThisExpr());
and the this expression in the C++ language is a pointer. But if we
dig into getCXXThisExpr:
const Expr *CXXMemberCall::getCXXThisExpr() const {
  return getOriginExpr()->getImplicitObjectArgument();
}
and then getImplicitObjectArgument:
Expr *CXXMemberCallExpr::getImplicitObjectArgument() const {
  const Expr *Callee = getCallee()->IgnoreParens();
  if (const auto *MemExpr = dyn_cast<MemberExpr>(Callee))
    return MemExpr->getBase();
  ...
}
we see that it returns (in this case) MemberExpr::getBase(), which is
the left-hand side of the member access expression a.set, hence just
a. The getCXXThisExpr method has an unfortunately misleading name.
There are a couple of ways to address this, including making a pointer type
out of the type of TE and then using the four-argument overload of
getConjuredHeapSymbolVal (which allows a type to be separately
specified), but the simplest is to use CXXInstanceCall::getCXXThisVal(),
which returns a semantic value for the location pointed to by this if
one is known:
DefinedSVal innerDataVal = InstCall->getCXXThisVal().castAs<DefinedSVal>();
For this example, it returns a semantic value referring to the location
of the local variable a.
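As a rough runtime analogue (plain C++, not analyzer code), the value of this during a member call on the local a is simply the address of a. The sketch below assumes trivial bodies for set and get and adds a hypothetical self() helper that is not part of the original class.

```cpp
// Class A from the question, with trivial bodies assumed for set/get.
class A {
  int member_ = 0;

public:
  void set(const int &value) { member_ = value; }
  const int &get() { return member_; }
  // Hypothetical helper, not in the original class: returns `this`.
  const A *self() const { return this; }
};

// For a member call on the local `a`, `this` is the address of `a`.
bool thisIsAddressOfLocal() {
  A a;
  a.set(0);
  return a.self() == &a;
}
```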
However, from reading the code, it appears getCXXThisVal() can
return an undefined value if the object expression hasn't already been
assigned a location, so depending on the goals of the checker, it may
be necessary to conjure one in some cases. To do so, as mentioned
above, make sure the expression passed has pointer type, or that you
separately pass a pointer or reference type to the four-argument
overload. This answer of mine, to a related question, shows how to do
the latter in more detail.