I want to find setter and getter methods in a code base using a Clang AST matcher expression. For example, this code has one getter and one setter to report:
struct S {
  int m_x;
  int getX()         // getter
  {
    return m_x;
  }
  void setX(int x)   // setter
  {
    m_x = x;
  }
};
For both kinds of method, I want to check that the body has a single statement. A getter body should be a return statement whose return value is a class member. A setter body should be an assignment statement that assigns the passed parameter to a class member. (This may not be a sufficiently tolerant set of criteria in practice, but I'd be satisfied with this as a first approximation.)
I found that I can use the cxxMethodDecl matcher to find all methods, but it's not clear how to dig into the bodies or to check the various properties. None of the examples in the documentation (linked above) do this; the closest seems to be the example:
cxxConstructorDecl(
  isCopyConstructor()
).bind("prepend_explicit")
but that appears to rely on the existence of the primitive matcher isCopyConstructor to classify the method body (or perhaps just its signature? the documentation does not say what any of them actually do), and there is no isSetter or similar.
How can I write a match expression to find all the setters and getters in a translation unit?
I interpret the question as asking for a Clang AST matcher that will report setters and getters. Such a matcher can be tested at the command line using clang-query.
The following shell script contains a match expression that will find at least some cases of such functions (depending on exactly how they are written). The comments explain what each part does, so it should be feasible to adjust as needed:
#!/bin/sh

PATH=/d/opt/clang+llvm-18.1.8-msvc/bin:$PATH

matcher='
  cxxMethodDecl(                          # Report C++ method declarations
    hasBody(                              # where the body
      compoundStmt(                       # is a compound statement
        statementCountIs(1),              # with one contained statement
        hasAnySubstatement(               # that
          anyOf(                          # is either:
            returnStmt(                   # (1) a return statement
              hasReturnValue(             # whose return value
                implicitCastExpr(         # is an implicit conversion
                  hasSourceExpression(    # of
                    memberExpr(           # a class member
                      hasObjectExpression(    # of the object
                        cxxThisExpr()         # `*this`, or
                      )
                    )
                  )
                )
              )
            ),
            binaryOperator(               # (2) is a binary expression
              isAssignmentOperator(),     # using the assignment operator
              hasLHS(                     # where the left-hand side
                memberExpr(               # is a class member
                  hasObjectExpression(    # of the object
                    cxxThisExpr()         # `*this`, and
                  )
                )
              ),
              hasRHS(                     # where the right-hand side
                implicitCastExpr(         # is an implicit conversion
                  hasSourceExpression(    # of
                    declRefExpr(          # a reference to a declaration
                      hasDeclaration(     # of
                        parmVarDecl()     # a parameter.
                      )
                    )
                  )
                )
              )
            )
          )
        )
      )
    )
  )
'

clang-query \
  -c "m $matcher" \
  test.cc --

# EOF
To test this matcher, I used this test case:
// test.cc
// Testcases for a matcher to find accessor methods.

struct S {
  int m_x;

  int getX()
  {
    return m_x;
  }

  void setX(int x)
  {
    m_x = x;
  }
};

// EOF
which has this AST:
$ clang -fsyntax-only -Xclang -ast-dump test.cc
TranslationUnitDecl 0x23df46300a0 <<invalid sloc>> <invalid sloc>
|-CXXRecordDecl 0x23df4630900 <<invalid sloc>> <invalid sloc> implicit struct _GUID
| `-TypeVisibilityAttr 0x23df46309b0 <<invalid sloc>> Implicit Default
|-TypedefDecl 0x23df4630a28 <<invalid sloc>> <invalid sloc> implicit __int128_t '__int128'
| `-BuiltinType 0x23df4630670 '__int128'
|-TypedefDecl 0x23df4630a98 <<invalid sloc>> <invalid sloc> implicit __uint128_t 'unsigned __int128'
| `-BuiltinType 0x23df4630690 'unsigned __int128'
|-TypedefDecl 0x23df4630e40 <<invalid sloc>> <invalid sloc> implicit __NSConstantString '__NSConstantString_tag'
| `-RecordType 0x23df4630b80 '__NSConstantString_tag'
|   `-CXXRecord 0x23df4630af0 '__NSConstantString_tag'
|-CXXRecordDecl 0x23df4630e98 <<invalid sloc>> <invalid sloc> implicit class type_info
| `-TypeVisibilityAttr 0x23df4630f50 <<invalid sloc>> Implicit Default
|-TypedefDecl 0x23df4630fc8 <<invalid sloc>> <invalid sloc> implicit size_t 'unsigned long long'
| `-BuiltinType 0x23df4630290 'unsigned long long'
|-TypedefDecl 0x23df466b568 <<invalid sloc>> <invalid sloc> implicit __builtin_ms_va_list 'char *'
| `-PointerType 0x23df4631020 'char *'
|   `-BuiltinType 0x23df4630150 'char'
|-TypedefDecl 0x23df466b5d8 <<invalid sloc>> <invalid sloc> implicit __builtin_va_list 'char *'
| `-PointerType 0x23df4631020 'char *'
|   `-BuiltinType 0x23df4630150 'char'
`-CXXRecordDecl 0x23df466b630 <test.cc:4:1, line:16:1> line:4:8 struct S definition
  |-DefinitionData pass_in_registers aggregate standard_layout trivially_copyable pod trivial literal
  | |-DefaultConstructor exists trivial needs_implicit
  | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
  | |-MoveConstructor exists simple trivial needs_implicit
  | |-CopyAssignment simple trivial has_const_param needs_implicit implicit_has_const_param
  | |-MoveAssignment exists simple trivial needs_implicit
  | `-Destructor simple irrelevant trivial needs_implicit
  |-CXXRecordDecl 0x23df466b748 <col:1, col:8> col:8 implicit struct S
  |-FieldDecl 0x23df466b7f0 <line:5:3, col:7> col:7 referenced m_x 'int'
  |-CXXMethodDecl 0x23df466b8d8 <line:7:3, line:10:3> line:7:7 getX 'int ()' implicit-inline
  | `-CompoundStmt 0x23df466bba0 <line:8:3, line:10:3>
  |   `-ReturnStmt 0x23df466bb90 <line:9:5, col:12>
  |     `-ImplicitCastExpr 0x23df466bb78 <col:12> 'int' <LValueToRValue>
  |       `-MemberExpr 0x23df466bb48 <col:12> 'int' lvalue ->m_x 0x23df466b7f0
  |         `-CXXThisExpr 0x23df466bb38 <col:12> 'S *' implicit this
  `-CXXMethodDecl 0x23df466ba70 <line:12:3, line:15:3> line:12:8 setX 'void (int)' implicit-inline
    |-ParmVarDecl 0x23df466b998 <col:13, col:17> col:17 used x 'int'
    `-CompoundStmt 0x23df466bc98 <line:13:3, line:15:3>
      `-BinaryOperator 0x23df466bc78 <line:14:5, col:11> 'int' lvalue '='
        |-MemberExpr 0x23df466bc10 <col:5> 'int' lvalue ->m_x 0x23df466b7f0
        | `-CXXThisExpr 0x23df466bc00 <col:5> 'S *' implicit this
        `-ImplicitCastExpr 0x23df466bc60 <col:11> 'int' <LValueToRValue>
          `-DeclRefExpr 0x23df466bc40 <col:11> 'int' lvalue ParmVar 0x23df466b998 'x' 'int'
The script produces this output:
Match #1:

$PWD\test.cc:7:3: note: "root" binds here
    7 |   int getX()
      |   ^~~~~~~~~~
    8 |   {
      |   ~
    9 |     return m_x;
      |     ~~~~~~~~~~~
   10 |   }
      |   ~

Match #2:

$PWD\test.cc:12:3: note: "root" binds here
   12 |   void setX(int x)
      |   ^~~~~~~~~~~~~~~~
   13 |   {
      |   ~
   14 |     m_x = x;
      |     ~~~~~~~~
   15 |   }
      |   ~
2 matches.
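Because the matcher only finds accessors written in particular ways, it can help to probe it with a few variants. The file below is a sketch of such inputs; the struct T, its members, and the file name variants.cc are my own illustrative choices and not part of the test above. I have not run clang-query on this file, so the comments describe what I would expect from how Clang usually represents each construct. Checking with -ast-dump is the way to confirm.

```cpp
// variants.cc
// Hypothetical extra inputs for probing the accessor matcher.

#include <string>

struct T {
  int m_x = 0;
  std::string m_name;

  // Same shape as getX() in test.cc, with a const qualifier added.
  // Expected to match: the return value is still an lvalue-to-rvalue
  // conversion of a member of *this.
  int getX() const { return m_x; }

  // Explicit this->. The MemberExpr still has a CXXThisExpr as its
  // object expression, so this is expected to match.
  void setX(int x) { this->m_x = x; }

  // Returns a reference, so no lvalue-to-rvalue conversion is needed.
  // Without an ImplicitCastExpr around the MemberExpr, this is expected
  // NOT to match the getter branch as written.
  int &refX() { return m_x; }

  // Assigning to a class-type member calls operator=, which Clang
  // represents as a CXXOperatorCallExpr rather than a BinaryOperator.
  // Expected NOT to match the setter branch as written.
  void setName(const std::string &name) { m_name = name; }

  // Two statements in the body, so statementCountIs(1) rules it out.
  void setXChecked(int x)
  {
    if (x < 0) x = 0;
    m_x = x;
  }
};

// EOF
```

If the reference-returning or class-typed cases matter, the matcher would need further anyOf alternatives for those AST shapes.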
The procedure for creating the matcher was to basically follow the AST dump line by line, turning each of the elements I want to match into its corresponding matcher. In some cases that is straightforward (for example, the CXXMethodDecl AST node is matched by the cxxMethodDecl matcher), while for others I had to do some text searching in the matcher reference, along with trial-and-error, to find the right combination.
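For the node names that appear in this dump, the straightforward cases follow a simple capitalization pattern: the leading capitals of the AST class name are lowercased, except for the capital that starts the next word. The helper below is a minimal sketch of that pattern; guessMatcherName is a hypothetical name of my own, not anything provided by Clang. It is only a heuristic for producing a first guess to search for in the matcher reference. Some matcher names deviate from it (for example, the matcher for CXXBoolLiteralExpr is cxxBoolLiteral), and narrowing or traversal matchers such as statementCountIs or hasBody do not correspond to a node name at all.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Heuristic sketch (an assumption, not a Clang API): guess the node matcher
// name from an AST class name as printed by -ast-dump.
std::string guessMatcherName(const std::string &astClass)
{
  // Measure the leading run of uppercase letters.
  std::size_t run = 0;
  while (run < astClass.size() &&
         std::isupper(static_cast<unsigned char>(astClass[run]))) {
    ++run;
  }

  // In "CXXMethodDecl" the run is "CXXM"; the final 'M' begins the word
  // "Method" and keeps its case.  A single leading capital, as in
  // "ReturnStmt", is simply lowercased.
  std::size_t lower = (run > 1 && run < astClass.size()) ? run - 1 : run;

  std::string out = astClass;
  for (std::size_t i = 0; i < lower; ++i) {
    out[i] = static_cast<char>(
        std::tolower(static_cast<unsigned char>(out[i])));
  }
  return out;
}
```

For example, guessMatcherName("CXXThisExpr") yields "cxxThisExpr" and guessMatcherName("ParmVarDecl") yields "parmVarDecl", which are the names used in the script above.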