Retrieving and storing metadata about C++ entities using libtooling

Disclaimer: I'm a newbie in LibTooling.

I want to retrieve metadata about all C++ entities (such as classes and class templates) from source code and store it for later processing.

I retrieve AST nodes as described in the LibTooling and LibASTMatchers tutorial:

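#include <clang/ASTMatchers/ASTMatchFinder.h>
#include <clang/ASTMatchers/ASTMatchers.h>
#include <clang/Tooling/Tooling.h>

using namespace clang::ast_matchers;
using namespace clang::tooling;

// Called once for every node bound by the matcher registered below.
struct Handler : MatchFinder::MatchCallback
{
    void run(const MatchFinder::MatchResult& result) override
    {
        if (const auto* entity = result.Nodes.getNodeAs<clang::CXXRecordDecl>("rec"))
        { /* ? */ }
    }
};

// Create parser, parse command-line args

// Match every C++ class/struct/union declaration and bind it as "rec".
auto classMatcher = cxxRecordDecl().bind("rec");
ClangTool tool { parser->getCompilations(), parser->getSourcePathList() };
Handler handler;
MatchFinder finder;
finder.addMatcher(classMatcher, &handler);
tool.run(newFrontendActionFactory(&finder).get());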

There are no problems with this part.

But I can't decide how to store the nodes. I see two options: either create my own wrappers for each entity (type, parameter, namespace, class, etc.), or store raw pointers to the <Entity>Decl nodes.

The first way seems fine, but I will have to write a lot of wrappers for what LibTooling already provides.
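To make the first option concrete, here is a minimal sketch of what such a wrapper might look like. ClassInfo and MetadataStore are hypothetical names I made up for illustration, and the choice of fields is an assumption about what metadata is wanted. The idea is that the match callback copies what it needs out of the CXXRecordDecl (for example the result of getQualifiedNameAsString()) into plain data that owns its strings, so nothing points back into the AST once the tool has finished.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical plain-data record for one class. It owns all of its data,
// so it stays valid after the Clang AST has been destroyed.
struct ClassInfo {
    std::string qualifiedName;           // e.g. "ns::Widget"
    std::string file;                    // file containing the declaration
    unsigned line = 0;                   // line of the declaration
    bool isTemplate = false;             // true for class templates
    std::vector<std::string> baseNames;  // qualified names of base classes
};

// Hypothetical store keyed by qualified name. A class seen again (for
// example from a second translation unit) overwrites the earlier entry.
class MetadataStore {
public:
    void add(ClassInfo info) {
        assert(!info.qualifiedName.empty());
        std::string key = info.qualifiedName;  // copy the key before moving info
        classes_[std::move(key)] = std::move(info);
    }

    // Returns nullptr when the class is unknown.
    const ClassInfo* find(const std::string& qualifiedName) const {
        auto it = classes_.find(qualifiedName);
        return it == classes_.end() ? nullptr : &it->second;
    }

    std::size_t size() const { return classes_.size(); }

private:
    std::unordered_map<std::string, ClassInfo> classes_;
};
```

The cost is exactly the one described above: every kind of entity needs its own record type and its own copying code.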

As for the second way: I tried it and got errors. It seems the AST context is destroyed once ClangTool::run completes, so the raw pointers to <Entity>Decl nodes are dangling by the time I want to use them.

How do those who use LibTooling usually do this? Create wrappers? Or is there some way to preserve the AST context and continue to use raw pointers to nodes? Maybe some third way?

Solution

I think you are asking how to run the Clang parser to get an AST, and then (still within the same process) analyze that AST outside of the ClangTool framework. ClangTool restricts when the analysis code can run, because it automatically destroys the ASTContext when it finishes with each translation unit. (For example, using ClangTool, it would be difficult to parse two unrelated translation units and then compare them against each other.)

If so, the main alternative method I know of is to use the ASTUnit system instead of ClangTool. ASTUnit combines an ASTContext and the AST itself (with TranslationUnitDecl as the root) into a single object. It has several static methods to create one, such as LoadFromCommandLine. Once created, the ASTUnit and its components can be used and inspected indefinitely; it's up to you (the library client) to choose when to destroy it. You can also have many of them in memory simultaneously (subject to total available memory, of course). Consequently, your "second method" of directly using Clang AST pointers within your analysis will work fine, as long as the owning ASTUnit is still alive.

There is an example that uses ASTUnit::LoadFromCommandLine in this answer of mine.

If instead your goal is to serialize the AST for analysis in a later process, ASTUnit again offers some methods to read and write the AST. But be aware that while a freshly-parsed translation unit has access to the original source code, the serialized AST does not store that source code, so it might not be available later.