I am developing a Boost.Wave application to map unexpanded macro invocations in a 'C' source file to the corresponding expanded macro text.
The mapping needs to work for both function & object-like macro invocations. In the case of function-like macros, the mapping should handle nested macro invocations in the macro arguments.
Consider the following macro definitions (either defined at the start of a 'C' file or pulled in via an included header):
#define MIN(a, b) (((a) <= (b)) ? (a) : (b))
#define MAX(a, b) (((a) > (b)) ? (a) : (b))
In the example below, we have 'C' code that invokes the MIN macro twice. Each invocation is unique, corresponding to a particular line in the source (lines 4 & 5), and both are examples of nested macro calls. The expanded macro text (with whitespace removed for clarity) is captured in the rescanned_macro callback, for example (((1)<=((((3)>(4))?(3):(4))))?(1):((((3)>(4))?(3):(4)))). I do not know how to find out which instance (and corresponding original source tokens) it is being expanded from (line 4 or 5).
1 #include "..." // definitions of MIN/MAX macros
2 int main()
3 {
4 int foo = MIN(1,MAX(3,4)); // line 4, col 15 file position (15 chars)
^-------------- // HELP: how to get this range & the corresponding source text from line 4.
5 int bar = MIN(1,MAX(3,7)); // line 5, col 15 file position (15 chars)
^-------------- // HELP: how to get this range & the corresponding source text from line 5.
6 return 0;
7 }
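As a cross-check that is independent of Wave, the expected expansion text for each invocation can be produced by the compiler's own preprocessor via two-level stringification. This is only a minimal sketch: STR, XSTR, strip_whitespace and the two expanded_line functions are illustrative names, not part of the application, and whitespace is stripped because the spacing of stringified tokens can differ between preprocessors.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

#define MIN(a, b) (((a) <= (b)) ? (a) : (b))
#define MAX(a, b) (((a) > (b)) ? (a) : (b))

// Two-level stringification: XSTR fully expands its argument first,
// then STR turns the resulting token sequence into a string literal.
#define STR(x) #x
#define XSTR(x) STR(x)

// Removes all whitespace so the result does not depend on how a
// particular preprocessor spaces the expanded tokens.
inline std::string strip_whitespace(std::string text) {
    text.erase(std::remove_if(text.begin(), text.end(),
                              [](unsigned char c) { return std::isspace(c) != 0; }),
               text.end());
    return text;
}

// Fully expanded text of the invocation on line 4 of the example.
inline std::string expanded_line4() {
    return strip_whitespace(XSTR(MIN(1, MAX(3, 4))));
}

// Fully expanded text of the invocation on line 5 of the example.
inline std::string expanded_line5() {
    return strip_whitespace(XSTR(MIN(1, MAX(3, 7))));
}
```

The two strings differ only in the literal 4 versus 7, which is why the expanded text alone is enough to tell these two invocations apart here, but not in general (two identical invocations on different lines would produce identical text).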
boost::wave allows an application to extend the default_preprocessing_hooks class and override a few methods to intercept the macro expansion process. The macro expansion process is described in the Wave documentation. The functions of interest are:
template <typename ContextT, typename TokenT, typename ContainerT>
bool expanding_object_like_macro(
ContextT const& ctx,
TokenT const& macro,
ContainerT const& macrodef,
TokenT const& macrocall);
template <typename ContextT, typename TokenT, typename ContainerT, typename IteratorT>
bool expanding_function_like_macro(
ContextT const& ctx, TokenT const& macrodef,
std::vector<TokenT> const& formal_args,
ContainerT const& definition, TokenT const& macrocall,
std::vector<ContainerT> const& arguments,
IteratorT const& seqstart, IteratorT const& seqend)
template <typename ContextT, typename ContainerT>
void expanded_macro(ContextT const &ctx, ContainerT const &result)
After the macro is fully expanded (in the case of nested macros, when ALL the arguments have been expanded), the rescanned_macro hook shown below gets called with the fully expanded macro replacement text. This is the text that I need to capture; however, I also need to know the corresponding unexpanded locations and original text from the C source file (like lines 4 & 5 in the introductory example).
This hook is particularly interesting in that it only gets called after all the macro arguments have also been fully expanded. From what I can tell, as each of the nested arguments is expanded, the expanded_macro hook is also called repeatedly while processing arguments from left to right. Unfortunately, I cannot find a way of associating the original unexpanded macro text & source file location range with the expanded text contained in the tokens parameter:
template <typename ContextT, typename ContainerT>
void rescanned_macro(ContextT const &ctx, ContainerT const &tokens)
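One idea for making that association is a stack of open invocations: push in the expanding_* hooks, pop in rescanned_macro. The sketch below is plain C++ with no Wave dependency, and everything in it (Invocation, ExpansionTracker, on_expanding, on_rescanned) is a hypothetical name. It rests on an assumption I have not verified against the Wave documentation: that the hooks fire in properly nested order, so every rescanned_macro call belongs to the most recent expanding_* call that is still open. The rescanned_macro output shown later in this question lists inner expansions before the enclosing one, which is at least consistent with that assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for one macro invocation in the source file.
struct Invocation {
    std::string name;      // macro name, e.g. "MIN"
    unsigned line = 0;     // 1-based line of the macro name token
    unsigned column = 0;   // 1-based column of the macro name token
    std::string expanded;  // filled in once the expansion is complete
};

// Hypothetical tracker: on_expanding() would be called from the
// expanding_*_macro hooks and on_rescanned() from rescanned_macro.
// ASSUMPTION: the hooks fire in properly nested order.
class ExpansionTracker {
public:
    void on_expanding(std::string name, unsigned line, unsigned column) {
        open_.push_back(Invocation{std::move(name), line, column, {}});
    }

    void on_rescanned(std::string expanded_text) {
        assert(!open_.empty() && "rescanned without a matching expanding call");
        Invocation done = std::move(open_.back());
        open_.pop_back();
        done.expanded = std::move(expanded_text);
        finished_.push_back(std::move(done));
    }

    const std::vector<Invocation>& finished() const { return finished_; }
    std::size_t open_count() const { return open_.size(); }

private:
    std::vector<Invocation> open_;      // invocations still being expanded
    std::vector<Invocation> finished_;  // completed invocations, innermost first
};
```

If the nesting assumption holds, each finished record pairs the invocation position with its fully expanded text. Invocations for which an expanding_* hook returns true (expansion skipped) should not be pushed, since no rescanned_macro call would follow to pop them.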
I based my boost::wave application on the advanced hooks example that ships with the boost::wave library.
The application uses a C source file c:\temp\test2.c as test data. This is passed in as the only argument to the application. This is a very simple file that does not include any other files.
#define TWO (2) // object like macro
#define THREE() (3) // function like macro with 0 args
#define FOUR() (4) // function like macro with 0 args
#define NUMSQUARED(x) ((x)*(x)) // function like macro with 1 arg
#define MIN(a, b) (((a) <= (b)) ? (a) : (b))
#define MAX(a, b) (((a) > (b)) ? (a) : (b))
#define FUNC_MACRO(x) ((x) + 1)
#define NESTED_MACRO(a, b) (FUNC_MACRO(a) + NUMSQUARED(b) + FUNC_MACRO(FOUR()) + TWO + THREE())
int main() {
int a = NESTED_MACRO(1, 2);
int b = MIN(1, TWO);
int c = MIN(1, 2);
int d = MIN(1, THREE());
int f = MIN(1, NUMSQUARED(3));
int g = MIN(MAX(1, 2), 3);
return 1;
}
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <tuple>
#include <vector>
#include <boost/wave.hpp>
#include <boost/wave/cpplexer/cpp_lex_token.hpp>
#include <boost/wave/cpplexer/cpp_lex_iterator.hpp>
using namespace boost::wave;
namespace fs = std::filesystem;
struct my_hooks : public context_policies::default_preprocessing_hooks {
my_hooks(const my_hooks& other) = default;
my_hooks(my_hooks&& other) noexcept = default;
my_hooks& operator=(const my_hooks& other) = default;
my_hooks& operator=(my_hooks&& other) noexcept = default;
explicit my_hooks()
: mSourcePath{}
, mCurrentMacro{}
, mExpandedMacros{}
{}
~my_hooks() {
if (!mExpandedMacros.empty()) {
std::cout
<< "-----------------------------------\n"
<< "~my_hooks printing mExpandedMacros:\n"
<< "-----------------------------------\n";
// print out all the macros
for (const auto& [key, value] : mExpandedMacros) {
const auto& [start, end, text] = value;
std::stringstream tmp;
tmp << start << " .. " << end;
std::string positionInfo = tmp.str();
std::cout << "Expanded macro: " << key << '\n';
std::cout << "Expanded range: [" << positionInfo << "]\n";
}
std::cout
<< "-----------------------------------\n";
}
}
template <typename ContextT, typename TokenT, typename ContainerT>
bool expanding_object_like_macro(
ContextT const& ctx,
TokenT const& macro,
ContainerT const& macrodef,
TokenT const& macrocall) {
mCurrentMacro = macrocall.get_value().c_str();
// only interested in macros from the current file
if (mSourcePath == fs::path(macrocall.get_position().get_file().c_str())) {
const auto& callPos = macrocall.get_position();
std::string rowCol = std::to_string(
callPos.get_line()) + ':' +
std::to_string(callPos.get_column());
const std::string key = mCurrentMacro + ":" +
fs::path(callPos.get_file().c_str()).string() +
':' + rowCol;
// adjust the ending position
auto endPos = callPos;
endPos.set_column(endPos.get_column() +
mCurrentMacro.size());
std::string expanded;
for (auto const& token : macrodef) {
expanded += token.get_value().c_str();
}
mExpandedMacros[key] = std::make_tuple(
callPos, endPos, expanded);
// continue with default processing
return false;
}
// do not process further
return true;
}
template <typename ContextT, typename TokenT, typename ContainerT, typename IteratorT>
bool expanding_function_like_macro(
ContextT const& ctx,
TokenT const& macrodef,
std::vector<TokenT> const& formal_args,
ContainerT const& definition,
TokenT const& macrocall,
std::vector<ContainerT> const& arguments,
IteratorT const& seqstart,
IteratorT const& seqend)
{
mCurrentMacro = macrocall.get_value().c_str();
// only interested in macros expanding into the current file
if (mSourcePath == fs::path(macrocall.get_position().get_file().c_str())) {
const auto& callPos = macrocall.get_position();
std::string rowCol = std::to_string(
callPos.get_line()) + ':' +
std::to_string(callPos.get_column());
const std::string key = mCurrentMacro + ":" +
fs::path(callPos.get_file().c_str()).string() +
':' + rowCol;
mExpandedMacros[key] = std::make_tuple(
callPos, seqend->get_position(), "");
// continue with default processing
return false;
}
// do not process further
return true;
}
template <typename ContextT, typename ContainerT>
void expanded_macro(ContextT const &ctx, ContainerT const &result) {
std::string expanded;
for (auto const& token : result) {
expanded += token.get_value().c_str();
}
// clean up the macro expansion text - removing
// multiple lines & extra unnecessary whitespace
std::erase(expanded, '\n');
auto end = std::unique(
expanded.begin(), expanded.end(), [](auto lhs, auto rhs) {
return (lhs == rhs) && ((lhs == ' ') || (lhs == '\t'));
});
expanded.erase(end, expanded.end());
mExpandedMacros1[mCurrentMacro] = expanded;
}
template <typename ContextT, typename ContainerT>
void rescanned_macro(ContextT const &ctx, ContainerT const &tokens) {
const auto& expansionPos = tokens.begin()->get_position();
if (mSourcePath == expansionPos.get_file().c_str()) {
std::ostringstream oss;
std::string expanded;
typename ContainerT::const_iterator prev;
const auto startPos = tokens.begin()->get_position();
for (typename ContainerT::const_iterator iter = tokens.begin();
iter != tokens.end(); ++iter) {
prev = iter;
expanded += iter->get_value().c_str();
}
const auto endPos = prev->get_position();
std::stringstream tmp;
tmp << startPos << " .. " << endPos;
std::string expandedPositionInfo = tmp.str();
// compress expanded text to a single line (removing unnecessary whitespace)
std::erase(expanded, '\n');
auto end = std::unique(
expanded.begin(), expanded.end(), [](auto lhs, auto rhs) {
return (lhs == rhs) && ((lhs == ' ') || (lhs == '\t'));
});
expanded.erase(end, expanded.end());
oss << "Expanded macro: " << expanded << '\n';
oss << "Expanded range: [" << expandedPositionInfo << "]\n";
std::cout << oss.str();
}
}
// Expansion key consists of macro name and the
// starting location where it is invoked in the sourcePath
using ExpansionInfo = std::tuple<
util::file_position_type,
util::file_position_type,
std::string>;
fs::path mSourcePath;
std::string mCurrentMacro;
std::map<std::string, ExpansionInfo> mExpandedMacros;
std::map<std::string, std::string> mExpandedMacros1;
};
/** Main entry point */
int
main(int argc, char *argv[])
{
using namespace boost::wave;
if (argc < 2) {
std::cerr << "Usage: expand_macros [input file]" << '\n';
return -1;
}
// current file position is saved for exception handling
util::file_position_type current_position;
try {
// Open and read in the specified input file.
std::ifstream instream(argv[1]);
if (!instream.is_open()) {
std::cerr
<< "Could not open input file: "
<< argv[1]
<< '\n';
return -2;
}
instream.unsetf(std::ios::skipws);
std::string instring = std::string(
std::istreambuf_iterator<char>(instream.rdbuf()),
std::istreambuf_iterator<char>());
// The template boost::wave::cpplexer::lex_token<> is
// the token type to be used by the Wave library.
using token_type = cpplexer::lex_token<>;
// The template boost::wave::cpplexer::lex_iterator<> is the
// iterator type to be used by the Wave library.
using lex_iterator_type = cpplexer::lex_iterator<token_type>;
// This is the resulting context type to use. The first template parameter
// should match the iterator type to be used during construction of the
// corresponding context object (see below).
using context_type = context<std::string::iterator, lex_iterator_type,
iteration_context_policies::load_file_to_string, my_hooks>;
// The preprocessor iterator shouldn't be constructed directly. It is
// to be generated through a wave::context<> object. This wave:context<>
// object additionally may be used to initialize and define different
// parameters of the actual preprocessing (not done here).
//
// The preprocessing of the input stream is done on the fly behind the
// scenes during iteration over the context_type::iterator_type stream.
context_type ctx (instring.begin(), instring.end(), argv[1]);
ctx.get_hooks().mSourcePath = fs::path(argv[1]);
// This is where we add the project include paths
std::vector<std::string> includePaths = {
"C:/Users/johnc/main/tcdu-cdu/include",
"C:/Users/johnc/main/tcdu-cdu/src/cdu/include"
};
// These include paths are part of the compiler toolchain, note that these
// include paths allow for either VS2022 preview or Community to be present.
// Also, the apex folder is added here as it should be on the system
// include path list.
std::vector<std::string> systemIncludePaths = {
"C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.40.33807/include",
"C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.40.33807/atlmfc/include",
"C:/Program Files/Microsoft Visual Studio/2022/Preview/VC/Tools/MSVC/14.41.33923/include",
"C:/Program Files/Microsoft Visual Studio/2022/Preview/VC/Tools/MSVC/14.41.33923/atlmfc/include",
"C:/Program Files (x86)/Windows Kits/10/Include/10.0.22621.0/ucrt",
"C:/Program Files (x86)/Windows Kits/10/Include/10.0.22621.0/shared",
"C:/Program Files (x86)/Windows Kits/10/Include/10.0.22621.0/um",
"C:/Program Files (x86)/Windows Kits/10/Include/10.0.22621.0/winrt",
"C:/Program Files (x86)/Windows Kits/10/Include/10.0.22621.0/cppwinrt",
"C:/Users/johnc/main/tcdu-cdu/include/apex"
};
// Copied from visual studio preprocessor settings.
// Not sure why RC_INVOKED is required.
std::vector<std::string> preprocessorDefines = {
"_UNICODE",
"UNICODE",
"_CRT_SECURE_NO_WARNINGS",
"WIN32_LEAN_AND_MEAN",
"UNIT_TEST=1",
"RC_INVOKED",
};
// set various options
for (const auto& next : includePaths) {
ctx.add_include_path(next.data());
}
for (const auto& next : systemIncludePaths) {
ctx.add_sysinclude_path(next.data());
}
for (const auto& next : preprocessorDefines) {
ctx.add_macro_definition(next.data());
}
ctx.set_language(boost::wave::support_cpp2a);
ctx.set_language(enable_preserve_comments(ctx.get_language()));
ctx.set_language(enable_prefer_pp_numbers(ctx.get_language()));
ctx.set_language(enable_single_line(ctx.get_language()));
// Analyze the input file, print out the preprocessed tokens
context_type::iterator_type first = ctx.begin();
context_type::iterator_type last = ctx.end();
// process all file tokens
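while (first != last) {
// remember the last position so the exception handlers can report it
current_position = first->get_position();
// do not print out preprocessed tokens
//std::cout << first->get_value();
++first;
}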
std::cout
<< "---------------------------------------\n"
<< "Context macros after processing source:\n"
<< "---------------------------------------\n";
// the context should have all the macros.
for (auto it = ctx.macro_names_begin();
it != ctx.macro_names_end(); ++it) {
typedef std::vector<context_type::token_type> parameters_type;
bool has_pars = false;
bool predef = false;
context_type::position_type pos;
parameters_type pars;
context_type::token_sequence_type def;
std::cout << "Macro: " << *it << '\n';
if (ctx.get_macro_definition(*it, has_pars, predef, pos, pars, def)) {
// if has_pars is true, you can iterate through pars to see
// parameters for function macros
// iterate through def to see the macro definition
}
}
std::cout << "---------------------------------------\n";
} catch (boost::wave::cpp_exception const& e) {
// some preprocessing error
std::cerr
<< e.file_name()
<< "(" << e.line_no() << "): "
<< e.description()
<< '\n';
return 2;
} catch (std::exception const& e) {
// Use last recognized token to retrieve the error position
std::cerr
<< current_position.get_file()
<< "(" << current_position.get_line() << "): "
<< "exception caught: " << e.what()
<< '\n';
return 3;
} catch (...) {
// use last recognized token to retrieve the error position
std::cerr
<< current_position.get_file()
<< "(" << current_position.get_line() << "): "
<< "unexpected exception caught."
<< '\n';
return 4;
}
return 0;
}
Since I am only interested in macro expansions originating from the file passed to the application, I added a std::filesystem::path member (mSourcePath) to the my_hooks class. While the tokens from the source file are processed, the hooks mentioned earlier get called at various times.
The expanding_function_like_macro hook shows how I handle function-like macros. The callback is called multiple times while expanding a single nested function-like macro (like the MIN/MAX example shown earlier). Once the last argument has been expanded (by again calling expanding_function_like_macro or expanding_object_like_macro), the rescanned_macro callback is finally called with the entire nested macro expansion in the tokens argument.
As macros are being expanded, I track each expansion by a combination of the macro name and the location in the source where it is invoked. This is the key that I use in the mExpandedMacros member.
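One side effect of a concatenated string key is that a std::map sorts it lexically, so line 14 sorts before line 8 (visible for the NUMSQUARED entries in the destructor output below). A structured key avoids that. The following is only a sketch of the idea; InvocationKey and format_key are hypothetical names that do not appear in the program above.

```cpp
#include <string>
#include <tuple>

// Hypothetical structured alternative to a concatenated string key.
struct InvocationKey {
    std::string macro;  // macro name
    std::string file;   // file in which the macro is invoked
    unsigned line;      // 1-based line of the invocation
    unsigned column;    // 1-based column of the invocation

    // Orders by file, then position, then name, so that iterating a
    // std::map keyed by InvocationKey visits invocations in source order.
    friend bool operator<(const InvocationKey& a, const InvocationKey& b) {
        return std::tie(a.file, a.line, a.column, a.macro) <
               std::tie(b.file, b.line, b.column, b.macro);
    }
};

// Renders the key in the same "name:file:line:column" form as the string key.
inline std::string format_key(const InvocationKey& k) {
    return k.macro + ':' + k.file + ':' + std::to_string(k.line) + ':' +
           std::to_string(k.column);
}
```

With this, std::map<InvocationKey, ExpansionInfo> could replace the string-keyed map, and format_key would only be needed when printing.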
The program output is in 3 parts:
1. The macro expansions printed by rescanned_macro while the macros are being expanded.
2. The macros stored in the context after processing the file.
3. The contents of mExpandedMacros, printed by the my_hooks destructor.
Expanded macro: ((1) + 1)
Expanded range: [C:/temp/test2.c:7:33 .. C:/temp/test2.c:7:41]
Expanded macro: (( 2)*( 2))
Expanded range: [C:/temp/test2.c:4:33 .. C:/temp/test2.c:4:41]
Expanded macro: (4)
Expanded range: [C:/temp/test2.c:3:33 .. C:/temp/test2.c:3:35]
Expanded macro: (((4)) + 1)
Expanded range: [C:/temp/test2.c:7:33 .. C:/temp/test2.c:7:41]
Expanded macro: (2)
Expanded range: [C:/temp/test2.c:1:33 .. C:/temp/test2.c:1:35]
Expanded macro: (3)
Expanded range: [C:/temp/test2.c:2:33 .. C:/temp/test2.c:2:35]
Expanded macro: (((1) + 1) + (( 2)*( 2)) + (((4)) + 1) + (2) + (3))
Expanded range: [C:/temp/test2.c:8:33 .. C:/temp/test2.c:8:100]
Expanded macro: (2)
Expanded range: [C:/temp/test2.c:1:33 .. C:/temp/test2.c:1:35]
Expanded macro: (((1) <= ( (2))) ? (1) : ( (2)))
Expanded range: [C:/temp/test2.c:5:33 .. C:/temp/test2.c:5:58]
Expanded macro: (((1) <= ( 2)) ? (1) : ( 2))
Expanded range: [C:/temp/test2.c:5:33 .. C:/temp/test2.c:5:58]
Expanded macro: (3)
Expanded range: [C:/temp/test2.c:2:33 .. C:/temp/test2.c:2:35]
Expanded macro: (((1) <= ( (3))) ? (1) : ( (3)))
Expanded range: [C:/temp/test2.c:5:33 .. C:/temp/test2.c:5:58]
Expanded macro: ((3)*(3))
Expanded range: [C:/temp/test2.c:4:33 .. C:/temp/test2.c:4:41]
Expanded macro: (((1) <= ( ((3)*(3)))) ? (1) : ( ((3)*(3))))
Expanded range: [C:/temp/test2.c:5:33 .. C:/temp/test2.c:5:58]
Expanded macro: (((1) > ( 2)) ? (1) : ( 2))
Expanded range: [C:/temp/test2.c:6:33 .. C:/temp/test2.c:6:58]
Expanded macro: ((((((1) > ( 2)) ? (1) : ( 2))) <= ( 3)) ? ((((1) > ( 2)) ? (1) : ( 2))) : ( 3))
Expanded range: [C:/temp/test2.c:5:33 .. C:/temp/test2.c:5:58]
---------------------------------------
Context macros after processing source:
---------------------------------------
Macro: FOUR
Macro: FUNC_MACRO
Macro: MAX
Macro: MIN
Macro: NESTED_MACRO
Macro: NUMSQUARED
Macro: THREE
Macro: TWO
Macro: __BASE_FILE__
Macro: __DATE__
Macro: __SPIRIT_PP_VERSION_STR__
Macro: __SPIRIT_PP_VERSION__
Macro: __SPIRIT_PP__
Macro: __STDC_HOSTED__
Macro: __STDC_VERSION__
Macro: __STDC__
Macro: __TIME__
Macro: __WAVE_CONFIG__
Macro: __WAVE_HAS_VARIADICS__
Macro: __WAVE_VERSION_STR__
Macro: __WAVE_VERSION__
Macro: __WAVE__
Macro: __cplusplus
---------------------------------------
-----------------------------------
~my_hooks printing mExpandedMacros:
-----------------------------------
Expanded macro: FOUR:C:/temp/test2.c:8:77
Expanded range: [C:/temp/test2.c:8:77 .. C:/temp/test2.c:8:82]
Expanded macro: FUNC_MACRO:C:/temp/test2.c:8:34
Expanded range: [C:/temp/test2.c:8:34 .. C:/temp/test2.c:8:46]
Expanded macro: FUNC_MACRO:C:/temp/test2.c:8:66
Expanded range: [C:/temp/test2.c:8:66 .. C:/temp/test2.c:8:83]
Expanded macro: MAX:C:/temp/test2.c:15:17
Expanded range: [C:/temp/test2.c:15:17 .. C:/temp/test2.c:15:25]
Expanded macro: MIN:C:/temp/test2.c:11:13
Expanded range: [C:/temp/test2.c:11:13 .. C:/temp/test2.c:11:23]
Expanded macro: MIN:C:/temp/test2.c:12:13
Expanded range: [C:/temp/test2.c:12:13 .. C:/temp/test2.c:12:21]
Expanded macro: MIN:C:/temp/test2.c:13:13
Expanded range: [C:/temp/test2.c:13:13 .. C:/temp/test2.c:13:27]
Expanded macro: MIN:C:/temp/test2.c:14:13
Expanded range: [C:/temp/test2.c:14:13 .. C:/temp/test2.c:14:33]
Expanded macro: MIN:C:/temp/test2.c:15:13
Expanded range: [C:/temp/test2.c:15:13 .. C:/temp/test2.c:15:29]
Expanded macro: NESTED_MACRO:C:/temp/test2.c:10:13
Expanded range: [C:/temp/test2.c:10:13 .. C:/temp/test2.c:10:30]
Expanded macro: NUMSQUARED:C:/temp/test2.c:14:20
Expanded range: [C:/temp/test2.c:14:20 .. C:/temp/test2.c:14:32]
Expanded macro: NUMSQUARED:C:/temp/test2.c:8:50
Expanded range: [C:/temp/test2.c:8:50 .. C:/temp/test2.c:8:62]
Expanded macro: THREE:C:/temp/test2.c:13:20
Expanded range: [C:/temp/test2.c:13:20 .. C:/temp/test2.c:13:26]
Expanded macro: THREE:C:/temp/test2.c:8:93
Expanded range: [C:/temp/test2.c:8:93 .. C:/temp/test2.c:8:99]
Expanded macro: TWO:C:/temp/test2.c:11:20
Expanded range: [C:/temp/test2.c:11:20 .. C:/temp/test2.c:11:23]
Expanded macro: TWO:C:/temp/test2.c:8:87
Expanded range: [C:/temp/test2.c:8:87 .. C:/temp/test2.c:8:90]
-----------------------------------
Made your program into a self-contained online demo:
The second prints:
Expanded macro: (((3) > ( 4)) ? (3) : ( 4))
Expanded range: [input.cpp:2:19 .. input.cpp:2:44]
Expanded macro: (((1) <= ( (((3) > ( 4)) ? (3) : ( 4)))) ? (1) : ( (((3) > ( 4)) ? (3) : ( 4))))
Expanded range: [input.cpp:1:19 .. input.cpp:1:44]
Expanded macro: (((3) > ( 7)) ? (3) : ( 7))
Expanded range: [input.cpp:2:19 .. input.cpp:2:44]
Expanded macro: (((1) <= ( (((3) > ( 7)) ? (3) : ( 7)))) ? (1) : ( (((3) > ( 7)) ? (3) : ( 7))))
Expanded range: [input.cpp:1:19 .. input.cpp:1:44]
---------------------------------------
Context macros after processing source:
---------------------------------------
Macro: MAX
Macro: MIN
Macro: __BASE_FILE__
Macro: __DATE__
Macro: __SPIRIT_PP_VERSION_STR__
Macro: __SPIRIT_PP_VERSION__
Macro: __SPIRIT_PP__
Macro: __STDC_HOSTED__
Macro: __STDC_VERSION__
Macro: __STDC__
Macro: __TIME__
Macro: __WAVE_CONFIG__
Macro: __WAVE_HAS_VARIADICS__
Macro: __WAVE_VERSION_STR__
Macro: __WAVE_VERSION__
Macro: __WAVE__
Macro: __cplusplus
---------------------------------------
-----------------------------------
~my_hooks printing mExpandedMacros:
-----------------------------------
Expanded macro: MAX:input.cpp:4:22
Expanded range: [input.cpp:4:22 .. input.cpp:4:30]
Expanded macro: MAX:input.cpp:5:22
Expanded range: [input.cpp:5:22 .. input.cpp:5:30]
Expanded macro: MIN:input.cpp:4:15
Expanded range: [input.cpp:4:15 .. input.cpp:4:31]
Expanded macro: MIN:input.cpp:5:15
Expanded range: [input.cpp:5:15 .. input.cpp:5:31]
-----------------------------------
Now it looks to me that all the questions labeled "HELP" in the question are answered:
Expanded macro: MAX:input.cpp:4:22 Expanded range: [input.cpp:4:22 .. input.cpp:4:30]
Expanded macro: MAX:input.cpp:5:22 Expanded range: [input.cpp:5:22 .. input.cpp:5:30]
Expanded macro: MIN:input.cpp:4:15 Expanded range: [input.cpp:4:15 .. input.cpp:4:31]
Expanded macro: MIN:input.cpp:5:15 Expanded range: [input.cpp:5:15 .. input.cpp:5:31]
To put it together with the input:
#define MIN(a, b) (((a) <= (b)) ? (a) : (b))
#define MAX(a, b) (((a) > (b)) ? (a) : (b))
# 3 "input.cpp"
int main() {
# 4 "input.cpp"
int foo = MIN(1, MAX(3, 4));
// Expanded macro: MAX:input.cpp:4:22 Expanded range: [input.cpp:4:22 .. input.cpp:4:30]
// ^-------^
// Expanded macro: MIN:input.cpp:4:15 Expanded range: [input.cpp:4:15 .. input.cpp:4:31]
// ^---------------^
# 5 "input.cpp"
int bar = MIN(1, MAX(3, 7));
// Expanded macro: MAX:input.cpp:5:22 Expanded range: [input.cpp:5:22 .. input.cpp:5:30]
// ^-------^
// Expanded macro: MIN:input.cpp:5:15 Expanded range: [input.cpp:5:15 .. input.cpp:5:31]
// ^---------------^
# 6 "input.cpp"
}
Though it remains a bit unclear what you are missing in the above output, perhaps you are looking for a way to programmatically correlate the nested ranges?
In that case, you can go about it using the lexical nesting of the source locations: an invocation is nested in another when its source range lies entirely inside the other's range.
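To make "lexical nesting" concrete, here is a minimal standalone sketch of the containment test, using plain (line, column) pairs instead of Wave's position type. Pos, Range and encloses are illustrative names only, and the sketch assumes both ranges come from the same file.

```cpp
#include <tuple>

// Hypothetical stand-in for a source position within a single file.
struct Pos {
    unsigned line;
    unsigned column;
};

// Positions order by line first, then by column.
inline bool operator<=(const Pos& a, const Pos& b) {
    return std::tie(a.line, a.column) <= std::tie(b.line, b.column);
}

// Inclusive source range [start, end] of one macro invocation.
struct Range {
    Pos start;
    Pos end;
};

// True when `inner` lies entirely within `outer`.
inline bool encloses(const Range& outer, const Range& inner) {
    return outer.start <= inner.start && inner.end <= outer.end;
}
```

With the ranges printed above, MIN at 4:15 .. 4:31 encloses MAX at 4:22 .. 4:30, while MIN at 5:15 .. 5:31 does not.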
After some toying around, does this approximate your use case?
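#include <boost/wave.hpp>
#include <boost/wave/cpplexer/cpp_lex_iterator.hpp>
#include <boost/wave/cpplexer/cpp_lex_token.hpp>
#include <algorithm>
#include <compare>
#include <filesystem>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <string>
#include <tuple>
#include <vector>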
namespace {
namespace wave = boost::wave;
namespace fs = std::filesystem;
using Position = wave::util::file_position_type;
static inline auto operator<=>(Position const& lhs, Position const& rhs) {
return std::make_tuple(lhs.get_file(), lhs.get_line(), lhs.get_column()) <=>
std::make_tuple(rhs.get_file(), rhs.get_line(), rhs.get_column());
}
static inline std::ostream& operator<<(std::ostream& os, Position const& pos) {
return os << pos.get_file() << ':' << pos.get_line() << ':' << pos.get_column();
}
// The template wave::cpplexer::lex_token<> is
// the token type to be used by the Wave library.
using token_type = wave::cpplexer::lex_token<>;
// The template wave::cpplexer::lex_iterator<> is the
// iterator type to be used by the Wave library.
using lex_iterator_type = wave::cpplexer::lex_iterator<token_type>;
// This is the resulting context type to use. The first template parameter
// should match the iterator type to be used during construction of the
// corresponding context object (see below).
struct my_hooks;
using context_type = wave::context<std::string::const_iterator, lex_iterator_type,
wave::iteration_context_policies::load_file_to_string, my_hooks>;
struct my_hooks : public wave::context_policies::default_preprocessing_hooks {
explicit my_hooks(fs::path sourcePath, std::string const& sourceContent)
: mSourcePath{std::move(sourcePath)}
, mCachedSource(sourceContent) {}
~my_hooks() {
for (auto const& [name, start, end, text] : mExpansions) {
std::cout << "Expanded macro: " << name << " at " << start << ": "
<< quoted(get_source(start, end)) << '\n'
<< " -> " << quoted(text) << '\n';
}
}
template <typename ContextT, typename TokenT, typename ContainerT>
bool expanding_object_like_macro([[maybe_unused]] ContextT& ctx, [[maybe_unused]] TokenT const& macro,
ContainerT const& macrodef, TokenT const& macrocall) {
mCurrentMacro = macrocall;
std::string const name = macro.get_value().c_str();
fs::path const file = macrocall.get_position().get_file().c_str();
// only interested in macros from the current file
if (mSourcePath == file) {
auto const& callPos = macrocall.get_position();
std::string rowCol =
std::to_string(callPos.get_line()) + ':' + std::to_string(callPos.get_column());
std::string const key = name + ":" + file.string() + ':' + rowCol;
// adjust the ending position
auto endPos = callPos;
endPos.set_column(endPos.get_column() + mCurrentMacro.get_value().size());
std::string expanded;
for (auto const& token : macrodef) {
expanded += token.get_value().c_str();
}
// std::cout << "expanding_object_like_macro: " << expanded << '\n';
registerExpansion(name, callPos, endPos, expanded);
// continue with default processing
return false;
}
// do not process further
return true;
}
void registerExpansion(std::string const& name, Position const& callPos, Position const& endPos,
std::string const& text) {
auto surrounding = std::find_if(mExpansions.begin(), mExpansions.end(), [&](auto const& exp) {
return exp.start <= callPos && exp.end >= endPos;
});
if (surrounding == mExpansions.end()) {
// if not nested
mExpansions.push_back({name, callPos, endPos, text});
} else {
std::cout << "note: " << name << " at " << callPos << " nested in " << surrounding->name
<< " at " << surrounding->start << "\n";
}
}
template <typename ContextT, typename TokenT, typename ContainerT, typename IteratorT>
bool
expanding_function_like_macro([[maybe_unused]] ContextT const& ctx,
[[maybe_unused]] TokenT const& macrodef,
[[maybe_unused]] std::vector<TokenT> const& formal_args,
[[maybe_unused]] ContainerT const& definition, TokenT const& macrocall,
[[maybe_unused]] std::vector<ContainerT> const& arguments,
[[maybe_unused]] IteratorT const& seqstart, IteratorT const& seqend) {
mCurrentMacro = macrocall;
std::string const name = macrocall.get_value().c_str();
fs::path const file = macrocall.get_position().get_file().c_str();
// only interested in macros expanding into the current file
if (mSourcePath == file) {
auto const& callPos = macrocall.get_position();
registerExpansion(name, callPos, seqend->get_position(), "");
// continue with default processing
return false;
}
// do not process further
return true;
}
template <typename ContextT, typename ContainerT>
void expanded_macro([[maybe_unused]] ContextT const& ctx, ContainerT const& tokens) {
std::string expanded;
for (auto const& token : tokens) {
expanded += token.get_value().c_str();
}
// clean up the macro expansion text - removing
// multiple lines & extra unnecessary whitespace
std::erase(expanded, '\n');
auto end = std::unique(expanded.begin(), expanded.end(), [](auto lhs, auto rhs) {
return (lhs == rhs) && ((lhs == ' ') || (lhs == '\t'));
});
expanded.erase(end, expanded.end());
// std::cout << "Expanded macro: " << expanded << '\n';
if (auto it = mExpansions.rbegin(); it != mExpansions.rend())
it->text = expanded;
}
template <typename ContextT, typename ContainerT>
void rescanned_macro([[maybe_unused]] ContextT const& ctx, ContainerT const& tokens) {
auto const& expansionPos = tokens.begin()->get_position();
if (mSourcePath == expansionPos.get_file().c_str()) {
std::string expanded;
for (auto iter = tokens.begin(); iter != tokens.end(); ++iter) {
expanded += iter->get_value().c_str();
}
// compress expanded text to a single line (removing unnecessary whitespace)
std::erase(expanded, '\n');
auto end = std::unique(expanded.begin(), expanded.end(), [](auto lhs, auto rhs) {
return (lhs == rhs) && ((lhs == ' ') || (lhs == '\t'));
});
expanded.erase(end, expanded.end());
// std::cout << "Rescanned macro: " << expanded << '\n';
// auto startPos = tokens.begin()->get_position(), endPos = tokens.back().get_position();
// std::cout << "Rescanned range: [" << startPos << " .. " << endPos << "]\n";
if (auto it = mExpansions.rbegin(); it != mExpansions.rend())
it->text = expanded;
}
}
fs::path mSourcePath;
std::string const& mCachedSource;
std::string get_source(Position const& b, Position const& e) {
// TODO error handling and position validation
auto get_offs = [&](Position const& b) {
auto it = mCachedSource.begin();
auto line = b.get_line();
while (--line)
it = std::find(it, mCachedSource.end(), '\n') + 1;
return static_cast<size_t>(it - mCachedSource.begin()) + b.get_column() - 1;
};
auto beg = get_offs(b), end = get_offs(e);
return mCachedSource.substr(beg, end - beg + 1);
}
// Expansion key consists of macro name and the
// starting location where it is invoked in the sourcePath
using Token = wave::cpplexer::lex_token<>;
struct Expansion {
std::string name;
Position start, end;
std::string text;
};
Token mCurrentMacro;
std::vector<Expansion> mExpansions;
};
} // namespace
int main(int argc, char* argv[]) {
using namespace wave;
if (argc < 2) {
std::cerr << "Usage: expand_macros [input file]" << '\n';
return -1;
}
// current file position is saved for exception handling
Position current_position;
try {
// Open and read in the specified input file.
std::ifstream instream(argv[1], std::ios::binary);
if (!instream.is_open()) {
std::cerr << "Could not open input file: " << argv[1] << '\n';
return -2;
}
std::string const source = std::string(std::istreambuf_iterator<char>(instream), {});
// The preprocessor iterator shouldn't be constructed directly. It is
// to be generated through a wave::context<> object. This wave:context<>
// object additionally may be used to initialize and define different
// parameters of the actual preprocessing (not done here).
//
// The preprocessing of the input stream is done on the fly behind the
// scenes during iteration over the context_type::iterator_type stream.
context_type ctx(source.begin(), source.end(), argv[1], my_hooks{argv[1], source});
// This is where we add the project include paths
std::vector<std::string> includePaths = {
fs::current_path().string(), // for COLIRU
"C:/Users/johnc/main/tcdu-cdu/include",
"C:/Users/johnc/main/tcdu-cdu/src/cdu/include",
};
// These include paths are part of the compiler toolchain, note that these
// include paths allow for either VS2022 preview or Community to be present.
// Also, the apex folder is added here as it should be on the system
// include path list.
std::vector<std::string> systemIncludePaths = {
"C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.40.33807/include",
"C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.40.33807/atlmfc/include",
"C:/Program Files/Microsoft Visual Studio/2022/Preview/VC/Tools/MSVC/14.41.33923/include",
"C:/Program Files/Microsoft Visual Studio/2022/Preview/VC/Tools/MSVC/14.41.33923/atlmfc/include",
"C:/Program Files (x86)/Windows Kits/10/Include/10.0.22621.0/ucrt",
"C:/Program Files (x86)/Windows Kits/10/Include/10.0.22621.0/shared",
"C:/Program Files (x86)/Windows Kits/10/Include/10.0.22621.0/um",
"C:/Program Files (x86)/Windows Kits/10/Include/10.0.22621.0/winrt",
"C:/Program Files (x86)/Windows Kits/10/Include/10.0.22621.0/cppwinrt",
"C:/Users/johnc/main/tcdu-cdu/include/apex",
};
// Copied from visual studio preprocessor settings.
// Not sure why RC_INVOKED is required.
std::vector<std::string> preprocessorDefines = {
"_UNICODE", "UNICODE", "_CRT_SECURE_NO_WARNINGS", "WIN32_LEAN_AND_MEAN",
"UNIT_TEST=1", "RC_INVOKED"};
// set various options
for (auto const& next : includePaths)
ctx.add_include_path(next.data());
for (auto const& next : systemIncludePaths)
ctx.add_sysinclude_path(next.data());
for (auto const& next : preprocessorDefines)
ctx.add_macro_definition(next.data());
ctx.set_language(boost::wave::support_cpp2a);
ctx.set_language(enable_preserve_comments(ctx.get_language()));
ctx.set_language(enable_prefer_pp_numbers(ctx.get_language()));
ctx.set_language(enable_single_line(ctx.get_language()));
// Analyze the input file
for (auto first = ctx.begin(), last = ctx.end(); first != last; ++first) {
current_position = first->get_position();
// std::cout << first->get_value();
}
} catch (boost::wave::cpp_exception const& e) {
// some preprocessing error
std::cerr << e.file_name() << "(" << e.line_no() << "): " << e.description() << '\n';
return 2;
} catch (std::exception const& e) {
// Use last recognized token to retrieve the error position
std::cerr << current_position << ": exception caught: " << e.what() << '\n';
return 3;
} catch (...) {
// use last recognized token to retrieve the error position
std::cerr << current_position << ": unexpected exception caught." << '\n';
return 4;
}
}
Printing:
note: MAX at /tmp/1722301871-348899440/input.cpp:4:22 nested in MIN at /tmp/1722301871-348899440/input.cpp:4:15
note: MAX at /tmp/1722301871-348899440/input.cpp:5:22 nested in MIN at /tmp/1722301871-348899440/input.cpp:5:15
Expanded macro: MIN at /tmp/1722301871-348899440/input.cpp:4:15: "MIN(1, MAX(3, 4))"
-> "(((1) <= ( (((3) > ( 4)) ? (3) : ( 4)))) ? (1) : ( (((3) > ( 4)) ? (3) : ( 4))))"
Expanded macro: MIN at /tmp/1722301871-348899440/input.cpp:5:15: "MIN(1, MAX(3, 7))"
-> "(((1) <= ( (((3) > ( 7)) ? (3) : ( 7)))) ? (1) : ( (((3) > ( 7)) ? (3) : ( 7))))"