I was always pretty curious about the C-like interface of boost::spirit::qi::phrase_parse(), which leaves only the option to report that something(tm) has failed.
ChatGPT told me that, in order to get rich error information, one could use
#define BOOST_SPIRIT_ASSERT_EXCEPTION parseErrorClass
ChatGPT also told me that for this case there is no special parsing routine to replace boost::spirit::qi::phrase_parse().
Does this mean that, for this case, one can safely ignore the boolean and the returned iterator?
The return value isn't the only (or even main) error reporting unit.
Of course, at the parser level it is the mechanism, because it enables Spirit to act as a parser combinator and compose larger grammars: e.g. a | b wouldn't work if a throws before b can be attempted.
To get exactly that throwing behaviour, one can use expectation points.
In ((a >> b) | c) the first branch will be backtracked if a matched but b didn't; this allows c to be attempted.
If that's not desired, you can instead use ((a > b) | c). If b doesn't match, an expectation failure is raised, so c won't even be attempted.
You would use this e.g. if a was a keyword that cannot be valid in any other context.
By handling the expectation failures you can report rich error context to the user, e.g. as in this recent answer:
//...
catch (qi::expectation_failure<It> const& ef) {
    auto f    = begin(input);
    auto p    = ef.first - input.begin();          // offset of the failure position
    auto bol  = input.find_last_of("\r\n", p) + 1; // start of the offending line
    auto line = std::count(f, f + bol, '\n') + 1;  // 1-based line number
    auto eol  = input.find_first_of("\r\n", p);    // end of the offending line
    std::cerr << " -> EXPECTED " << ef.what_ << " in line:" << line << "\n"
              << input.substr(bol, eol - bol) << "\n"
              << std::setw(p - bol) << ""          // pad up to the failure column
              << "^--- here" << std::endl;
}
Printing (in that example)
-> EXPECTED ";" in line:3
Class Simple my_value datatype restriction;
^--- here
Expectation failures are also "soft failures" and can even be handled at the rule level: qi::on_error.
The error handler can:
just report the error,
throw an enriched error (e.g. with semantic information, like "type mismatch" or "unknown identifier" etc.),
implement a "sync scan" (similar to how Coco/C++, Yacc etc. do error recovery) and resume,
"change the world" and just return qi::error_handler_result::retry or qi::error_handler_result::accept; you might e.g. add a missing symbol to a lookup table, or in theory you could alter the input and resume parsing with the altered input,
do any of the above and return qi::error_handler_result::fail, which allows the rule to backtrack after all (effectively removing the expectation point).
Here are some demonstrations
Note that the on_error "aspect" generalizes with on_success to add behaviour to every rule, e.g. to attach source location information to AST nodes.
Note also that I wouldn't suggest doing complicated operations inside the error handler. E.g. here's a very complete example showing how to handle semantic/syntax errors and continue parsing with diagnostics/suggested fixes without using on_error. Instead, all the logic is in semantic actions, which IMO is the better place for it: How to provide user with autocomplete suggestions for given boost::spirit grammar?
Semantic actions can also fail a rule by assigning false to the qi::_pass context placeholder.
If you want to always report errors by exception, you can replace
bool ok = qi::parse(f, l, p, attr);
with
qi::parse(f, l, qi::eps > p, attr);
In fact, to also avoid the anti-pattern of checking if (ok && f == l), add another expectation:
qi::parse(f, l, qi::eps > p > qi::eoi, attr);
There's nothing C-ish about the Qi parser interface. Not by a long shot. In fact, it is the C++-est parser interface I know of (with the exception of metaparse?). Sometimes to a fault. Everything above illustrates the ways in which I find that to be true.
I have no idea about the magic BOOST_SPIRIT_ASSERT_EXCEPTION define and suggest you ask ChatGPT about the documentation for it :)