c++boost-spiritboost-spirit-qiboost-spirit-lex

Boost::spirit illegal_backtracking exception


I use Boost.Spirit.Lex and .Qi for a simple calculator project and (as usual) it gives me some pain to debug and use. The debug prints:

<expression>
  <try>boost::spirit::multi_pass::illegal_backtracking

This exception is thrown and I can't understand why. I use macros in my code and it would be a pain to give a minimal example so I give the whole project. Just do "make" at the root, and then launch ./sash, a prompt will appear, if you want to test just do "-echo $5-8".

It seems that Google didn't find any similar problems about this exception...

The parser is in arithmetic/, and the call of the parser is at the end of arithmetic/evaluator.cpp

Any helps greatly appreciate.


Solution

  • Your code is breaking because BOOST_SPIRIT_QI_DEBUG as well as the on_error<> handler seem to use iterators after they might have been invalidated.

    To be honest, I'm not completely sure how this could happen.

    Background

    AFAICT lexertl uses spirit::multipass<> with a split_functor input policy and a split_std_deque storage policy [1].

    Now, (luckily?) the checking policy is buf_id_check which means that the iterator will check for invalidation at the time of dereference.

    Iterators are expected to be invalidated if

    1. the iterator is derefenced, growing the buffer to >16 tokens and the iterator is the only one referring to the shared state.
    2. or somewhere along the line clear_queue is called explicitely (e.g. from the flush_multi_path primitive in the Spirit Repository)

    Honestly I don't see any of these two conditions being met. A quick and dirty

    token_iterator_type clone = iter; // just to make it non-unique...
    

    in evaluator.cpp doesn't make a difference (ruling out reason #1)

    Temporary disabling the docheck implementation in the buf_id_check_policy made valgrind point out that on_error<> and BOOST_SPIRIT_DEBUG* are causing invalid memory references. Commenting both indeed makes all problems go away (and the eval_expression now works).

    However, this is likely not your preferred solution.

    Proposed solution

    Since

    I've quickly refactored some code: https://github.com/sehe/sash-refactor/commits/master

    Now, you will find that your code is generally much simpler, also much shorter, not running into multi_pass limits and you can still have SPIRIT_DEBUG as well as on_error handling :) In the end

    Here's some samples (without debug output):

    $ ./sash <<< '-echo $8-9'
        -1
        Warning: Empty environment variable "8-9".
    $ ./sash <<< '-echo $8\*9'
        72
        Warning: Empty environment variable "8*9".
    $ ./sash <<< '-echo $8\*(9-1)'
        64
        Warning: Empty environment variable "8*(9-1)".
    $ ./sash <<< '-echo $--+-+8\*(9-1)'
        -64
        Warning: Empty environment variable "--+-+8*(9-1)".
    

    [1] Which, despite it's name, buffers previously seen tokens in a std::vector<>