boost-spirit

How to debug boost::spirit?


I've read the documentation. I also ran the calc_debug.cpp example, and its output is much more readable than what my own code produces. For every grammar I use:

m_sGrammar.name("property");
BOOST_SPIRIT_DEBUG_NODE(m_sGrammar);

What I'm getting is some kind of XML output, and lots of it, and I don't have any clue what to look for.

E.g. the keyword fail is already printed for what are comments at the beginning of the file.

e.g. There is a keyword success being printed and it shows the text matching (which covers multiple separate rules) but it does not specify, which rule was matched. The same text is later mentioned with keyword try.


Solution

  • Reading the debug output takes patience.

    As always. It also takes patience to manually step through code tracing registers and variables, to diagram your data and track the flow, and to read assembly and see where time is being spent. Programming is a frustrating task that takes a lot of patience.

    The debug output is also just a tool in that process.

    You apply the tool so that you can have a kind of "state flow diagram" instead of painstakingly stepping through the arcane template instances involved. Of course, it means that you have a lot of data, but you definitely can learn to handle that a lot better than volatile state debugging (where you can rarely go back and check the state at an earlier point).

    In other words, the debug information lets you debug the parser in the same way you implement it: declaratively, instead of imperatively.

    Here are tricks that I use:

    1. Minimize your input. Reduce it until only the thing that you want to understand happens. This could be why something fails to parse, or why it is being accepted (by an unexpected rule).

    2. Minimize the output!

      • Don't enable debug for your skipper if you don't need to see it. In my experience the "source lookahead" already implies what has been skipped. Only if that surprises me might I debug the skipper too.

      • Use an XML-aware editor that makes it easy to "fold" subtrees that you have "seen", perhaps decided are "as expected".

        E.g. in vim I often start out with folding the uninteresting branches, and once I have found where I am looking I'll actually delete those branches (dat, zfat, vatat). When I need to repeat this, I use the information to minimize the input further.

    3. Realize that failure is normal. In the rule R = (a|b|c) you will expect to see a and b fail if the input only matches c. Of course, the point is that R itself succeeds. In fact, if b succeeds, c will not even appear in the debug output (as it doesn't need to be tried).

    4. Start from the back, finding the last lexeme/member of a non-backtracking rule that fails. In repeating constructions (*a, +a, repeat(n, m)[a] and a % b) the last iteration will always fail (in all of its branches) before the parser knows to stop.

      An example from my answer yesterday, the end of a successful *task_item parse:

          ...
          <attributes>[[[[[[V, a, r, 1], [=, =], [T, e, s, t]], [&, &], [[[V, a, r, 2], [<, =], 10], [&, &], [[V, a, r, 3]
      , [=, =], [D, o, n, e]]]], [[[W, o, r, d], 32, [O, b, j, e, c, t, i, v, e]], [[[[V, a, r, 3], [=, =], [A]], [|, |], 
      [[[V, a, r, 4], [=, =], [B]], [&, &], [[V, a, r, 5], [>], 0]]], [[[V, a, r, N, a, m, e], [V, a, l, u, e, 1]], [[V, a
      , r, 2], 10]], [[[[V, a, r, 3], [=, =], [C]], [[[V, a, r, N, a, m, e], [S, o, m, e, V, a, l, u, e]]], [empty]]]]], [
      [[[V, a, r, N, a, m, e], [V, a, l, u, e, 2]]]]]]]</attributes>
        </task_item>
        <task_item>
          <try></try>
          <classdef_>
            <try></try>
            <fail/>
          </classdef_>
          <statement_>
            <try></try>
            <assign_>
              <try></try>
              <assign_>
                <try></try>
                <fail/>
              </assign_>
              <fail/>
            </assign_>
            <verify_>
              <try></try>
              <fail/>
            </verify_>
            <conditional_>
              <try></try>
              <fail/>
            </conditional_>
            <fail/>
          </statement_>
          <fail/>
        </task_item>
        <success></success>
        <attributes>[[[[[[V, a, r, 1], [=, =], [T, e, s, t]], [&, &], [[[V, a, r, 2], [<, =], 10], [&, &], [[V, a, r, 3], 
      [=, =], [D, o, n, e]]]], [[[W, o, r, d], 32, [O, b, j, e, c, t, i, v, e]], [[[[V, a, r, 3], [=, =], [A]], [|, |], [[
      [V, a, r, 4], [=, =], [B]], [&, &], [[V, a, r, 5], [>], 0]]], [[[V, a, r, N, a, m, e], [V, a, l, u, e, 1]], [[V, a, 
      r, 2], 10]], [[[[V, a, r, 3], [=, =], [C]], [[[V, a, r, N, a, m, e], [S, o, m, e, V, a, l, u, e]]], [empty]]]]], [[[
      [V, a, r, N, a, m, e], [V, a, l, u, e, 2]]]]]]]</attributes>
      </task_>
      

      In an unsuccessful parse, you will usually have to look for the last specific rule that fails where you expected it to pass.

    5. Keep your rules simple. I've seen your skipper once and realize that its complexity might be why you are debugging it.

    6. DON'T debug! Instead, have good error reporting for expected errors. Instead of giving examples, I'll just refer to this answer that already does: BOOST_SPIRIT_ASSERT_EXCEPTION -- can this be used for reporting a parser error?

    E.g. in that same grammar I worked on yesterday, if you accidentally put Else if instead of the expected Elseif in the input, you will get this error message:

         -> EXPECTED <eoi> in line:2
                If (Var1 == "Test") && (Var2 <= 10) && (Var3 == "Done")
                ^--- here
    

    This could be enough, or help a lot when navigating the low-level debug output.


    The only tangible complaint I see from your question:

    E.g. the keyword fail is already printed for what are comments at the beginning of the file.

    probably means you can remove the skipper from the debugged set.

    UPDATE to the edit:

    e.g. There is a keyword success being printed and it shows the text matching (which covers multiple rules) but it does not specify, which rule was matched.

    That's inaccurate. Success shows the input location reached after the match. It does NOT show the text that matched; it shows the text remaining. The "matched" text is no longer of interest. Instead, the result of the match is shown under attributes (again, see the example above).