c++ boost-spirit boost-spirit-qi

How to skip (not output) tokens in Boost Spirit?


I'm new to Boost Spirit and haven't been able to find examples for some simple things. For example, suppose I have an even number of space-delimited integers. (That matches *(qi::int_ >> qi::int_). So far so good.) I'd like to save only the ones at even (zero-based) positions, i.e. the first of each pair, to a std::vector<int>. I've tried a variety of things like *(qi::int_ >> qi::skip[qi::int_]) (https://godbolt.org/z/KPToo3xh6), but that still records every int, not just every other one.

#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

#include <fmt/format.h>
#include <fmt/ranges.h>

#include <boost/spirit/include/qi.hpp>

namespace qi = boost::spirit::qi;

// Example based on https://raw.githubusercontent.com/bingmann/2018-cpp-spirit-parsing/master/spirit1_simple.cpp:
// Helper to run a parser, report any unparsed input, and capture the results.
template <typename Parser, typename Skipper, typename ... Args>
void PhraseParseOrDie(
    const std::string& input, const Parser& p, const Skipper& s,
    Args&& ... args)
{
    std::string::const_iterator begin = input.begin(), end = input.end();
    boost::spirit::qi::phrase_parse(begin, end, p, s, std::forward<Args>(args) ...);
    if (begin != end) {
        fmt::print("Unparseable: \"{}\"\n", std::string(begin, end));
    }
}

void test(std::string input)
{
    std::vector<int> out_int_list;

    PhraseParseOrDie(
        // input string
        input,
        // parser grammar
        *(qi::int_ >> qi::skip[qi::int_]),
        // skip parser
        qi::space,
        // output list
        out_int_list);

    fmt::print("test() parse result: {}\n", out_int_list);
}


int main(int argc, char* argv[])
{
    test("12345 42 5 2");

    return 0;
}

Prints

test() parse result: [12345, 42, 5, 2]

Solution

  • You're looking for qi::omit[]:

    *(qi::int_ >> qi::omit[qi::int_])
    

    Note you can also implicitly omit things by declaring a rule without an attribute type (which makes it bind to qi::unused_type for silent compatibility).

    Also note that if you're making an ad hoc, sloppy grammar to scan for certain "landmarks" in a larger body of text, consider spirit::repository::qi::seek, which can be significantly faster and more expressive.

    Finally, note that Spirit X3 comes with a similar seek[] directive out of the box.

    Simplified Demo

    Much simplified: https://godbolt.org/z/EY4KdxYv9

    #include <fmt/ranges.h>
    #include <boost/spirit/include/qi.hpp>
    
    // Helper to run a parser, check for errors, and capture the results.
    void test(std::string const& input)
    {
        std::vector<int> out_int_list;
    
        namespace qi = boost::spirit::qi;
    
        qi::parse(input.begin(), input.end(),                            //
                qi::expect[                                            //
                    qi::skip(qi::space)[                               //
                        *(qi::int_ >> qi::omit[qi::int_]) > qi::eoi]], //
                out_int_list);
    
        fmt::print("test() parse result: {}\n", out_int_list);
    }
    
    int main() { test("12345 42 5 2"); }
    

    Prints

    test() parse result: [12345, 5]
    

    But Wait

    Seeing the comment in your code:

    // Parse a bracketed list of integers with spaces between symbols
    

    Did you really mean that? Because that sounds a ton more like:

    '[' > qi::auto_ % +qi::graph > ']'
    

    See it live: https://godbolt.org/z/eK6Thzqea

    //#define BOOST_SPIRIT_DEBUG
    #include <fmt/ranges.h>
    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/qi_auto.hpp>
    //#include <boost/fusion/adapted.hpp>
    
    // Helper to run a parser, check for errors, and capture the results.
    template <typename T> auto test(std::string const& input) {
        std::vector<T> out;
    
        using namespace boost::spirit::qi;
    
        rule<std::string::const_iterator, T()> v = auto_;
        BOOST_SPIRIT_DEBUG_NODE(v);
    
        phrase_parse(                                //
            input.begin(), input.end(),              //
            '[' > -v % lexeme[+(graph - ']')] > ']', //
            space, out);
    
        return out;
    }
    
    int main() {
        fmt::print("ints: {}\n", test<int>("[12345 USD     5 PUT]"));
        fmt::print("doubles: {}\n", test<double>("[ 1.2345 42 -inf 'hello' 3.1415 ]"));
    }
    

    Prints

    ints: [12345, 5]
    doubles: [1.2345, -inf, 3.1415]