c++ regex boost boost-regex

boost::algorithm::split_regex


For the function

boost::split_regex(std::vector<std::string>, std::string, std::string);

I end up with empty tokens and would like to compress them, but unlike boost::split, I cannot find a token_compress_on option for split_regex. As the function is still undocumented (see below), does anyone have any pointers on how to go about this?

From: https://www.boost.org/doc/libs/1_81_0/libs/algorithm/doc/html/index.html

Not-yet-documented Other Algorithms
Reference
Header <boost/algorithm/algorithm.hpp>
Header <boost/algorithm/apply_permutation.hpp>
Header <boost/algorithm/clamp.hpp>
Header <boost/algorithm/cxx11/all_of.hpp>
Header <boost/algorithm/cxx11/any_of.hpp>
Header <boost/algorithm/cxx11/copy_if.hpp>
Header <boost/algorithm/cxx11/copy_n.hpp>
Header <boost/algorithm/cxx11/find_if_not.hpp>
Header <boost/algorithm/cxx11/iota.hpp>
Header <boost/algorithm/cxx11/is_partitioned.hpp>
Header <boost/algorithm/cxx11/is_permutation.hpp>
Header <boost/algorithm/cxx14/is_permutation.hpp>
Header <boost/algorithm/cxx11/is_sorted.hpp>
Header <boost/algorithm/cxx11/none_of.hpp>
Header <boost/algorithm/cxx11/one_of.hpp>
Header <boost/algorithm/cxx11/partition_copy.hpp>
Header <boost/algorithm/cxx11/partition_point.hpp>
Header <boost/algorithm/cxx14/equal.hpp>
Header <boost/algorithm/cxx14/mismatch.hpp>
Header <boost/algorithm/cxx17/exclusive_scan.hpp>
Header <boost/algorithm/cxx17/for_each_n.hpp>
Header <boost/algorithm/cxx17/inclusive_scan.hpp>
Header <boost/algorithm/cxx17/reduce.hpp>
Header <boost/algorithm/cxx17/transform_exclusive_scan.hpp>
Header <boost/algorithm/cxx17/transform_inclusive_scan.hpp>
Header <boost/algorithm/cxx17/transform_reduce.hpp>
Header <boost/algorithm/find_backward.hpp>
Header <boost/algorithm/find_not.hpp>
Header <boost/algorithm/gather.hpp>
Header <boost/algorithm/hex.hpp>
Header <boost/algorithm/is_clamped.hpp>
Header <boost/algorithm/is_palindrome.hpp>
Header <boost/algorithm/is_partitioned_until.hpp>
Header <boost/algorithm/minmax.hpp>
Header <boost/algorithm/minmax_element.hpp>
Header <boost/algorithm/searching/boyer_moore.hpp>
Header <boost/algorithm/searching/boyer_moore_horspool.hpp>
Header <boost/algorithm/searching/knuth_morris_pratt.hpp>
Header <boost/algorithm/sort_subrange.hpp>
Header <boost/algorithm/string.hpp>
Header <boost/algorithm/string_regex.hpp>

Solution

  • A compress option would add little when the delimiter is a pattern, because you can always use (pattern)+ instead of pattern to get the desired effect:

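    #include <boost/algorithm/string_regex.hpp>
    #include <string>
    #include <string_view>
    #include <vector>

    // Split input on every match of the regular expression delim.
    auto tokenize(std::string_view input, std::string delim) {
        boost::regex re(delim);
        std::vector<std::string> tokens;
        boost::split_regex(tokens, input, re);
        return tokens;
    }

    #include <fmt/ranges.h>
    int main() {
        fmt::print("{}\n", tokenize("a,,b,c,,,d", ","));  // single delimiter: empty tokens kept
        fmt::print("{}\n", tokenize("a,,b,c,,,d", ",+")); // one or more: adjacent delimiters merged
    }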
    

    Prints

    ["a", "", "b", "c", "", "", "d"]
    ["a", "b", "c", "d"]
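
Note that (pattern)+ only merges adjacent delimiters. As far as I know, a delimiter at the very start or end of the input still produces an empty token there, which is also how boost::split behaves with token_compress_on. If those should go as well, one option is to filter the result afterwards. Below is a minimal sketch using only the standard library; drop_empty is a hypothetical helper name, not part of Boost:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not a Boost function): returns the tokens
// with every empty string removed, preserving the original order.
std::vector<std::string> drop_empty(std::vector<std::string> tokens) {
    tokens.erase(std::remove_if(tokens.begin(), tokens.end(),
                                [](const std::string& t) { return t.empty(); }),
                 tokens.end());
    return tokens;
}
```

With the tokenize function above, drop_empty(tokenize(",a,,b,", ",+")) should then leave just "a" and "b".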