c++ regex boost boost-regex

Boost regex_replace exception: "...This exception is thrown to prevent "eternal" matches..." being thrown on occasion


I am using Boost.Regex (boost-1.42) to remove the first line of a multi-line string (a fairly large string containing multiple lines, each ending in '\n').

That is, I am using regex_replace to do something akin to s/(.*?)\n//

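  // Requires <string> and <boost/regex.hpp>.
  std::string
  foo::erase_first_line(const std::string & input) const
  {
    // Non-greedy match of everything up to and including the first '\n'.
    static const boost::regex line_expression("(.*?)\n");
    const std::string empty_string;

    // format_first_only: replace only the first match and copy the rest unchanged.
    return boost::regex_replace(input,
                                line_expression,
                                empty_string,
                                boost::format_first_only);
  }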

This code is throwing the following exception:

"terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<std::runtime_error> >'
  what():  The complexity of matching the regular expression exceeded predefined bounds.  Try refactoring the regular expression to make each choice made by the state machine unambiguous.  This exception is thrown to prevent "eternal" matches that take an indefinite period time to locate."

Interestingly/annoyingly, this doesn't seem to happen in test programs with the same test data. Any thoughts on why this could be happening and/or how to fix it?


Solution

  • Try putting a "beginning of string" marker ("\A" in the default Perl-compatible mode) at the start of the regex, which makes it explicit that you want to match only the first line.

    Without an explicit anchor at the beginning of the string, it looks like Boost is applying its "leftmost longest" rule, and that is what causes the exception: http://www.boost.org/doc/libs/1_45_0/libs/regex/doc/html/boost_regex/syntax/leftmost_longest_rule.html