c++ regex c++11

std::regex_replace to replace multiple combinations


I am trying to use std::regex_replace to replace several character combinations with other characters. However, each combination has its own replacement (the replacement depends on what was matched).

I thought about using a map, but that does not work because I do not have access to the matched text when supplying the replacement (as seen in the code). Is there a way to cover all replacement cases in one regex_replace statement, or perhaps a better approach?

Current Code:

    std::regex r(" c|ph|th|ea|c|w");
    map<string, string> comb{
        {" c", "k"},
        { "ph", "f" },
        { "th", "z" },
        { "ea", "e" },
        { "c", "s" },
        { "w", "v" }
    };
    // Does not compile: r is a std::regex, not the matched text.
    line = std::regex_replace(line, r, comb[r]);

Solution

  • We can hack together our own custom regex_replace using a regex_iterator and your map:

    First, let's write a function that accepts a std::map<std::string, std::string> and returns a regex that is simply an alternation (|) of the keys. This keeps a strong association between the map and the resulting regex. Note that the keys are inserted into the pattern verbatim, so they must not contain regex metacharacters.

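    #include <iostream>
    #include <map>
    #include <regex>
    #include <string>

    // Build a regex of the form (key1|key2|...) from the map's keys.
    std::regex regex_from_map(const std::map<std::string, std::string>& map)
    {
        std::string pattern_str = "(";
        auto it = map.begin();
        if (it != map.end())
        {
            pattern_str += it->first;
            // Separate every subsequent key with the alternation operator.
            for(++it; it != map.end(); ++it)
                pattern_str += "|" + it->first;
        }
        pattern_str += ")";
        return std::regex(pattern_str);
    }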
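
regex_from_map copies each key into the pattern as-is, which is fine for the keys in the question but would misbehave for a key such as "c++" or "a.b". The following is a minimal sketch of one way to guard against that, assuming the default ECMAScript grammar. The names escape_regex and escaped_regex_from_map are hypothetical helpers introduced here for illustration; they are not part of the original answer or of the standard library.

```cpp
#include <map>
#include <regex>
#include <string>

// Hypothetical helper (an assumption, not from the original answer):
// backslash-escape every character that is special in the default
// ECMAScript regex grammar so the key is matched literally.
std::string escape_regex(const std::string& key)
{
    static const std::string special = R"(\^$.|?*+()[]{})";
    std::string escaped;
    for (char ch : key) {
        if (special.find(ch) != std::string::npos)
            escaped += '\\';
        escaped += ch;
    }
    return escaped;
}

// Variant of regex_from_map that escapes each key before joining
// the keys with the alternation operator '|'.
std::regex escaped_regex_from_map(const std::map<std::string, std::string>& map)
{
    std::string pattern_str;
    for (const auto& entry : map) {
        if (!pattern_str.empty())
            pattern_str += '|';
        pattern_str += escape_regex(entry.first);
    }
    return std::regex(pattern_str);
}
```

With this variant, a key like "a.b" matches only the literal text "a.b" rather than "a", any character, "b". The lookup by matched text in the replacement function keeps working because the match is still the unescaped key.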
    

    Next, let's write a function that accepts the text to be matched and your replacement map, then iterates over each match and finds the appropriate replacement to build the result string:

    std::string custom_regex_replace(const std::string& text,
        const std::map<std::string, std::string>& replacement_map)
    {
        auto regex = regex_from_map(replacement_map);
        std::string result;
        std::sregex_iterator it(text.begin(), text.end(), regex);
        std::sregex_iterator end;
    
        size_t last_pos = 0; // end of the previous match
        for (; it != end; ++it) {
            // Copy the unmatched text between the previous match and this one.
            result += text.substr(last_pos, it->position() - last_pos);
            // Look up the replacement for exactly what was matched.
            result += replacement_map.at(it->str());
            last_pos = it->position() + it->length();
        }
        // Copy whatever follows the last match.
        result += text.substr(last_pos, text.size() - last_pos);
    
        return result;
    }
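
One caveat of building the alternation straight from the map: std::map iterates its keys in sorted order, and in the default ECMAScript grammar the alternatives are tried from left to right. If one key were a prefix of another (say "c" and "ch"), the shorter key would come first and the longer one would never be matched. The keys in the question do not overlap this way, so the code above is unaffected. The sketch below shows one possible workaround, ordering the keys longest first. The function name longest_first_regex_from_map is a hypothetical name chosen for this illustration.

```cpp
#include <algorithm>
#include <map>
#include <regex>
#include <string>
#include <vector>

// Hypothetical variant (an assumption, not from the original answer):
// join the keys longest first so that a longer key is tried before
// any shorter key that is a prefix of it.
std::regex longest_first_regex_from_map(const std::map<std::string, std::string>& map)
{
    std::vector<std::string> keys;
    for (const auto& entry : map)
        keys.push_back(entry.first);

    // stable_sort keeps the map's sorted order among keys of equal length.
    std::stable_sort(keys.begin(), keys.end(),
        [](const std::string& a, const std::string& b) {
            return a.size() > b.size();
        });

    std::string pattern_str;
    for (const auto& key : keys) {
        if (!pattern_str.empty())
            pattern_str += '|';
        pattern_str += key;
    }
    return std::regex(pattern_str);
}
```

For a map containing both "c" and "ch", the resulting pattern starts with "ch", so searching "chat" yields the match "ch" rather than "c".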
    

    Finally, calling our custom replacement function:

    int main() {
        std::map<std::string, std::string> replacement_map = 
        {   {" c", "k"},
            { "ph", "f" },  
            { "th", "z" },
            { "ea", "e" },
            { "c", "s" },
            { "w", "v" }
        };
    
        std::string text = "each word pheels new, cnow?";
        std::string new_text = custom_regex_replace(text, replacement_map);
        std::cout << new_text << std::endl;
    
        return 0;
    }
    

    Input:

    "each word pheels new, cnow?"

    Output:

    "esh vord feels nev,knov?"

    (Note that " c" is replaced with "k", so the space after the comma is removed.)