
How do I do regex substitutions with multiple capture groups?

I'm trying to allow users to filter strings of text using a glob pattern whose only control character is *. Under the hood, I figured the easiest way to filter the list of strings would be to use Js.Re.test, and it is (easy).

Ignoring the * on the user filter string for now, what I'm having difficulty with is escaping all the RegEx control characters. Specifically, I don't know how to replace the capture groups within the input text to create a new string.

So far, I've got this, but it's not quite right:

let input = "test^ing?123[foo";

let escapeRegExCtrl = searchStr => {
    /* One control character followed by any run of non-control characters */
    let re = [%re("/([\\^\\[\\]\\.\\|\\\\\\?\\{\\}\\+][^\\^\\[\\]\\.\\|\\\\\\?\\{\\}\\+]*)/g")];

    let break = ref(false);
    while (!break.contents)  {
        /* With the g flag, each exec_ call resumes after the previous match */
        switch (Js.Re.exec_ (re, searchStr)) {
            | Some(result) => {
                let match = Js.Re.captures(result)[0];
                Js.log2("Matching: ", match)
            }
            | None => {
                break := true;
            }
        }
    }
};

input -> escapeRegExCtrl

If I disregard the "test" portion of the string being skipped, the above will output:

Matching: ^ing  
Matching: ?123 
Matching: [foo

With the above example, at the end of the day, what I'm trying to produce is this (with leading and following .*):

.*test\^ing\?123\[foo.*

But I'm unsure how to achieve creating a contiguous string from the matched capture groups.

(echo "test^ing?123[foo" | sed -r 's_([\^\?\[])_\\\1_g' would get the work done on the command line)


Based on Chris Maurer's answer, there is a method in the JS library that does what I was looking for. A little digging exposed the ReasonML proxy for that method:
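
For reference, the JS-side method is presumably String.prototype.replace, which accepts capture-group references such as $1 in its replacement string. A minimal sketch of the sed one-liner above in plain JavaScript, assuming the same control characters as the attempt earlier (the variable names are illustrative only):

```javascript
// Sketch: escape regex control characters via a capture group.
// The character class mirrors the one used in the attempt above.
const rawInput = "test^ing?123[foo";

// Group 1 captures one control character; "$1" re-inserts it after a backslash.
// "\\$1" is the two-character-prefixed string \$1 once the JS literal is parsed.
const escapedInput = rawInput.replace(/([\^\[\].|\\?{}+])/g, "\\$1");

console.log(escapedInput); // test\^ing\?123\[foo
```

Because replace with a g-flagged regex rebuilds the whole string, the unmatched "test" prefix is carried over untouched, which is what the exec loop was missing.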


  • Let me see if I have this right: you want to implement a character matcher where everything is literal except *. Presumably the * is supposed to work as it does in Windows dir commands, matching zero or more characters.

    Furthermore, you want to implement it by passing a user-entered character string directly to a Regexp match function after suitably sanitizing it to only deal with the *.

    If I have this right, then it sounds like you need to do two things to get the string ready for matching:

    1. Quote all the special regex characters, and
    2. Turn all instances of * into .* or maybe .*?

    Let's keep this simple and process the string in two steps, each one using replace. The special characters in regex are [\^$.|?*+(). Suitably quoting these for replace:

    str.replace(/[\[\\\^\$\.\|\?\+\(\)]/g, '\\$&')

    The character class is just all those special characters quoted. The $& in the replacement specification says to insert whatever matched, so each special character comes out preceded by a backslash. Then pass that result to a second replace for the * to .*? transformation.

    str.replace(/\*+/g, '.*?')
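
    Putting the two steps together, here is a minimal sketch. The helper name globToRegExp and the sample list are made up for illustration, and adding ], { and } to the quoted set is an extra precaution on my part, not something the steps above require:

```javascript
// Hypothetical helper: turn a user glob (only "*" is special) into a RegExp.
function globToRegExp(glob) {
  // Step 1: backslash-escape regex metacharacters, leaving "*" alone.
  // "]", "{" and "}" are included as a precaution (an assumption, see above).
  const quoted = glob.replace(/[\[\]\\^$.|?+(){}]/g, "\\$&");
  // Step 2: collapse each run of "*" into a lazy "match anything".
  const pattern = quoted.replace(/\*+/g, ".*?");
  // Left unanchored, so test() behaves as if wrapped in leading/trailing .*
  return new RegExp(pattern);
}

const names = ["test^ing?123[foo", "testing", "other"];
const matches = names.filter((s) => globToRegExp("^ing*[foo").test(s));
console.log(matches); // [ 'test^ing?123[foo' ]
```

    If you want the glob to cover the whole string instead of any substring, wrap the pattern in ^ and $ before constructing the RegExp.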