php regex regex-group realpath

Custom realpath() using regex


I want to write my own realpath() function that uses regex and does not require the file to exist.

What I did so far

function my_realpath (string $path): string {
    if ($path[0] != '/') {
        $path = __DIR__.'/../../'.$path;
    }
    
    $path = preg_replace("~/\./~", '/', $path); // collapses /./ to / (an empty replacement would glue the neighbouring segments together)
    $path = preg_replace("~\w+/\.\./~", '', $path); // removes ../ from path

    return $path;
}

What is not correct

The problem is if I have this string:

"folders/folder1/folder5/../../folder2"

it removes only the first occurrence (folder5/../):

"folders/folder1/../folder2"

Question

How do I remove (with regex) all folders that are followed by the same number of "../"?

Examples

"folders/folder1/folder5/../../folder2" -> "folders/folder2"

"folders/folder1/../../../folder2" -> "../folder2"

"folders/folder1/folder5/../folder2" -> "folders/folder1/folder2"

Can we tell regex that: "~(\w+){n}/(../){n}~", n being greedy but same in both groups?


Solution

  • You can use a recursion-based pattern like

    preg_replace('~(?<=/|^)(?!\.\.(?![^/]))[^/]+/(?R)?\.\.(?:/|$)~', '', $url)
    

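    Details:

      • (?<=/|^) - the match must start right after a / or at the start of the string
      • (?!\.\.(?![^/])) - the segment that follows must not be exactly ..
      • [^/]+/ - one or more characters other than /, then a /
      • (?R)? - optionally recurses the whole pattern, so a nested "folder/../" pair is consumed first
      • \.\.(?:/|$) - .. followed by / or the end of the string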

    See the PHP demo:

    $strings = ["folders/folder1/folder5/../../folder2", "folders/folder1/../../../folder2", "folders/folder1/folder5/../folder2"];
    foreach ($strings as $url) {
        echo preg_replace('~(?<=/|^)(?!\.\.(?![^/]))[^/\n]+/(?R)?\.\.(?:/|$)~', '', $url) . PHP_EOL;
    }
    // => folders/folder2, ../folder2, folders/folder1/folder2
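
One caveat worth checking before relying on the single-pass recursive pattern: it appears to collapse only strictly nested runs (n folders followed directly by n "../"). The sketch below wraps the same pattern in a hypothetical helper, collapse_once() (the name is an assumption, not from the answer), and runs it on a path where two sibling folders are each cancelled under a common parent. In that case one pass leaves a "../" behind, and a second pass, or the loop-based variant shown further down, is needed.

```php
<?php
// Hypothetical helper name; the pattern is the recursive one from the answer.
function collapse_once(string $path): string {
    return preg_replace('~(?<=/|^)(?!\.\.(?![^/]))[^/]+/(?R)?\.\.(?:/|$)~', '', $path);
}

// Strictly nested run: fully collapsed in one pass.
echo collapse_once('folders/folder1/folder5/../../folder2'), PHP_EOL; // folders/folder2

// Sibling runs (b/../ and c/../) under "a": one pass leaves a residue.
echo collapse_once('a/b/../c/../../d'), PHP_EOL; // a/../d

// Applying the helper again finishes the job.
echo collapse_once(collapse_once('a/b/../c/../../d')), PHP_EOL; // d
```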
    

    Alternatively, you can use a non-recursive pattern and apply it in a loop:

    (?<![^/])(?!\.\.(?![^/]))[^/]+/\.\.(?:/|$)
    

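    Details:

      • (?<![^/]) - the match must start at the start of the string or right after a /
      • (?!\.\.(?![^/])) - the segment that follows must not be exactly ..
      • [^/]+/ - one or more characters other than /, then a /
      • \.\.(?:/|$) - .. followed by / or the end of the string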

    See the PHP demo:

    $strings = ["folders/folder1/folder5/../../folder2", "folders/folder1/../../../folder2", "folders/folder1/folder5/../folder2"];
    foreach ($strings as $url) {
        $count = 0;
        do {
            $url = preg_replace('~(?<![^/])(?!\.\.(?![^/]))[^/]+/\.\.(?:/|$)~', '', $url, -1, $count);
        } while ($count > 0);
        echo "$url" . PHP_EOL;
    }
    

    The fifth argument of preg_replace(), $count, receives the number of replacements made in that call, so the do...while loop keeps replacing until a pass finds no match.

    Output:

    folders/folder2
    ../folder2
    folders/folder1/folder2
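
To tie this back to the question, here is a minimal sketch of how the loop-based pattern could be wrapped in a reusable function. The name normalize_path() and the extra step that drops "." segments are assumptions added for illustration. The base-directory prefixing from the question is left out so that relative inputs stay relative, as in the examples.

```php
<?php
// Sketch only: normalize_path() is a hypothetical name, not part of the answer.
function normalize_path(string $path): string {
    // Drop "." segments: a lone "." at the start or after "/", followed by "/" or the end.
    $path = preg_replace('~(?<![^/])\.(?:/|$)~', '', $path);

    // Repeatedly remove "segment/.." pairs until a pass makes no replacement.
    do {
        $path = preg_replace(
            '~(?<![^/])(?!\.\.(?![^/]))[^/]+/\.\.(?:/|$)~',
            '',
            $path,
            -1,
            $count
        );
    } while ($count > 0);

    return $path;
}

echo normalize_path('folders/folder1/folder5/../../folder2'), PHP_EOL; // folders/folder2
echo normalize_path('folders/folder1/../../../folder2'), PHP_EOL;      // ../folder2
echo normalize_path('a/./b/../c/../../d'), PHP_EOL;                    // d
echo normalize_path('/var/www/./site/../index.php'), PHP_EOL;          // /var/www/index.php
```

This sketch does not touch a leading "../" that has nothing left to cancel, and it leaves empty segments ("//") alone. Whether those cases need handling depends on how the result is used.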