How to normalize a path in Perl? (without checking the filesystem)


I want the Perl equivalent of Python's os.path.normpath():

Normalize a pathname by collapsing redundant separators and up-level references so that A//B, A/B/, A/./B and A/foo/../B all become A/B. This string manipulation may change the meaning of a path that contains symbolic links. […]

For instance, I want to convert '/a/../b/./c//d' into '/b/c/d'.

The path I'm manipulating does NOT represent a real directory in the local file tree. There are no symlinks involved. So a plain string manipulation works fine.

I tried Cwd::abs_path and File::Spec, but they don't do what I want.

    use Cwd ();
    use File::Spec;

    my $path = '/a/../b/./c//d';

    File::Spec->canonpath($path);
    File::Spec->rel2abs($path, '/');
    # Both return '/a/../b/c/d'.
    # They don't remove '..' because doing so might change
    # the meaning of the path if symlinks are involved.

    Cwd::abs_path($path);
    # Returns undef.
    # This looks the path up in the filesystem, which I don't want.

    Cwd::fast_abs_path($path);
    # Dies with an error: No such file or directory


Solution

  • Given that File::Spec does almost what I need, I ended up writing a function that removes ../ from the output of File::Spec->canonpath(). The full code including tests is available as a GitHub Gist.

    use strict;
    use warnings;
    use File::Spec;
    
    sub path_normalize_by_string_manipulation {
        my $path = shift;
    
        # canonpath does string manipulation, but does not remove "..".
        my $ret = File::Spec->canonpath($path);
    
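        # Remove each "name/../" pair with a regex.
        # Note: a ".." at the very end of the path has no slash after it,
        # so this regex leaves it in place (e.g. "/a/b/.." stays as is).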
        while ($ret =~ s{
            (^|/)              # Either the beginning of the string, or a slash, save as $1
            (                  # Followed by one of these:
                [^/]|          #  * Any one character (except slash, obviously)
                [^./][^/]|     #  * Two characters where
                [^/][^./]|     #    they are not ".."
                [^/][^/][^/]+  #  * Three or more characters
            )                  # Followed by:
            /\.\./             # "/", followed by "../"
            }{$1}x
        ) {
            # Repeat this substitution until not possible anymore.
        }
    
        # Re-adding the trailing slash, if needed.
        if ($path =~ m!/$! && $ret !~ m!/$!) {
            $ret .= '/';
        }
    
        return $ret;
    }
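
For comparison, here is a minimal sketch of a different approach: instead of a regex, split the path on slashes and keep the surviving components on a stack. This is not the code from the Gist. The function name `normalize_path_by_components` is hypothetical, and the sketch assumes POSIX-style paths with `/` as the only separator. Unlike the function above, it also resolves a `..` at the end of the path and does not preserve a trailing slash.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): normalizes a
# POSIX-style path purely as a string, without touching the filesystem.
sub normalize_path_by_components {
    my ($path) = @_;
    return '.' if $path eq '';

    my $is_absolute = $path =~ m{^/} ? 1 : 0;
    my @stack;

    # Splitting on runs of slashes collapses "//" into a single separator.
    for my $part (split m{/+}, $path) {
        next if $part eq '' || $part eq '.';    # skip empty and "." parts
        if ($part eq '..') {
            if (@stack && $stack[-1] ne '..') {
                pop @stack;                     # "name/.." cancels out
            }
            elsif (!$is_absolute) {
                push @stack, '..';              # keep leading ".." in relative paths
            }
            # In an absolute path, ".." at the root is simply dropped.
        }
        else {
            push @stack, $part;
        }
    }

    my $result = join '/', @stack;
    return "/$result" if $is_absolute;
    return $result eq '' ? '.' : $result;
}

print normalize_path_by_components('/a/../b/./c//d'), "\n";    # /b/c/d
print normalize_path_by_components('/a/b/..'), "\n";           # /a
print normalize_path_by_components('../x/./y'), "\n";          # ../x/y
```

Because the stack only ever pops a real component, leading `..` parts of a relative path are kept, which matches what the quoted normpath description implies for paths such as `../x`.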