perl

Regex to extract one part of the URI


Here's the full URI:

https://example.com/entry/#/view/TCMaftR7cPYyC3q61TnI6_Mx8PwDTsnVyo9Z6nsXHDRzrN5ftuXxHN7NvIGK34-z/366792786/aHR0cHM6Ly9lcGwuaXJpY2EuZ292LmlyL0ltZWlBZnRlclJlZ2lzdGVyP2ltZWk9MzU5NzQ0MzkxMDc2Mjg4

I want to extract the base64 string after /view/ and before the numeric part in this case 366792786:

This is the part I'm trying to match:

TCMaftR7cPYyC3q61TnI6_Mx8PwDTsnVyo9Z6nsXHDRzrN5ftuXxHN7NvIGK34-z

I've managed to go this far:

my $uri = "https://example.com/entry/#/view/TCMaftR7cPYyC3q61TnI6_Mx8PwDTsnVyo9Z6nsXHDRzrN5ftuXxHN7NvIGK34-z/366792786/aHR0cHM6Ly9lcGwuaXJpY2EuZ292LmlyL0ltZWlBZnRlclJlZ2lzdGVyP2ltZWk9MzU5NzQ0MzkxMDc2Mjg4";
if ($uri =~ m/view\/(.+)\//g) {
    print $&;
}

But, it only produces the whole thing:

view/TCMaftR7cPYyC3q61TnI6_Mx8PwDTsnVyo9Z6nsXHDRzrN5ftuXxHN7NvIGK34-z/366792786/

Please help me find the regex.


Solution

  • You should use $1 instead of $& to capture what is in the capturing parentheses.

    Also:


    my $uri = "https://example.com/entry/#/view/TCMaftR7cPYyC3q61TnI6_Mx8PwDTsnVyo9Z6nsXHDRzrN5ftuXxHN7NvIGK34-z/366792786/aHR0cHM6Ly9lcGwuaXJpY2EuZ292LmlyL0ltZWlBZnRlclJlZ2lzdGVyP2ltZWk9MzU5NzQ0MzkxMDc2Mjg4";
    if ($uri =~ m{view/([^/]+)}) {
        print $1;
    }
    

    Outputs:

    TCMaftR7cPYyC3q61TnI6_Mx8PwDTsnVyo9Z6nsXHDRzrN5ftuXxHN7NvIGK34-z