I am trying to create a RegEx pattern which will match two parts of a URL.
Example URL:
app.company.com/base-path?parameter1=stuff&parameter2=morestuff&parameter3=IMPORTANT%20THING
In this case I want the pattern to match only when the URL has both a base path and the third parameter, and to capture both:
/base-path
and all of parameter3=IMPORTANT%20THING
Here is my answer:
/^.+?(\/.+?)\?.+?&(parameter3=.+)$/gm
I do not know which language you are using. This is the PCRE2 flavor, which PHP 7.3+ uses, but I think it should be easy to port to other languages.
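As a rough illustration of porting it, here is a minimal sketch in Python. Python is my assumption (the question does not name a language), and the variable names are only for illustration. The `re.MULTILINE` flag stands in for `/m`, and `search` returns the first match, so there is no direct counterpart of `/g` here.

```python
import re

# Python port of the PCRE2 pattern above; re.MULTILINE mirrors the /m flag.
PATTERN = re.compile(r"^.+?(/.+?)\?.+?&(parameter3=.+)$", re.MULTILINE)

url = ("app.company.com/base-path?parameter1=stuff"
       "&parameter2=morestuff&parameter3=IMPORTANT%20THING")

match = PATTERN.search(url)
if match:
    # Group 1 is the base path, group 2 is the whole third parameter.
    base_path, third_param = match.groups()
    print(base_path)    # /base-path
    print(third_param)  # parameter3=IMPORTANT%20THING
```

If either the base path or `&parameter3=` is missing, `search` returns `None` and nothing is printed.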
There is some risk in using a regex here: an attacker can craft a malicious parameter1 or parameter2 value that fools the pattern and gives you an unexpected result, especially if you match AFTER DECODING the URL.
For example, take this URL:
app.company.com/base-path?parameter1=stuff&parameter2=%26parameter3%3Dmorestuff&parameter3=IMPORTANT%20THING
The attacker sets parameter2=%26parameter3%3Dmorestuff, and after decoding you get this URL:
app.company.com/base-path?parameter1=stuff&parameter2=&parameter3=morestuff&parameter3=IMPORTANT THING
What the regex then captures is parameter3=morestuff&parameter3=IMPORTANT THING, which is not what you expected.
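To reproduce the problem, here is a small sketch, again assuming Python. It uses `urllib.parse.unquote` to do the decoding and then applies the same pattern to the decoded string. The names `crafted`, `decoded` and `spoofed` are just illustrative.

```python
import re
from urllib.parse import unquote

# Same pattern as in the answer, without flags (single-line input).
PATTERN = re.compile(r"^.+?(/.+?)\?.+?&(parameter3=.+)$")

crafted = ("app.company.com/base-path?parameter1=stuff"
           "&parameter2=%26parameter3%3Dmorestuff"
           "&parameter3=IMPORTANT%20THING")

# Decoding first turns %26 into "&" and %3D into "=",
# which injects a fake "&parameter3=" into the query string.
decoded = unquote(crafted)
spoofed = PATTERN.search(decoded).group(2)

print(decoded)
print(spoofed)  # parameter3=morestuff&parameter3=IMPORTANT THING
```

The lazy `.+?` stops at the first `&parameter3=` it finds, so the injected one wins.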
So, if you really want to use a regex, DO NOT DECODE THE URL BEFORE MATCHING.
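One way to follow that order is to match the raw URL and decode only the captured value afterwards. The sketch below assumes Python, and `extract` is a hypothetical helper name of mine, not something from a library.

```python
import re
from typing import Optional, Tuple
from urllib.parse import unquote

# Same pattern as in the answer, applied to the still-encoded URL.
PATTERN = re.compile(r"^.+?(/.+?)\?.+?&(parameter3=.+)$")

def extract(raw_url: str) -> Optional[Tuple[str, str]]:
    """Match the encoded URL first, then decode only the captured value."""
    match = PATTERN.search(raw_url)
    if match is None:
        return None
    base_path, third_param = match.groups()
    # Split "parameter3=<value>" at the first "=" and decode the value only.
    _name, _sep, value = third_param.partition("=")
    return base_path, unquote(value)

crafted = ("app.company.com/base-path?parameter1=stuff"
           "&parameter2=%26parameter3%3Dmorestuff"
           "&parameter3=IMPORTANT%20THING")

result = extract(crafted)
print(result)  # ('/base-path', 'IMPORTANT THING')
```

Because the encoded `%26parameter3%3D` contains no literal `&` or `=`, it cannot be mistaken for the real parameter at match time.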