The quote function in urllib.parse can be used to encode URL components, but its behavior differs from the standard JavaScript encoder.
In Python:
>>> import urllib.parse
>>> urllib.parse.quote('(a+b)')
'%28a%2Bb%29'
In JavaScript:
> encodeURIComponent('(a+b)')
"(a%2Bb)"
Why is the Python function more "strict" when encoding the URL component?
If I understood it right, brackets are not reserved characters in URLs, so I don't understand why urllib.parse.quote escapes them.
As of RFC 3986, brackets are reserved: ( and ) belong to the sub-delims set of reserved characters.
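By default, Python will percent-encode every character passed to quote() except for ASCII letters, digits, and _.-~ (the tilde only since Python 3.7), plus /, which is the default value of the safe parameter. However, quote() is tunable. If you want strict RFC 3986 behavior, set safe to '~', so that / is encoded as well and ~ is left alone on every Python 3 version: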
urllib.parse.quote(string, safe='~')
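To see what the safe argument changes, here is a small sketch. The sample string is made up for illustration, and the outputs in the comments assume Python 3.7 or later, where ~ is never encoded:

```python
from urllib.parse import quote

# Hypothetical sample containing brackets, a plus sign, a slash, and a tilde.
sample = "(a+b)/c~d"

# Default safe='/': the slash survives; brackets and '+' are percent-encoded.
default_quoted = quote(sample)

# safe='~': the default '/' is no longer safe, so the slash is encoded too.
strict_quoted = quote(sample, safe="~")

print(default_quoted)  # %28a%2Bb%29/c~d
print(strict_quoted)   # %28a%2Bb%29%2Fc~d
```

The second form is usually the one you want for a single path segment or query value, because a literal / inside the component would otherwise be read as a path separator.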
If you only want to match the JavaScript behavior that you showed (you didn't state which parts of which ECMAScript standard your platform conforms to), mark the brackets as safe:
urllib.parse.quote(string, safe='()')
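For a closer match, note that encodeURIComponent as defined in the ECMAScript specification leaves letters, digits, and - _ . ! ~ * ' ( ) unescaped. A rough Python equivalent could look like the sketch below. The names encode_uri_component and JS_UNESCAPED are my own, not part of urllib, and I have not tried to reproduce JavaScript's handling of malformed input such as lone surrogates:

```python
from urllib.parse import quote

# Punctuation that encodeURIComponent leaves alone, beyond the letters,
# digits, and "-_." that quote() never encodes anyway.
JS_UNESCAPED = "!~*'()"


def encode_uri_component(text: str) -> str:
    """Percent-encode text roughly the way JavaScript's encodeURIComponent does."""
    # Passing safe explicitly replaces the default safe='/', so '/' is encoded.
    return quote(text, safe=JS_UNESCAPED)


print(encode_uri_component("(a+b)"))  # (a%2Bb)
print(encode_uri_component("a/b c"))  # a%2Fb%20c
```

Like the JavaScript function, this encodes spaces as %20 rather than +, because quote() works on UTF-8 bytes and does not apply form encoding.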