pythonurlsanitize

How to sanitize url string in Python?


I have a python app that works with URL lists and produces bash script as an output.

How to sanitize URL so that a malicious user could not inject bash commands that will be executed on my infrastructure.

For instance:

http://www.circl.lu/a.php?rm -Rf /etc

Solution

  • I guess urllib is an option to parse the urls, in order to escape harmful characters. At least it looks like a good resource for your use case. See the docs of url-quoting.

    from urllib.parse import quote
    
    quote('touch foo', safe='/')
    quote('rm -Rf /etc', safe='/')
    quote('http://www.circl.lu/a.php?rm -Rf /etc', safe='/:?&')
    
    #'touch%20foo'
    #'rm%20-Rf%20/etc'
    #'http://www.circl.lu/a.php?rm%20-Rf%20/etc'