
When do you need to use iri_to_uri after using url_has_allowed_host_and_scheme in Django?


In the Django 3.0 release notes, this comment is made about url_has_allowed_host_and_scheme:

To avoid possible confusion as to effective scope, the private internal utility is_safe_url() is renamed to url_has_allowed_host_and_scheme(). That a URL has an allowed host and scheme doesn’t in general imply that it’s “safe”. It may still be quoted incorrectly, for example. Ensure to also use iri_to_uri() on the path component of untrusted URLs.

I understand what the purpose of url_has_allowed_host_and_scheme is. Take the common use-case of providing a next query parameter, for example: http://example.com/foobar?next=http%3A%2F%2Fexample2.com%2Fhello . You could program the view that handles this path to redirect to the URL provided by the next parameter, in this case: http://example2.com/hello . If the URL is not validated, then this is an "open redirect" vulnerability. Malicious actors could take advantage of an open redirect to hide malicious URLs behind a URL that looks trustworthy.
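To make the example above concrete, here is a small standard-library sketch (no Django involved) that pulls the next parameter out of that URL and shows which host it points at. The variable names are only illustrative.

```python
from urllib.parse import parse_qs, urlsplit

# The URL from the example above, with a percent-encoded `next` parameter.
url = "http://example.com/foobar?next=http%3A%2F%2Fexample2.com%2Fhello"

# parse_qs decodes the percent-encoding of the query string values.
query = parse_qs(urlsplit(url).query)
next_url = query["next"][0]

# The host the view would redirect to if it trusted `next` blindly.
target_host = urlsplit(next_url).netloc

print(next_url)     # http://example2.com/hello
print(target_host)  # example2.com
```

The link looks like it belongs to example.com, but an unvalidated redirect would send the visitor to example2.com.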

You can use url_has_allowed_host_and_scheme to ensure that the URL has the expected hostnames and scheme.

My question concerns iri_to_uri. The documentation implies that you need to use this function as well. When would I need to use it?


Solution

  • Here's how you would implement a safe redirect:

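    from django.core.exceptions import SuspiciousOperation
    from django.shortcuts import redirect
    from django.utils.encoding import iri_to_uri
    from django.utils.http import url_has_allowed_host_and_scheme
    
    def example_view(request):
        # Use .get() so a missing parameter does not raise a KeyError.
        next_url = request.GET.get('next', '')
        # allowed_hosts=None accepts only URLs without a host (relative URLs).
        if url_has_allowed_host_and_scheme(next_url, allowed_hosts=None):
            # Percent-encode characters that may not appear raw in a URI.
            return redirect(iri_to_uri(next_url))
        # A bare `raise` fails here because no exception is active.
        raise SuspiciousOperation('Unsafe redirect target')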
    

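    The iri_to_uri call makes sure that the resulting URL is quoted correctly. Characters that are not allowed to appear raw in a URI, such as the é in /café/, have to be percent-encoded before they end up in an HTTP message. For example, the first line of an HTTP request needs to be in a format like this: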

    GET /caf%C3%A9/ HTTP/1.0
    

    The URL needs to be escaped there because a raw space or non-ASCII character would produce a malformed request line.
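    As a rough illustration of the kind of quoting involved, here is a standard-library sketch. The helper percent_encode_iri is hypothetical and is not Django's code; the set of characters left unescaped is an assumption modelled on what iri_to_uri is documented to preserve, so treat it as an approximation and use the real function in a project.

```python
from urllib.parse import quote

# Assumption: characters left untouched, modelled on Django's iri_to_uri.
# "%" is included so that already-encoded URLs are not encoded twice.
SAFE_CHARS = "/#%[]=:;$&()+,!?*@'~"

def percent_encode_iri(iri: str) -> str:
    """Hypothetical stand-in for iri_to_uri: UTF-8 percent-encoding."""
    return quote(iri, safe=SAFE_CHARS)

print(percent_encode_iri("/café/"))         # /caf%C3%A9/
print(percent_encode_iri("/hello world/"))  # /hello%20world/
print(percent_encode_iri("/caf%C3%A9/"))    # /caf%C3%A9/ (unchanged)
```

    The last call shows that a URL which is already quoted passes through unchanged under this sketch, so applying it more than once does no harm.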

    To be honest, I'm still not entirely sure why the explicit iri_to_uri call is needed here, because Django's utilities such as redirect escape the URL automatically before it goes out on the wire in the Location header of the response.