php sanitize filter-var filter-input

PHP FILTER_SANITIZE_URL and Swedish domain names


I am experimenting with filter_input and filter_var and I am currently trying to sanitize URLs with FILTER_SANITIZE_URL.

The test program gets its input from a GET variable that contains a URL (e.g. foo.com/bar.php?a=http://www.domain.se). It works fine as long as I don't use Swedish domain names. For example, with foo.com/bar.php?a=http://www.äta.se the value is sanitized to a = http://www.ta.se, which obviously isn't the same domain.


Solution

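  • Domain names with special characters are technically not transferred with their non-ASCII characters (like the ä in your case); they are Punycode-encoded into a pure ASCII form. FILTER_SANITIZE_URL keeps only ASCII characters that are allowed in URLs, which is why the ä is dropped. The calling program should encode its URLs accordingly.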

    See:
    http://en.wikipedia.org/wiki/Internationalized_domain_name
    http://en.wikipedia.org/wiki/Punycode

    Example:
    http://www.äta.se becomes http://www.xn--ta-uia.se
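
If you cannot control the caller, one option is to convert the host to its Punycode form yourself before sanitizing. The following is only a minimal sketch: to_ascii_url() is a hypothetical helper name, not a PHP function, and it assumes the intl extension is loaded (for idn_to_ascii()) and that parse_url() can extract the host from the UTF-8 input.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of PHP): converts the host of a URL to its
 * ASCII (Punycode) form, then applies FILTER_SANITIZE_URL.
 * Returns null if intl is missing, no host is found, or conversion fails.
 */
function to_ascii_url(string $url): ?string
{
    // idn_to_ascii() is provided by the intl extension.
    if (!function_exists('idn_to_ascii')) {
        return null;
    }

    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // no host component found
    }

    // The UTS #46 variant is passed explicitly because older PHP versions
    // defaulted to a deprecated variant.
    $asciiHost = idn_to_ascii($host, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
    if ($asciiHost === false) {
        return null; // host could not be converted
    }

    // Swap only the first occurrence of the host in the original URL.
    $pos = strpos($url, $host);
    if ($pos === false) {
        return null;
    }
    $asciiUrl = substr_replace($url, $asciiHost, $pos, strlen($host));

    $clean = filter_var($asciiUrl, FILTER_SANITIZE_URL);
    return $clean === false ? null : $clean;
}

$url = 'http://www.äta.se';

// Sanitizing directly strips the UTF-8 bytes of "ä".
echo filter_var($url, FILTER_SANITIZE_URL), PHP_EOL; // http://www.ta.se

// Converting the host first keeps the domain intact (null without intl).
var_dump(to_ascii_url($url)); // with intl: "http://www.xn--ta-uia.se"
```

Plain ASCII URLs such as http://www.domain.se should pass through this helper unchanged, so it can be applied to every incoming value. Note that it only converts the host; non-ASCII characters in the path or query string would still need percent-encoding.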