PHP filter_var() - FILTER_VALIDATE_URL


The FILTER_VALIDATE_URL filter seems to have some trouble validating non-ASCII URLs:

var_dump(filter_var('http://pt.wikipedia.org/wiki/', FILTER_VALIDATE_URL)); // http://pt.wikipedia.org/wiki/
var_dump(filter_var('http://pt.wikipedia.org/wiki/Guimarães', FILTER_VALIDATE_URL)); // false

Why isn't the last URL correctly validated? And what are the possible workarounds? Running PHP 5.3.0.

I'd also like to know where I can find the source code of the FILTER_VALIDATE_URL validation filter.


Solution

  • The validation starts here:
    http://svn.php.net/viewvc/php/php-src/trunk/ext/filter/logical_filters.c?view=markup

    and the actual URL parsing is done in /trunk/ext/standard/url.c

    At first glance I can't see anything that purposely rejects non-ASCII characters, so it's probably just a lack of Unicode support. PHP is not good at handling non-ASCII characters anywhere. :(
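
    As for a workaround: the PHP manual notes that FILTER_VALIDATE_URL only accepts ASCII URLs, so one option is to percent-encode the non-ASCII bytes before validating. Below is a minimal sketch, assuming the input is UTF-8 and the non-ASCII characters sit in the path or query string rather than in the host name. validate_utf8_url is a hypothetical helper name, not a built-in PHP function.

```php
<?php
// Hypothetical helper (not part of PHP): percent-encode every non-ASCII byte
// so that only ASCII reaches FILTER_VALIDATE_URL, then validate as usual.
// Assumption: $url is UTF-8 and its host name is already plain ASCII.
function validate_utf8_url($url)
{
    // No /u modifier on purpose: the pattern matches single bytes >= 0x80,
    // and each one becomes its %XX escape.
    $encoded = preg_replace_callback(
        '/[\x80-\xFF]/',
        function ($m) {
            return rawurlencode($m[0]);
        },
        $url
    );

    // Returns the encoded URL on success, false on failure.
    return filter_var($encoded, FILTER_VALIDATE_URL);
}

// Expected: the encoded URL 'http://pt.wikipedia.org/wiki/Guimar%C3%A3es', not false.
var_dump(validate_utf8_url('http://pt.wikipedia.org/wiki/Guimarães'));
```

    Note that this returns the percent-encoded form of the URL, not the original string. It does not help with internationalized host names: those would need converting to Punycode first (for example with idn_to_ascii() from the intl extension, if it is available).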