
Invalid URI with Chinese characters (Java)


I'm having trouble opening a URL connection when the URL contains Chinese characters. The same code works when the query contains only Latin characters:

String xstr = "维也纳恩斯特哈佩尔球场";
URI uri = new URI("http", "ajax.googleapis.com", "/ajax/services/language/detect", "v=1.0&q=" + xstr, null);
URL url = uri.toURL();
URLConnection connection = url.openConnection();
InputStream is = connection.getInputStream();

The getInputStream() call results in:

java.lang.IllegalArgumentException: Invalid uri 'http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&q=???????????': Invalid query

Solution

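  • URI.toURL() does not percent-encode non-ASCII characters, so the Chinese characters reach the URL unescaped and the query is rejected as invalid. Build the URL from uri.toASCIIString() instead, which escapes non-ASCII characters as percent-encoded UTF-8 bytes: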

    URL url = new URL(uri.toASCIIString());
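
    For reference, here is a minimal self-contained sketch of the fix above. The class name EncodedUrlDemo and the helper toEncodedUrl are made up for illustration; the host and path are the ones from the question. It only builds and prints the URL and does not open a connection.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical demo class; not part of the original question or answer.
final class EncodedUrlDemo {

    // Builds the request URL with the query value percent-encoded as UTF-8.
    static URL toEncodedUrl(String queryValue)
            throws URISyntaxException, MalformedURLException {
        // The multi-argument constructor quotes illegal characters but
        // leaves non-ASCII characters such as Chinese text unescaped.
        URI uri = new URI("http", "ajax.googleapis.com",
                "/ajax/services/language/detect", "v=1.0&q=" + queryValue, null);
        // toASCIIString() escapes the remaining non-ASCII characters,
        // so the resulting URL contains only US-ASCII.
        return new URL(uri.toASCIIString());
    }

    public static void main(String[] args) throws Exception {
        URL url = toEncodedUrl("维也纳恩斯特哈佩尔球场");
        System.out.println(url); // the query now consists of %XX escapes
    }
}
```

    The returned URL can then be passed to openConnection() as in the question.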