My Java standalone application gets a URL (which points to a file) from the user and I need to hit it and download it. The problem I am facing is that I am not able to encode the HTTP URL address properly...
Example:
URL: http://search.barnesandnoble.com/booksearch/first book.pdf
java.net.URLEncoder.encode(url.toString(), "ISO-8859-1");
returns me:
http%3A%2F%2Fsearch.barnesandnoble.com%2Fbooksearch%2Ffirst+book.pdf
But, what I want is
http://search.barnesandnoble.com/booksearch/first%20book.pdf
(space replaced by %20)
I guess URLEncoder
is not designed to encode HTTP URLs... The JavaDoc says "Utility class for HTML form encoding"... Is there any other way to do this?
The java.net.URI class can help; in the documentation of URL you find
Note, the URI class does perform escaping of its component fields in certain circumstances. The recommended way to manage the encoding and decoding of URLs is to use an URI
Use one of the constructors with more than one argument, like:
URI uri = new URI(
"http",
"search.barnesandnoble.com",
"/booksearch/first book.pdf",
null);
URL url = uri.toURL();
//or String request = uri.toString();
(the single-argument constructor of URI does NOT escape illegal characters)
Only illegal characters get escaped by above code - it does NOT escape non-ASCII characters (see fatih's comment).
The toASCIIString
method can be used to get a String only with US-ASCII characters:
URI uri = new URI(
"http",
"search.barnesandnoble.com",
"/booksearch/é",
null);
String request = uri.toASCIIString();
For an URL with a query like http://www.google.com/ig/api?weather=São Paulo
, use the 5-parameter version of the constructor:
URI uri = new URI(
"http",
"www.google.com",
"/ig/api",
"weather=São Paulo",
null);
String request = uri.toASCIIString();