java url construction

java.net.URL bug in constructing URLs?


The expression new URL(new URL(new URL("http://localhost:4567"), "abc"), "def") produces (IMHO incorrectly) this URL: http://localhost:4567/def

The expression new URL(new URL(new URL("http://localhost:4567"), "abc/"), "def"), on the other hand, produces the URL I actually want: http://localhost:4567/abc/def

The only difference is the trailing slash in the "abc" constructor argument.

Is this intended behavior, or is it a bug that should be fixed in the URL class?
After all, the point of using a helper class for URL construction is not having to worry about slashes.


Solution

  • Quoting the Javadoc of new URL(URL context, String spec):

    Otherwise, the path is treated as a relative path and is appended to the context path, as described in RFC2396.

    See section 5 "Relative URI References" of the RFC2396 spec, specifically section 5.2 "Resolving Relative References to Absolute Form", item 6a:

    All but the last segment of the base URI's path component is copied to the buffer. In other words, any characters after the last (right-most) slash character, if any, are excluded.

    Explanation

    On a web page, the "Base URI" is the page address, e.g. http://example.com/path/to/page.html. A relative link, e.g. <a href="page2.html">, must be interpreted as a sibling to the base URI, so page.html is removed, and page2.html is added, resulting in http://example.com/path/to/page2.html, as intended.
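    The sibling rule can be checked directly with java.net.URL. The following is a minimal sketch: the class name RelativeLinkDemo and the helper resolve are made up for illustration, and the third call adds a base without a trailing slash to show that "to" is then treated as the last segment. On recent JDKs the URL constructors are deprecated, so expect a deprecation warning when compiling.

    ```java
    import java.net.MalformedURLException;
    import java.net.URL;

    class RelativeLinkDemo {
        // Hypothetical helper: resolves a relative reference against a base
        // address using the two-argument java.net.URL constructor.
        static String resolve(String base, String relative) {
            try {
                return new URL(new URL(base), relative).toString();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException(e);
            }
        }

        public static void main(String[] args) {
            // "page.html" is the last segment, so the relative link replaces it.
            System.out.println(resolve("http://example.com/path/to/page.html", "page2.html"));
            // prints http://example.com/path/to/page2.html

            // The base ends in a slash, so the last segment is empty and nothing is dropped.
            System.out.println(resolve("http://example.com/path/to/", "page2.html"));
            // prints http://example.com/path/to/page2.html

            // No trailing slash: "to" is the last segment and is replaced.
            System.out.println(resolve("http://example.com/path/to", "page2.html"));
            // prints http://example.com/path/page2.html
        }
    }
    ```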

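    The Java URL class implements this logic. Without a trailing slash, "abc" is the last segment of the context path, so it is dropped and replaced by "def". With "abc/", the last segment is empty, so "def" is appended after the slash. That is exactly how it is supposed to work.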

    It is by design, i.e. not a bug.
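If the goal is to append path segments without thinking about slashes, one possible workaround is to make sure the context ends with a slash before resolving against it. The sketch below is only an illustration: UrlJoinSketch, resolve and child are hypothetical names, and child assumes the context has no query string or fragment and that the segment does not start with a slash.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlJoinSketch {
    // Plain RFC-style resolution, as done by new URL(URL context, String spec).
    static String resolve(String context, String spec) {
        try {
            return new URL(new URL(context), spec).toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Hypothetical helper: treats the context path as a directory by adding
    // a trailing slash when it is missing, then resolves the segment.
    // Assumes the context carries no query string or fragment.
    static String child(String context, String segment) {
        String base = context.endsWith("/") ? context : context + "/";
        return resolve(base, segment);
    }

    public static void main(String[] args) {
        String root = "http://localhost:4567";

        // The two cases from the question.
        System.out.println(resolve(resolve(root, "abc"), "def"));
        // prints http://localhost:4567/def
        System.out.println(resolve(resolve(root, "abc/"), "def"));
        // prints http://localhost:4567/abc/def

        // With the helper, the trailing slash no longer matters.
        System.out.println(child(child(root, "abc"), "def"));
        // prints http://localhost:4567/abc/def
    }
}
```

This keeps the standard resolution rules intact and only changes what is passed in as the context, so the result for "abc/" is the same as before.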