java http rfc5988

Is a URI containing a comma valid in an HTTP Link header?


Is the following HTTP Link header, containing a comma, valid?

Link: <http://www.example.com/foo,bar.html>; rel="canonical"

RFC5988 says:

Note that extension relation types are REQUIRED to be absolute URIs in Link headers, and MUST be quoted if they contain a semicolon (";") or comma (",") (as these characters are used as delimiters in the header itself).

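This doesn't cover the URI inside each link-value, however. That must be a URI-Reference as per RFC 3986 (the RFC from which RFC5988 imports that rule), which seems to allow a comma. The Link header itself can also carry multiple comma-separated values, as in this example from RFC5988 section 5.5: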

Link: </TheBook/chapter2>;
      rel="previous"; title*=UTF-8'de'letztes%20Kapitel,
      </TheBook/chapter4>;
      rel="next"; title*=UTF-8'de'n%c3%a4chstes%20Kapitel 

I'm parsing this Link header in Java with BasicHeaderValueParser from Apache HttpCore 4.4.9, using the following code:

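import org.apache.http.HeaderElement;
import org.apache.http.message.BasicHeaderValueParser;

// Passing null as the parser falls back to the default BasicHeaderValueParser instance.
final String linkHeader = "<http://www.example.com/foo,bar.html>; rel=\"canonical\"";
final HeaderElement[] parsedHeaders = BasicHeaderValueParser.parseElements(linkHeader, null);

for (HeaderElement headerElement : parsedHeaders)
{
    System.out.println(headerElement);
}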

which tokenises on the comma and prints the following:

<http://www.example.com/foo
bar.html>; rel=canonical

Is this valid behaviour?


Solution

  • The comma is of course valid.

    What you're missing is that BasicHeaderValueParser is not generic. It only supports certain HTTP header fields, and "Link" isn't one of them, which is why it treats the comma inside the angle brackets as an element separator (see the syntax description in https://hc.apache.org/httpcomponents-core-ga/httpcore/apidocs/org/apache/http/message/HeaderValueParser.html).
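    If you need to split a Link header value yourself, one possible approach is to treat a comma as a separator only when it sits outside the <...> URI-Reference and outside a quoted parameter value. The following is a minimal sketch of that idea using only the Java standard library. The class name LinkHeaderSplitter and the method splitLinkValues are hypothetical names chosen for illustration, not part of HttpCore, and the code does not parse the individual link-params or validate the URI.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Apache HttpCore.
public class LinkHeaderSplitter {

    /**
     * Splits a Link header field value into its link-values.
     * A comma only separates link-values when it is outside the
     * <URI-Reference> and outside a quoted-string parameter value.
     */
    public static List<String> splitLinkValues(String headerValue) {
        List<String> linkValues = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inUri = false;     // between '<' and '>'
        boolean inQuotes = false;  // inside a quoted-string
        boolean escaped = false;   // previous char was a backslash inside quotes

        for (int i = 0; i < headerValue.length(); i++) {
            char c = headerValue.charAt(i);
            if (escaped) {
                escaped = false;
            } else if (inQuotes) {
                if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inQuotes = false;
                }
            } else if (inUri) {
                if (c == '>') {
                    inUri = false;
                }
            } else if (c == '<') {
                inUri = true;
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                // Top-level comma: finish the current link-value and skip the comma.
                addIfNotEmpty(linkValues, current);
                current.setLength(0);
                continue;
            }
            current.append(c);
        }
        addIfNotEmpty(linkValues, current);
        return linkValues;
    }

    // Adds the trimmed buffer content to the list unless it is empty.
    private static void addIfNotEmpty(List<String> target, StringBuilder sb) {
        String trimmed = sb.toString().trim();
        if (!trimmed.isEmpty()) {
            target.add(trimmed);
        }
    }

    public static void main(String[] args) {
        String header = "<http://www.example.com/foo,bar.html>; rel=\"canonical\"";
        for (String linkValue : splitLinkValues(header)) {
            System.out.println(linkValue);
        }
    }
}
```

    With the header from the question this yields a single link-value that still contains the comma, while a header such as the RFC5988 section 5.5 example is split into two link-values at the comma between them.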