Skip to content

Support advanced Link header expressions #2099

@viliam-durina

Description

@viliam-durina

We've decided to use your library to parse Link headers, hoping that you'll correctly implement the specification intricacies, because doing parsing correctly is tricky. However, it's implemented completely naively, working only in the basic cases. It uses regex for parsing, even though it can't be used for non-context-free grammars.

Here are a bunch of examples:

// should have failed
Link.valueOf("foo bar <url>;rel=\"next\"");

// greedy capture until the last `>`
System.out.println(Link.valueOf("<url>;customparam=foo>;rel=\"next\"").getHref());

// multiple rels not supported, should have been parsed to a collection
System.out.println(Link.valueOf("<url>;rel=\"next last\"").getRel().value());

// incorrect unquoting of values - missing end quote must be ignored, but currently "nex" is returned (missing last "t")
System.out.println(Link.valueOf("<url>;rel=\"next").getRel().value());

// param value not unescaped
System.out.println(Link.valueOf("<url>;rel=\"next\";title=\"foo\\\"bar\"").getTitle());

// incorrect semicolon handling - a semicolon within a quoted value is not treated literally
System.out.println(Link.valueOf("<url>;rel=\"next\";title=\"foo;bar\"").getRel().value());

This is the ABNF for the Link header:

  Link       = #link-value
  link-value = "<" URI-Reference ">" *( OWS ";" OWS link-param )
  link-param = token BWS [ "=" BWS ( token / quoted-string ) ]

And here for the content of the rel param:

  relation-type *( 1*SP relation-type )
where:

  relation-type  = reg-rel-type / ext-rel-type
  reg-rel-type   = LOALPHA *( LOALPHA / DIGIT / "." / "-" )
  ext-rel-type   = URI ; Section 3 of [RFC3986]

See https://httpwg.org/specs/rfc8288.html

So there can be multiple values in the rel param, even URIs and quoted URIs.

Some of the errors can't be fixed without breaking b-w compatibility, e.g. the decoding of href, unescaping of params and multiple rel values. Also, I'm not sure why custom parameters aren't supported - I didn't read the whole spec, but I don't think it prohibits custom params (we don't use them, just wondering).

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions