OAS character encoding and operation "path" segment encoding #2119

@sebastien-rosset

There seems to be some ambiguity about character encoding in the OAS 3.0.2 specification.

  1. The spec should clarify the set of allowed characters in the operation "path" segment (by referencing the proper RFC).
  2. The spec should clarify how these characters are supposed to be encoded in an OAS document.

The OAS specifies "YAML version 1.2 is RECOMMENDED" [...]. A YAML stream is therefore a sequence of Unicode code points, and a YAML processor must support the UTF-16 and UTF-8 character encodings, as specified at https://yaml.org/spec/current.html#id2513364

Additionally, the OAS specifies "Keys used in YAML maps MUST be limited to a scalar string, as defined by the YAML Failsafe schema ruleset".

I don't see any other restriction about the allowed set of characters in the operation "path" segment. So IMO, the set of characters should be as specified in https://tools.ietf.org/html/rfc3986 Section 3.3:

  pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"
  pct-encoded = "%" HEXDIG HEXDIG
  unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
  sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
              / "*" / "+" / "," / ";" / "="
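
To make the grammar above concrete, here is a minimal Python sketch that checks a path against the `pchar` rule. The helper names `is_valid_segment` and `is_valid_path` are hypothetical, and the sketch only covers the segment grammar quoted above, not the full RFC 3986 URI syntax.

```python
import re

# One pchar per RFC 3986 Section 3.3:
#   unreserved / pct-encoded / sub-delims / ":" / "@"
_PCHAR = r"(?:[A-Za-z0-9\-._~]|%[0-9A-Fa-f]{2}|[!$&'()*+,;=]|[:@])"
_SEGMENT_RE = re.compile(_PCHAR + "*")


def is_valid_segment(segment: str) -> bool:
    """Return True if the whole segment consists only of pchar productions."""
    return _SEGMENT_RE.fullmatch(segment) is not None


def is_valid_path(path: str) -> bool:
    """Return True if path starts with "/" and every segment is valid."""
    if not path.startswith("/"):
        return False
    return all(is_valid_segment(s) for s in path[1:].split("/"))
```

Under this grammar, a literal space or a raw non-ASCII character such as `Œ` is not a `pchar`; it only becomes valid once it is percent-encoded (`%20`, `%C5%92`).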

This means the following is a valid path, per RFC 3986:

/store/orderAZa–z09-._~!$&'(abc)*+ , ;=:@%C5%92%C3%AB%F0%9F%8D%87%E2%84%AC:

It's not clear which character encoding is expected. One way to interpret the spec is that the path segment is written as plain Unicode characters in the OAS document, and the OAS tool converts them to percent encoding; for example, the client percent-encodes the "path" segment when it serializes the HTTP request.
With this interpretation, an OAS document could contain the following path, and the OAS tool would percent-encode it.

/store/orderAZa–z09-._~!$&'(abc)*+ , ;=:@Œë🍇ℬ:
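
A minimal sketch of this first interpretation, assuming the tool encodes the raw path as UTF-8 and uses Python's standard `urllib.parse.quote`. The name `encode_path` and the `PATH_SAFE` constant are hypothetical, not part of any OAS tool.

```python
from urllib.parse import quote

# Characters that RFC 3986 allows unescaped in a path segment, in addition
# to the unreserved set (letters, digits, "-", ".", "_", "~") that quote()
# never escapes. "/" is kept so segment separators survive.
PATH_SAFE = "/!$&'()*+,;=:@"


def encode_path(raw_path: str) -> str:
    """Percent-encode a raw Unicode path, assuming UTF-8 as the byte encoding."""
    return quote(raw_path, safe=PATH_SAFE, encoding="utf-8")


print(encode_path("/store/Œë🍇ℬ:"))
# /store/%C5%92%C3%AB%F0%9F%8D%87%E2%84%AC:
```

Note that under this interpretation a `%` in the document is literal data, so a path that was already percent-encoded would be encoded a second time (`%20` becomes `%2520`).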

Alternatively, the spec could require that the path segment be written with percent-encoded characters in the OAS document. At least one tool I know of makes this assumption, though it's not clear whether character encoding was even considered during the implementation. In that interpretation, the path should be written in the OAS document as follows:

/store/orderAZa–z09-._~!$&'(abc)*+ , ;=:@%C5%92%C3%AB%F0%9F%8D%87%E2%84%AC:

Another possibility is that authors can choose whether or not to escape paths, and tools are responsible for applying the proper escaping. Either way, it would be good to clarify this and provide one or two examples.
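
As a rough sketch of that third option, a tool could leave well-formed `%XX` triplets alone and percent-encode everything else, so that the escaped and unescaped spellings normalize to the same string. The function name `normalize_path` is hypothetical, and the rule "any valid triplet is already escaped" is an assumption made for illustration, not something the OAS defines.

```python
import re
from urllib.parse import quote

# A well-formed percent-encoded triplet; the capturing group makes
# re.split() keep the triplets in its result at odd indexes.
_PCT_TRIPLET = re.compile(r"(%[0-9A-Fa-f]{2})")
PATH_SAFE = "/!$&'()*+,;=:@"


def normalize_path(path: str) -> str:
    """Encode only what is not already a valid percent-encoded triplet."""
    parts = _PCT_TRIPLET.split(path)
    return "".join(
        part if index % 2 else quote(part, safe=PATH_SAFE)
        for index, part in enumerate(parts)
    )
```

With this rule, `normalize_path("/store/Œë")` and `normalize_path("/store/%C5%92%C3%AB")` both give `/store/%C5%92%C3%AB`. The cost is that an author can no longer write a literal `%41` without escaping the percent sign as `%2541`, which is exactly the kind of ambiguity the spec would need to settle.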

There is also the potential issue of JSON and YAML character escaping and OAS path templating on top of the URL encoding.

/store/curly-%7B%7D/AZaz-09._~!$&'(abc)*+, ;=:@Œë🍇ℬ/order/{orderId}/.././c///g?c=3&a=1&b=9&c=0#target
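
The layering can be sketched as follows, under stated assumptions: the JSON parser first undoes JSON string escapes, then `{name}` placeholders are split out of the path template, literal text is percent-encoded while existing `%XX` escapes are kept, and parameter values are fully encoded (including `/`). The function `expand_path` is hypothetical, and the encoding choices are one possible reading, not behavior defined by the OAS.

```python
import json
import re
from typing import Mapping
from urllib.parse import quote

# A path template variable such as {orderId}; the capturing group makes
# re.split() return variable names at odd indexes.
_TEMPLATE_VAR = re.compile(r"\{([^{}/]+)\}")


def expand_path(template: str, params: Mapping[str, str]) -> str:
    """Expand {name} placeholders and percent-encode the result."""
    pieces = []
    for index, piece in enumerate(_TEMPLATE_VAR.split(template)):
        if index % 2:
            # Parameter value: encode everything, so "/" cannot add a segment.
            pieces.append(quote(str(params[piece]), safe=""))
        else:
            # Literal text: assumed to keep "%" so %7B%7D stays as written.
            pieces.append(quote(piece, safe="/!$&'()*+,;=:@%"))
    return "".join(pieces)


# Layer 1: the JSON escape \u0152 is resolved by the JSON parser, not by URL rules.
key = json.loads(r'"/store/\u0152/curly-%7B%7D/order/{orderId}"')
# Layers 2 and 3: template expansion, then percent-encoding.
print(expand_path(key, {"orderId": "a/b 1"}))
# /store/%C5%92/curly-%7B%7D/order/a%2Fb%201
```

Dot-segment handling (`/.././`), empty segments, and the query and fragment parts of the example above are deliberately left out of this sketch.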
