Add integer to the list of types supported by the schema object. #87
@auspicacious thanks for this!
As one of the main people involved in re-aligning OAS and JSON Schema, I'm confident that any difference between OAS 3.1 and JSON Schema draft 2020-12 is unintentional and perhaps an error. If you'd like to file an issue in the OAI/OpenAPI-Specification repo we can maybe clean up that language in 3.1.1 and 3.2.0. As for your fix here, I think you are correct that "integer" should be included because it is talking about the |
… the schema field contains schema objects.
@handrews I let the scope creep up a bit, but I feel like this way of writing it is probably clearer. I have another question, which you can push over to an issue I'll try to create in the specification repo when I have a little more time, but is relevant here as well. For context, I was the author of this issue regarding integer formats eight years ago. I'm doing a little brushing up on the possibility I might be getting back into this space and I'm looking at the format registry you mentioned when closing that issue in January. The format registry seems to be defining two new |
I'm fine with doing some extra cleanup here, but it's very easy to step into murky territory so I'd try to be minimal about the changes.
-## The Schema Object
+## The Schema Field
This should stay "Schema Object" because while I think all fields that directly take a Schema Object are called `schema`, I'm not 100% sure, and there are other fields that involve references to schema objects (`mapping` in the Discriminator Object) as well. It's also not true in general that fields with the same name in different Objects have the same value, so it's best to talk about the Schema Object rather than the field.
Okay, so there's a few different concepts being conflated here, I think.
As I understand it, this document is titled "Content of Message Bodies" and is providing an overview of how you can use OpenAPI to describe the content of message bodies. If someone is reading these documents in sequence, I think that it is also the first place that they will encounter the use of schema to describe data.
So there are two separate concepts that need to be conveyed: the names of the fields that, together, make up the definition of the content of a message, and, also, the concept of a schema object, which can be used in many places.
I wrote this deliberately to separate the field called `schema` used at this particular location in a content definition, and the idea of a schema object, which can be used in various different places. In other words, I wrote it this way for the same reason you are asking me not to write it this way, and I'm not clear on your mental model here.
This might be clearer if the "Media Type Object" section above were renamed to "Media Type Field" to help distinguish this more clearly and be more consistent with the "content field" section at the top.
If this is not an appropriate mental model, I think that some more work needs to be done to describe the vocabulary and mental model that is appropriate.
-The [Schema Object](https://spec.openapis.org/oas/v3.1.0#schema-object) defines a data type which can be a primitive (integer, string, ...), an array or an object depending on its `type` field.
+The schema field holds a [Schema Object](https://spec.openapis.org/oas/v3.1.0#schema-object).
This can just be dropped if we keep "Schema Object" as the section header.
-`type` is a string and its possible values are: `number`, `string`, `boolean`, `array` and `object`. Depending on the selected type a number of other fields are available to further specify the data format.
+Schema objects describe the structure of data, and may be nested to describe complex arrays and objects. They are most often used to describe JSON data.
I think it's good to remove the "Depending on the selected type..." language as that's not really how JSON Schema works (you can use all fields at all times, it's just that they don't all apply to every data type, and some combinations are nonsensical- but they are still technically valid).
I'd just leave it as "Schema objects describe the structure of data." The exact relationship between JSON Schema and non-JSON data varies a bit among JSON Schema and OpenAPI versions.
Okay, so delete the second sentence, but why not explain that schema objects are nested? That's how they're usually used. I know there's a few fields that only apply to the root, but for this non-normative introduction I think that might be a bit much to start with.
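To make the nesting point concrete, here is a small Python sketch. The `pet_list_schema` dict and the `iter_subschemas` helper are hypothetical illustrations, not taken from the specification or from this PR, and the helper only follows `items` and `properties`, which are not the only places a schema can nest.

```python
from typing import Any, Iterator

# Hypothetical schema object: an array of objects, each with nested fields.
pet_list_schema: dict[str, Any] = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "tags": {"type": "array", "items": {"type": "string"}},
        },
    },
}


def iter_subschemas(
    schema: dict[str, Any], path: str = "#"
) -> Iterator[tuple[str, dict[str, Any]]]:
    """Yield (path, schema) for a schema and every schema nested under it.

    Assumption: only `items` and `properties` are followed in this sketch.
    """
    yield path, schema
    items = schema.get("items")
    if isinstance(items, dict):
        yield from iter_subschemas(items, f"{path}/items")
    for name, sub in schema.get("properties", {}).items():
        yield from iter_subschemas(sub, f"{path}/properties/{name}")


# Print each nested schema object with the `type` it declares.
for path, sub in iter_subschemas(pet_list_schema):
    print(path, sub.get("type"))
```

Every entry the loop prints is itself a complete schema object, which is the sense in which schema objects nest.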
-For example, for `string` types the length of the string can be limited with `minLength` and `maxLength`. Similarly, `integer` types, accept `minimum` and `maximum` values. No matter the type, if the amount of options for the data is limited to a certain set, it can be specified with the `enum` array. All these properties are listed in the [Schema Object](https://spec.openapis.org/oas/v3.1.0#schema-object) specification.
+Each schema object has a field called `type`, which defines the type of data expected. Six of the seven possible values for `type` correspond directly to JSON's types as defined in [RFC 8259](https://datatracker.ietf.org/doc/html/rfc8259): `string`, `number`, `boolean`, `null`, `object`, and `array`. However, OpenAPI defines an additional type called `integer`, which indicates a JSON number without a fraction or exponent part. In other words, if your schema object is of type `integer`, it tells your users to expect a JSON number that looks like `123`, `-123`, or `0`, but not `1.0` or `1.0e-2`.
`integer` is defined by JSON Schema, not OpenAPI (we should ignore any inadvertent discrepancies between the two). Even in OAS 3.0, `integer` as a type comes from JSON Schema even if the wording suggests otherwise.
I'm also hesitant to get into the exact definition of `integer` as it changed in subtle ways between certain JSON Schema drafts. It's better to just let folks look at JSON Schema-related docs for that. For example, `1.0` is an integer in the JSON Schema drafts used for 3.1 and (I'm 99% sure) 3.0. But (I think?) not in 2.0. I'd just note that JSON Schema adds an `integer` type to its `type` keyword for convenience and leave it at that.
I was going to save this for an issue, but, although I don't know the context that decision was made in, `1.0` really, really, really should not be considered an integer.
It will cause immediate problems for simple clients in, e.g., Python, if that is allowed:
>>> import json
>>> isinstance(json.loads('1'), int)
True
>>> isinstance(json.loads('1'), float)
False
>>> isinstance(json.loads('1.0'), int)
False
>>> isinstance(json.loads('1.0'), float)
True
and the OpenAPI definition is very clear that `1.0` is not an integer in the OpenAPI schema dialect, which is why I added the definition above.
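The two readings of `integer` at issue in this thread can be compared in a short sketch. The function names are hypothetical, and the two rules are paraphrased from the wording quoted in this pull request's description ("a JSON number without a fraction or exponent part" vs. "any number with a zero fractional part"). This is not how any particular validator is implemented.

```python
import json
import re

# A JSON number token with no fraction and no exponent part (lexical reading).
_INT_TOKEN = re.compile(r"-?(?:0|[1-9][0-9]*)")


def is_integer_lexical(token: str) -> bool:
    """True if the raw JSON text is a number without fraction or exponent."""
    return _INT_TOKEN.fullmatch(token.strip()) is not None


def is_integer_by_value(token: str) -> bool:
    """True if the parsed JSON number has a zero fractional part."""
    value = json.loads(token)
    if isinstance(value, bool):  # bool is a subclass of int in Python
        return False
    if isinstance(value, int):
        return True
    return isinstance(value, float) and value.is_integer()


# Compare both readings on a few raw JSON number tokens.
for token in ["123", "-123", "0", "1.0", "1.0e-2", "1e2"]:
    print(token, is_integer_lexical(token), is_integer_by_value(token))
```

The two functions agree on `123` and `1.0e-2` and disagree on `1.0` and `1e2`, which is exactly the case the Python `isinstance` checks above run into.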
+Depending on the selected type, a number of other fields are available to further specify the data format. For example, for `string` types, the length of the string can be limited with `minLength` and `maxLength`. Similarly, `integer` types accept `minimum` and `maximum` values. No matter the type, if the amount of options for the data is limited to a certain set, it can be specified with the `enum` array. All these properties are listed in the [Schema Object](https://spec.openapis.org/oas/v3.1.0#schema-object) specification.
I realize this isn't phrasing you chose, but all fields are always available. Certain fields apply only to certain types. Putting a `maxLength` field on an integer is a no-op, but it's not wrong. Also, `minimum` and `maximum` work with any number, not just integers. And be careful about `type` and `enum`: they always both apply, and it's easy to accidentally create an impossible-to-satisfy schema if you combine them carelessly.
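A toy checker can illustrate these three points. Everything below is a hypothetical sketch written for this comment, not a real JSON Schema implementation: it handles only a single string-valued `type` plus `enum`, `minLength`, `maxLength`, `minimum`, and `maximum`, and it assumes the value-based reading of `integer`.

```python
from typing import Any


def is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def matches_type(value: Any, type_name: str) -> bool:
    """Check a parsed JSON value against one `type` name."""
    if type_name == "number":
        return is_number(value)
    if type_name == "integer":
        return is_number(value) and (isinstance(value, int) or value.is_integer())
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "boolean":
        return isinstance(value, bool)
    if type_name == "null":
        return value is None
    if type_name == "array":
        return isinstance(value, list)
    if type_name == "object":
        return isinstance(value, dict)
    return False


def is_valid(value: Any, schema: dict[str, Any]) -> bool:
    """Apply every keyword present; each one only constrains the types it targets."""
    if "type" in schema and not matches_type(value, schema["type"]):
        return False
    # Simplification: Python equality is used here, so True == 1 would match.
    if "enum" in schema and value not in schema["enum"]:
        return False
    if isinstance(value, str):  # length keywords are ignored for non-strings
        if "minLength" in schema and len(value) < schema["minLength"]:
            return False
        if "maxLength" in schema and len(value) > schema["maxLength"]:
            return False
    if is_number(value):  # range keywords apply to any number, not just integers
        if "minimum" in schema and value < schema["minimum"]:
            return False
        if "maximum" in schema and value > schema["maximum"]:
            return False
    return True


# maxLength on an integer is a no-op.
print(is_valid(12345, {"type": "integer", "maxLength": 3}))
# minimum applies to a non-integer number.
print(is_valid(-1.5, {"type": "number", "minimum": 0}))
# type and enum both apply, so this schema accepts nothing.
print(any(is_valid(v, {"type": "string", "enum": [1, 2]}) for v in [1, 2, "1", "2"]))
```

In this sketch the last schema is valid as a schema but rejects every candidate, since no value is both a string and one of the numbers `1` or `2`.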
I'm pretty sure that a reasonable validator/linter of JSON Schema documents, if one exists, would catch those issues, and my understanding is that this is a non-normative document intended to show how the tools should be used.
Rewriting the first sentence in this way is probably ambiguous enough for this situation:
> In addition to `type`, several more fields can be used to further constrain your data.
🤦 Nope. Those should be |
I'm thinking it would be best to go back to the original PR that just adds a single word to the existing document, and maybe revisit the rest of it at a later point after the underlying standards have reached a consensus. Ideally this overview document should be about one good way of doing things, not many almost good ways.
@auspicacious should we be keeping this pull request open for further updates after the discussion? (I realise it has been a while!) |
@lornajane Thanks for checking in. I think that I have been out of the OpenAPI/JSON Schema world for so many years that I would not be able to contribute without investing a lot of time that I don't have available right now. I hope that the concerns I expressed above as I dug deeper into the issue were at least helpful, but it is unlikely that I will be making further updates to this PR. |
@auspicacious we ended up adding a small note about Regarding Thank you for your efforts here. I do hope that the 3.0.4 and 3.1.1 changes help. Since you won't be able to continue, I'm going to close this. |
While reading, I noticed that `integer` was not included in this list of possible types. Given that `integer` is used extensively in this chapter, I assume this was a simple oversight.

As an aside, though, I understand that the JSON Schema core specification draft draws a semantic difference between `integer` and the other types and that the OpenAPI specification explicitly adds `integer` as a type with a slightly different definition than the JSON Schema validation draft ("a JSON number without a fraction or exponent part" vs. "any number with a zero fractional part"). This is messy, but it doesn't seem worthwhile digging into these differences in this explainer.

Maybe the OAS should be more explicit about this definitional difference, though?