contentSchema: type MUST be a schema #917

ssilverman · 2020-05-10T07:42:36Z

No description provided.

handrews · 2020-05-15T03:33:10Z

@ssilverman this should actually just say at the top of the contentSchema section that its value MUST be a schema, instead of saying that it contains a schema if blah blah blah.

The type of the keyword value is unaffected by the presence or absence of contentMediaType. The reason contentSchema is ignored without contentMediaType is that there's no way to know how to apply it if you can't figure out the media type. But that's independent of the value type.

Good catch noticing that I left the MUST out ,though.

handrews

Just a ping on my earlier comment, let me know if it's not clear what I'm asking for with the wording.

ssilverman · 2020-05-20T08:06:37Z

I'm still thinking about it...

ssilverman · 2020-05-20T08:10:47Z

The reason that I chose this language is because the first paragraph implies that the property is a schema if the instance is a string and "contentMediaType" is present. This seems to imply that the value may or may not be a schema if those two conditions aren't satisfied.

This also means that if "contentMediaType" is not present, then we don't really know what "contentSchema" can be. Maybe, then, all three paragraphs need to change.

ssilverman · 2020-05-20T08:19:22Z

Ok, I tried a re-word. Let me know what you think.

handrews · 2020-05-20T08:25:50Z

@ssilverman

the first paragraph implies that the property is a schema if the instance is a string and "contentMediaType" is present. This seems to imply that the value may or may not be a schema if those two conditions aren't satisfied.

While I understand why you're thinking about that, I think you're overthinking it 😄

No keywords change their data type requirements based on the instance. They may be ignored, but their syntax absolutely cannot be determined by the instance, or implementations couldn't validate and process schemas up front.

jsonschema-validation.xml

ssilverman · 2020-05-24T17:48:52Z

Just force pushed changes with this message: "contentSchema: type MUST be a schema and instance MUST follow the data model".

jsonschema-validation.xml

handrews · 2020-05-25T22:53:51Z

@ssilverman It still doesn't make sense to say SHOULD follow the data model. What does that even mean? It's completely media-type specific and application-specific. How that gets sorted out is not the responsibility of the specification.

There's no clear definition for what "follow the data model" means. There is a data model. And you have a document of some media type that you parsed out of a string based on contentMediaType and possibly contentEncoding. Does it follow the data model? I don't know, can the application figure out a way to map it into the data model? Then yes. Is that going to be the same for all applications? Possibly not. Mapping XML to a JSON-ish structure, for example, can be done in several ways. Does that mean XML follows the data model? How would you even know without knowing what mapping the application is using?

What about some media type we've never heard of? Does it follow the data model? That depends, can whoever is designing that application figure something out that works well enough? It's outside the scope of the spec.

The content* keywords are just information about content in the string. One of those pieces of information happens to be a schema. These pieces of information may or may not be accurate. JSON Schema can't enforce that. These keywords never fail validation, not even contentSchema.

contentSchema is just a schema that you can try to apply to the content, assuming you can parse it into something that allows that. We don't want to say anything that restricts possible creative ways people decide to do that. And we definitely do not want a situation where people keep dropping by here asking if media type X "follows the data model" and if so how that works. That is outside of our scope.

karenetheridge · 2020-05-25T22:57:42Z

And we definitely do not want a situation where people keep dropping by here asking if media type X "follows the data model" and if so how that works. That is outside of our scope.

The only possible answer to that, IMO, is "go write an extension for your implementation of choice and come back and let us know".

handrews · 2020-05-25T23:03:27Z

As a relevant example, I'm also working on a PR advising OpenAPI how to use these keywords in OAS 3.1, which will be the first version to support them. They have a few media types they want to explicitly map (most notably multipart/form-data but also some others), so OAS will explain how to do the mapping.

There are several OAS-specific complications there, and it is better for JSON Schema to leave a lot of room in this area than to try to nail anything down.

ssilverman · 2020-05-26T00:26:54Z

I'm still confused what you want here, for these reasons:

In your comment (contentSchema: type MUST be a schema #917 (comment)), you say that JSON Schemas apply to something that has the JSON data model.
What on earth is "contentSchema" for if not to apply to a JSON data model? By not even saying that the instance "SHOULD follow the JSON data model", that makes "contentSchema" meaningless as a JSON schema.
Why, then, is "contentSchema" even defined to "MUST" be a JSON schema"?

I've just removed the The string SHOULD be in a format compatible with the data model defined in the <xref target="json-schema">core JSON Schema</xref>. part, left in the "contentSchema MUST be a JSON schema part", and then force pushed.

I'm still very unclear on why "contentSchema" "MUST" be a JSON schema if it's not even going to be applied to something that doesn't follow the JSON data model.

ssilverman · 2020-05-26T00:32:32Z

I also removed the Also changed one other period to be outside its <xref> tag, for consistency. part.

handrews · 2020-05-26T01:10:04Z

@ssilverman I am sorry for the confusion. This is why we usually file issues first for anything beyond at typo, although long-time contributors sometimes go straight to PR for small things.

When you opened this PR I assumed you were just trying to make this change (as I just merged it), and there didn't seem to be a reason to go make you open an issue. We're not really sticklers for process here.

But as you poked at it you found more things to question, which is completely fine and in general a good thing. But these things came in piecemeal and I was trying to respond to them in between bits and pieces here and there, which is not great for giving the most coherent response. I probably should have asked for an issue laying out your overall concerns once things got at all more complicated. So I doubt my responses were ideal even though I did try to go back over things.

Unfortunately, since you edit and force-push the same commit every time, some previous parts of the conversation were not accessible (not that I could figure out, anyway). So I may have contradicted myself because I couldn't find the older comments (I don't always save email notifications). It's usually a good idea to add commits if there are unresolved comments attached to existing commits. Again, we're not sticklers about that, but it does make it easier to see how a long-running PR evolved. We really are not bothered by "extra" commits in the history.

To answer your question (I hope):

The value of the keyword contentSchema is part of our specification, and like all keywords, has MUST-level requirements on the form (even if for a keyword like const it allows everything so there are no MUST restrictions.
The instance data is instance data, and these keywords are not assertions. Therefore the instance data is not something on which we impose requirements that a JSON Schema implementation enforces.

This is why contentMediaType is written as:

If the instance is a string, this property indicates the media type of the contents of the string.

and there is no requirement that the string contents actually be a valid document of that media type. In that case, the wording is completely intentional.

As for

This keyword MAY be used with any media type that can be mapped into JSON Schema's data model.

it's just noting the possibility that it doesn't have to be application/json. That is mostly there to prevent people from enforcing a JSON media type when the schema is used. It doesn't do anything more than that, nor do we want it to.

I realize my previous response probably sounded odd since this does mention the data model, but again, it's hard to drop in and out as the scope of the PR shifts and sometimes I just miss things that are really obvious because I've stared at this spec for way too long that parts of it are just invisible to me at this point.

ssilverman · 2020-05-27T01:32:00Z

@handrews Just for posterity, I'll respond and clarify. Your points were all good ones, thank you. I thought this would be a simple PR but it appeared to have sparked more discussion. I'll make an issue next time first.

With regards to force pushing, I find that I tend to take on the development practices I've used most recently. I use force pushing for branches that will eventually get into the main branch so that there's a minimal number of changes. Since the reviewers in this org tend to merge the PR and not the submitter, keeping the commits as they should be, for less clutter, is necessary.

It is possible to look at prior comments and discussion from pre-force-pushed commits in these ways:

Click on any of the "Show resolved" lines in the discussion here. It may be the case that that discussion is marked as "Outdated", but you can still expand them.
Links into an "Outdated" discussion still work. Try this: Copy the link and paste into the address bar of a new browser tab or window. You'll find that it goes to the correct, even if outdated, comment. It's possible this works less well on mobile, which is why you may not see this. Try either a desktop view or a browser on a desktop.

It's my new understanding that contentSchema is really just a suggestion, with no proscriptions on the instance contents, even though you state that it's meant to be applied to a "JSON Schema data model". (For practice going to an "Outdated" discussion: #917 (comment) :) )

karenetheridge · 2020-05-28T04:28:37Z

Since the reviewers in this org tend to merge the PR and not the submitter, keeping the commits as they should be, for less clutter, is necessary.

FWIW, the reviewer can rebase (and squash) before merging. There is a GUI option to squash all commits, and it's not hard to rebase and squash selected commits together from the command line as well.

As a reviewer it's much easier to follow what's going on if changes aren't applied as squashed amendments while the conversation is still going on.

One can put the PR into a work-in-progress state which prevents merges until the flag is lifted, or even just comment "let me know before merging so I have a chance to clean up the commits". Additionally, it's ugly when a pull request is merged without rebasing the commits that have gone into master in the meantime, so the author can rebase as well.

ssilverman · 2020-05-28T06:23:36Z

I’m not sure how your comments are relevant. It’s advantageous to reduce the number of commits in a PR. This has nothing to do with rebasing. And anyone can go back and see context even with force pushes.

Let me spell it out: If reviewers here are the ones merging the commits (and, so you don't pick on "merge" I mean "get the code into the main branch", I'm just using "merge" as shorthand to refer to the concept), then all the commits in the PR will get pushed, even temporary ones which the original submitter should really squash together into related commits. This means that the person creating the PR needs to either put it into a state that's push-ready or just set the whole thing to "draft". Force pushing is one way to do that (maybe I need to pre-empt what I feel is a future comment on this: yes, it's a bad idea to force-push to the main branch, I'm not saying that). And again, you don't lose context and anyone can go back and see old comments and their context; just click on the "Resolved" links and the comment tree will expand.

There's really no need to explain all these concepts here, I don't even know why I'm responding to this, other than from annoyance. There's no need for a tutorial, and anyone that's done any sort of work for years with this stuff (git and shared code and shared development techniques) may or may not do things differently.

FWIW, the reviewer can rebase (and squash) before merging. There is a GUI option to squash all commits, and it's not hard to rebase and squash selected commits together from the command line as well.

I have no idea why it's necessary to state this. This was my whole point. That's what I did with this PR, "squashed selected commits together". I don't know the reason for this play-by-play commentary. It keeps flipping back and forth from what seems like criticism to answering the criticism with what's already been done or what's happening currently. This is how the comments are coming across. I have no doubt you know what you're doing, but so do many people here.

handrews · 2020-05-28T06:48:42Z

@ssilverman @karenetheridge all of this is a matter of style. We often do multi-commit PRs to make reviews easier in various ways. They are rarely squashed unless someone wants to do so.

No one's going to force you to split commits unless something is incredibly difficult to review otherwise (this is usually major reworkings with multiple steps). But please do not try to enforce style from other projects here.

ssilverman · 2020-05-28T06:51:52Z

@handrews For the record, I'm not trying to enforce style. I was feeling questioned and so was explaining my reasoning. I can assure you I'm very aware that there's more than one way to do things here. Please don't suggest I'm unaware of this or trying to enforce anything.

Do I really have to respond explaining that "there are many ways to do things"? Really? There isn't just one way. I really don't appreciate that your response was directed my way.

I was squashing because I wanted to, per your comment. That was being questioned and then what followed was a clear "tutorial" on how to do things. I'm tired of that.

handrews · 2020-05-28T06:56:44Z

@ssilverman It was a request in response to the overall conversation. It was not an accusation, nor was it specifically directed at you. I don't even think it would be a bad thing to bring in style, so I'm not sure what I'm supposed to have accused you of.

karenetheridge · 2020-05-28T15:33:20Z

My comments were purely in response to the fact that force-pushing before the review phase has completed makes it harder to see the history (because previous commits are lost, even if the comments attached to them are not), so I was suggesting that one not force-push until ready to merge. I was not presuming the git ability level of anyone because people come from all levels of knowledge and experience and that's okay. That's all.

handrews added this to the draft-08-patch1 milestone May 15, 2020

handrews suggested changes May 18, 2020

View reviewed changes

handrews added the annotation label May 18, 2020

ssilverman force-pushed the shawn/contentSchema branch from d01b228 to eed747c Compare May 20, 2020 08:04

ssilverman force-pushed the shawn/contentSchema branch from eed747c to 1f0d59a Compare May 20, 2020 08:19

ssilverman requested a review from handrews May 20, 2020 08:20

handrews suggested changes May 20, 2020

View reviewed changes

jsonschema-validation.xml Outdated Show resolved Hide resolved

jsonschema-validation.xml Outdated Show resolved Hide resolved

ssilverman force-pushed the shawn/contentSchema branch 3 times, most recently from 0c5007a to 437499d Compare May 24, 2020 17:38

ssilverman changed the title ~~contentSchema: type MUST be a schema if it is not ignored~~ contentSchema: type MUST be a schema May 24, 2020

ssilverman force-pushed the shawn/contentSchema branch from 437499d to a53fb20 Compare May 24, 2020 17:48

ssilverman changed the title ~~contentSchema: type MUST be a schema~~ contentSchema: type MUST be a schema and instance MUST follow the data model May 24, 2020

ssilverman requested a review from handrews May 24, 2020 17:49

ssilverman force-pushed the shawn/contentSchema branch from 9c85128 to a53fb20 Compare May 25, 2020 00:18

karenetheridge reviewed May 25, 2020

View reviewed changes

jsonschema-validation.xml Outdated Show resolved Hide resolved

karenetheridge reviewed May 25, 2020

View reviewed changes

jsonschema-validation.xml Outdated Show resolved Hide resolved

ssilverman force-pushed the shawn/contentSchema branch from a53fb20 to a0d8ac2 Compare May 25, 2020 03:09

handrews reviewed May 25, 2020

View reviewed changes

jsonschema-validation.xml Outdated Show resolved Hide resolved

ssilverman force-pushed the shawn/contentSchema branch 3 times, most recently from c63e873 to 9bcfeb0 Compare May 25, 2020 22:20

ssilverman changed the title ~~contentSchema: type MUST be a schema and instance MUST follow the data model~~ contentSchema: type MUST be a schema and instance SHOULD follow the data model May 25, 2020

ssilverman requested a review from handrews May 25, 2020 22:21

ssilverman force-pushed the shawn/contentSchema branch from 9bcfeb0 to fb44094 Compare May 26, 2020 00:27

ssilverman changed the title ~~contentSchema: type MUST be a schema and instance SHOULD follow the data model~~ contentSchema: type MUST be a schema May 26, 2020

ssilverman force-pushed the shawn/contentSchema branch from fb44094 to 1e4f32b Compare May 26, 2020 00:30

contentSchema: type MUST be a schema

c14754e

ssilverman force-pushed the shawn/contentSchema branch from 1e4f32b to c14754e Compare May 26, 2020 00:31

handrews approved these changes May 26, 2020

View reviewed changes

handrews merged commit 339b46c into json-schema-org:master May 26, 2020

ssilverman deleted the shawn/contentSchema branch May 27, 2020 01:24

json-schema-org locked as too heated and limited conversation to collaborators May 28, 2020

json-schema-org unlocked this conversation May 28, 2020

Uh oh!

contentSchema: type MUST be a schema #917

contentSchema: type MUST be a schema #917

Uh oh!

Conversation

ssilverman commented May 10, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

handrews commented May 15, 2020

Uh oh!

handrews left a comment

Choose a reason for hiding this comment

Uh oh!

ssilverman commented May 20, 2020

Uh oh!

ssilverman commented May 20, 2020

Uh oh!

ssilverman commented May 20, 2020

Uh oh!

handrews commented May 20, 2020

Uh oh!

Uh oh!

Uh oh!

ssilverman commented May 24, 2020

Uh oh!

Uh oh!

Uh oh!

Uh oh!

handrews commented May 25, 2020

Uh oh!

karenetheridge commented May 25, 2020

Uh oh!

handrews commented May 25, 2020

Uh oh!

ssilverman commented May 26, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ssilverman commented May 26, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

handrews commented May 26, 2020

Uh oh!

ssilverman commented May 27, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

karenetheridge commented May 28, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ssilverman commented May 28, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

handrews commented May 28, 2020

Uh oh!

ssilverman commented May 28, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

handrews commented May 28, 2020

Uh oh!

karenetheridge commented May 28, 2020

Uh oh!

Uh oh!

ssilverman commented May 10, 2020 •

edited

Loading

ssilverman commented May 26, 2020 •

edited

Loading

ssilverman commented May 26, 2020 •

edited

Loading

ssilverman commented May 27, 2020 •

edited

Loading

karenetheridge commented May 28, 2020 •

edited

Loading

ssilverman commented May 28, 2020 •

edited

Loading

ssilverman commented May 28, 2020 •

edited

Loading