Skip to content

core: Define how to apply subschemas to child instances #161

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
handrews opened this issue Nov 23, 2016 · 12 comments
Closed

core: Define how to apply subschemas to child instances #161

handrews opened this issue Nov 23, 2016 · 12 comments

Comments

@handrews
Copy link
Contributor

In #157 @awwright noted that:

Technically, "properties" and "items" are just keywords for validation. All they do is return true or false.
People have been reading in that we can recursively apply the semantics, and that's fine too. Maybe we should make that more explicit.

Without ways to declare which subschemas apply to which child instances, JSON Schema as a media type is very limited.

In theory, it should be possible to use some JSON Schema vocabulary without validation, which is not possible at the moment. We don't have such a vocabulary right now, but to demonstrate the idea, here is a hyper-schema that imposes no validation constraints:

{
    "links": [
        {
            "rel": "self",
            "href": "/foos"
        }
    ],
    "items": {
        "links": [
            {
                "rel": "item",
                "href": "/foos/{id}"
            }
        ]
    }
}

This schema uses items from the validation vocabulary, but only to apply the "item" link to each element of an array instance. The URI template implies that the elements of the array are objects with at least an "id" property, but this schema does not validate that at all. Obviously, this is not the most realistic schema, but we don't have other vocabularies right now and I can't come up with one off the top of my head.


I recently rewrote the items, additionalItems, properties, patternProperties, and additionalProperties to make it clear that they do not directly validate the instance, but instead control which subschemas are applied to which child instances. While I expressed that in terms of validation, reading back over the language shows that it is not necessary.

We could move those keywords into the core specification and simply reference their impact on validation in the validation specification. There would not be any changes to the current functionality, but it would allow for much more flexible definitions of new vocabularies.

This is all a bit abstract, and probably hard to grasp from just eyeballing the current validation spec wording. So I will write up a trial PR to demonstrate how that would look.

handrews added a commit to handrews/json-schema-spec that referenced this issue Nov 23, 2016
This is a trial balloon for fixing json-schema-org#161.

This moves several keywords that control which subschemas apply
to which child instances from validation to core.  The only
actual validation that these keywords defined was that all child
instances need to validate against all applicable subschemas.

This simple child validation principle has been moved up to
the general considerations section of the validation spec.

The moved keywords are now prefixed with "$", which is what we've
discussed doing for all core keywords (Another potential core
keyword would be "$data", for instance, as it would be used in
multiple vocabularies depending on which proposals we accept,
if any.)

As with "id" vs "$id" and the boolean forms of
exclusiveMinimum/Maximum, the old forms are still allowed but
expected to be removed before RFC.  Migrating an old schema
to the new form is trivial, and was discussed before we decided
to rename "id".  It is equally applicable here.
@deuberger
Copy link

I second this issue. Reading over the latest schema at http://json-schema.org/latest/json-schema-validation.html is confusing to me now compared to what was there in early October. It almost seems like it would be valid to have invalid child instances since the specification doesn't require that child instances be validated it seems. Maybe I'm misunderstanding something, but it's definitely confusing to me one way or another.

@handrews
Copy link
Contributor Author

handrews commented Dec 9, 2016

@deuberger I rewrote the sections for items, additionalItems, properties, patternProperties, and additionalProperties within the validation spec in PR #111 , could you take a look and see if that is easier to understand? You can read it in the diff or check out the current master here and build the HTML form of the docs if you install xml2rfc.

@deuberger
Copy link

@handrews thanks. I took a look at what's in master as of commit e784d5d. Comments:

  • The "items" section is still a little confusing. In particular, it's not very clear that the keyword only effects child instances of arrays. "This keyword controls child instance validation," makes it sound like it applies to any child instance (that of arrays or objects). The next paragraph, explicitly mentions arrays, but the last paragraph is only implies it by saying "elements".
  • "additionalItems" has a similar issue in that nothing explicitly says it applies only to arrays even though it can be inferred.
  • Maybe I'm missing something, but it seems that setting "additionalItems" to false is no longer allowed? So how does one disallow additional items? By setting maxItems?
  • "properties", "patternProperties", and "additionalProperties" also have the same issue in that nothing explicitly says they apply only to child instances of objects.
  • The last paragraph in the "properties" section is a bit confusing. It says "the instance value" needs to validate against the corresponding schema, but its actually the value at that name within the instance (though maybe there's something better to call it). Maybe it should be "child instance for that name" like "patternProperties" uses.
  • "properties" and "patternProperties" use the term "empty object" instead of "empty schema" like elsewhere
  • It seems like the language for default values (e.g. "empty schema" or 0) varies from one keyword to the next. It might make sense to normalize all of them.

In general, it does seem that this version is much clearer with respect to how child instances are validated. Thanks!

handrews added a commit to handrews/json-schema-spec that referenced this issue Dec 13, 2016
This implements suggestions from discussions in issue json-schema-org#161.
@handrews
Copy link
Contributor Author

@deuberger thanks! I've submitted PR #193 incorporating most of this. Note also the "Validation of primitive types and child values" section setting up the general context.

For the stuff I did not change:

Maybe I'm missing something, but it seems that setting "additionalItems" to false is no longer allowed? So how does one disallow additional items? By setting maxItems?

booleans are now allowed anywhere a schema is allowed- this is covered in the core spec rather than validation. In addition to being inconsistent, the old approach made it seem like setting "additionalProperties" or "additionalItems" to false had some sort of special effect. It does not, it is simply that false always fails validation, so applying a schema of false to every property or item not directly specified has the effect of preventing any such property or item from being defined, as any defined value will fail validation.

It seems like the language for default values (e.g. "empty schema" or 0) varies from one keyword to the next. It might make sense to normalize all of them.

All five of the child validation-controlling keywords now say "empty schema" (not "empty object"). I'm not sure where you were seeing the 0. Keywords with numeric values take a zero default in some cases, but those must be 0 while schema keywords must take an object (or a boolean).

@Relequestual Relequestual added this to the draft-next (draft-6) milestone Dec 13, 2016
@deuberger
Copy link

deuberger commented Dec 13, 2016

@handrews: changes look great. Your explanation about false for a schema value makes sense.

Regarding my comment:

It seems like the language for default values (e.g. "empty schema" or 0) varies from one keyword to the next. It might make sense to normalize all of them.

I meant that in reference to all keywords, not just those related to arrays and schemas. Many of them have a line about the default value and the exact language changes slightly from one to the next. It's not a huge deal because I didn't notice any inaccuracies. I just meant that it might make sense to normalize as much as possible for consistency sake.

Thanks for the improvements!

@handrews
Copy link
Contributor Author

@deuberger

I just meant that it might make sense to normalize as much as possible for consistency sake.

See #171 for work on that.

@Relequestual
Copy link
Member

@deuberger Do you have any further comments or are you happy with #171 ?

@deuberger
Copy link

I think #171 is fine, so I think it's fine to close this issue once the PR is merged.

@awwright
Copy link
Member

My original understanding of this issue is this affects how we read annotations onto instances, since JSON Schema doesn't explicitly say how to recursively apply such assertions for array items and object properties. That's what I intended with the quote that @handrews provided, at least.

My notion of a fix would be something like explicitly saying that "properties" is for validation and metadata; or saying something like

The act of successfully validating a schema against an instance applies other schema annotations against that instance, even as part of a larger instance, like an item in an array or a property in an object.

I'm not sure if that phrasing gets my point across well though.

How does this line up with everyone's understanding? It seems like filing another issue or two may be in order.

@handrews
Copy link
Contributor Author

@awwright I think we're in the same general neighborhood of meaning.

But I'd like try to remove validation from the picture as much as possible. You do bring up a good point by observing that additional vocabularies (because it's not just annotations) only apply from schemas that successfully validate. So my "remove validation" feeling may not actually make sense. I will think about it more.

It's worth point out that things get really weird with not. What does it mean to have a link definition inside of a not? What if you have one not nested in the other, so the inner not is actually a positive validation by the time you sort it all out?

handrews added a commit to handrews/json-schema-spec that referenced this issue Dec 27, 2016
This implements suggestions from discussions in issue json-schema-org#161.
@handrews
Copy link
Contributor Author

I'm going to take this issue out of the draft 6 milestone- PR #171 and #193 address the things here that we want in draft 6, but the larger "should this be less tied to validation" issue will either be addressed later or dropped.

@handrews handrews modified the milestones: draft-future, draft-next (draft-6) Dec 30, 2016
handrews added a commit to handrews/json-schema-spec that referenced this issue Jan 5, 2017
This implements suggestions from discussions in issue json-schema-org#161.
handrews added a commit to handrews/json-schema-spec that referenced this issue Jan 23, 2017
This implements suggestions from discussions in issue json-schema-org#161.
@handrews
Copy link
Contributor Author

Some work got done on this, and I don't feel like the remaining ideas have traction.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants