Skip to content

Remove the additionalProperties and additionalItems keywords #373

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jdesrosiers opened this issue Aug 27, 2017 · 23 comments
Closed

Remove the additionalProperties and additionalItems keywords #373

jdesrosiers opened this issue Aug 27, 2017 · 23 comments

Comments

@jdesrosiers
Copy link
Member

JSON Schema validation has some interesting architectural properties. One of those properties is that keyword validations are stateless. They have no dependency on the results of the validation of any other keyword. This means keywords can be evaluated in any order, or even in parallel.

There are two exceptions to this rule: additionalProperties and additionalItems. additionalProperties causes all kinds of problems and confusion when schema authors compose schemas. "additionalProperties": false is used to disallow undefined properties (usually to detect typos), but it also makes it impossible to extend that schema using allOf. This is why I always recommend that additionalProperties not be used and that programs ignore undefined properties instead.

I think the only reason additionalItems isn't similarly notorious is because it is rarely used. Fundamentally, it has the same problems.

I'm not sure if the architectural drift is the reason additionalProperties and additionalItems are problematic or if the problem is intrinsic in what they are trying to do. But, I don't think we should be tacking on additional functionally to solve the problems that additionalProperties causes. Instead, we need to get rid of the cause of the problem and find a way to address the need of detecting typos in a way that doesn't violate the core architecture of JSON Schema. Until we have a good solution, I think it's fine to leave it up to the individual implementations to solve as they see fit. For example, json-schema-validator reports undefined properties as warnings. This seems like a perfectly good solution for most cases. It provides feedback to catch the kind of errors additionalProperties is being used to detect without restricting the composablity of schemas.

I don't think we end up loosing anything by allowing validators to give feedback about undefined properties. However, additionalProperties is also used for another purpose. It is used to describe an object with arbitrary keys whose values must all conform to the same schema. It is possible to describe this with patternProperties using .* as a key. What wouldn't be possible anymore is mixing these arbitrarily-named properties with named properties. I'm ok with that. I don't think it's a good idea to mix those things anyway. It seems a small thing to loose to get rid of something so problematic.

In case it's not clear what I mean ...

{
  "type": "object",
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" },
    "extraStuff": {
      "type": "object",
      "patternProperties": {
        ".*": { "type": "integer" }
      }
    }
  }
}
@handrews
Copy link
Contributor

handrews commented Aug 27, 2017

VOTE-A-RAMA!!!

It's time to gauge community support for all re-use/extension/additionalProperties proposals and actually make some decisions for the next draft.

Please use emoji reactions ON THIS COMMENT to indicate your support.

  • You do not need to vote on every proposal
  • If you have no opinion, don't vote- that is also useful data
  • If you've already commented on this issue, please still vote so we know your current thoughts
  • Not all proposals solve exactly the same problem, so we may end up implementing more than one

This is not a binding majority-rule vote, but it will be a very significant input into discussions.

Here are the meanings for the emojis:

  • Celebration / Hooray / Tada!: I support this so strongly that I want to be the primary advocate for it
  • Heart: I think this is an ideal solution
  • Smiley face: I'd be happy with this solution
  • Thumbs up: I'd tolerate this solution
  • Thumbs down: I'd rather we not do this, but it wouldn't be the end of the world
  • Frowny face: I'd be actively unhappy, and may even consider other technologies instead

If you want to explain in more detail, feel free to add another comment, but please also vote on this comment.

The vote should stay open for several weeks- I'll update this comment with specifics once we see how much feedback we are getting and whether any really obvious patterns emerge.

@handrews
Copy link
Contributor

@jdesrosiers thanks for filing! I'm going to take the comments proposing this out of the other issue to avoid confusion.

@Julian
Copy link
Member

Julian commented Aug 27, 2017

Just for posterity, though perhaps not directly relevant, $ref+$id are not stateless either, and way more annoying to implement in that sense, I'm not sure I've felt any pain from additionalProperties/additionalItems as an implementer.

@handrews
Copy link
Contributor

@Julian agreed. Validation keywords are stateless in the sense that they operate independently of the contents of any parent or child schema. Most are stateless with respect to adjacent keywords within the same subschema. "additionalProperties" and "additionalItems" are the exceptions.

I don't understand @jdesrosiers's "architectural drift" comment, as these keywords have been in the spec since the beginning.

@handrews
Copy link
Contributor

Also worth noting: the if/then/else keywords interact, see #375

@handrews
Copy link
Contributor

I do have a specific use case for "additionalProperties": false that I was forgetting about. It has a workaround, but I figure it is worth mentioning:

When you implement optional inlining of related resources, you end up with schemas like this:

{
  "oneOf": [
    {
      "properties": {"id": {"type": "integer"}},
      "additionalProperties": false
    },
    {"$ref": "the/whole/thing"}
  ]
}

You can also do this:

{
  "oneOf": [
    {
      "properties": {"id": {"type": "integer"}},
      "not": {"$ref": "the/whole/thing"}
    },
    {"$ref": "the/whole/thing"}
  ]
}

although depending on exactly what "the/whole/thing" defines, that may be a good bit harder to reason about.

Not compelling enough on its own, perhaps, but a useful pattern, and pretty much the only time I use "additionalProperties": false

@erayd
Copy link

erayd commented Aug 30, 2017

I really like being able to define a default schema for properties that aren't matched by something in properties or patternProperties. Ditto for array items. If additionalProperties / additionalItems are removed, I really, really want some other way of achieving the same result. Not terribly fussed as to what the syntax looks like as long as it's clear and logical, although for what it's worth, the current keywords already feel clear and logical.

@erayd
Copy link

erayd commented Aug 30, 2017

For the avoidance of doubt, I would consider the following solution viable, but a significant step backwards from the status quo, as there is no corresponding keyword for unmatched array items. At least with additionalProperties / additionalItems, the syntax is consistent.

{
  "anyOf": [
    {"properties": {"propertyOne": {"type": "integer"}}},
    {"patternProperties": {".*": {"type": "string"}}}
  ]
}

@handrews
Copy link
Contributor

I'd probably do something like

{
  "anyOf": [
    {"properties": {"propertyOne": {"type": "integer"}}},
    {
      "not": {"propertyNames": {"enum": ["propertyOne"]}},
      "patternProperties": {".*": {"type": "string"}}
    }
  ]
}

but I'd rather not.

@Relequestual
Copy link
Member

I really don't understand why this is a problem at all.

Also...

What wouldn't be possible anymore is mixing these arbitrarily-named properties with named properties. I'm ok with that. I don't think it's a good idea to mix those things anyway. It seems a small thing to loose to get rid of something so problematic.

This would totally break a current usecase. I'm working on an international data sharing and discovery API. It works like a federated network in all directions. We have a defined JSON structure. We want to allow endpoints to test additional fields, for which we prefix underscores, which may later become part of the spec. If I understand the above quoted, you would prohibit defining a schema for this.

@jdesrosiers
Copy link
Member Author

$ref+$id are not stateless either

Yep. I was only referring to validation keywords. I'd like the get rid of $id's ability to change $ref resolution scope as well. But, that's a whole other battle.

@jdesrosiers
Copy link
Member Author

I don't understand @jdesrosiers's "architectural drift" comment, as these keywords have been in the spec since the beginning.

Actually, the term "architectural erosion" is more accurate. But, yes, it is awkward to use any of these formal software architecture terms on something that doesn't have a formal architecture. These terms imply that there was a prescribed architecture at the start of the project. I was speaking as if the general architecture that has emerged was prescribed from the beginning and that the things that don't fit that pattern are either drift or erosion.

Perhaps "architectural mismatch" would have been a more descriptive term in this case.

@jdesrosiers
Copy link
Member Author

Also worth noting: the if/then/else keywords interact

It was my understanding that these would be sub-keywords, not keywords that operate independently. For example,

{
  "conditional": {
    "if": { ... },
    "then": { ... },
    "else": { ... }
  }

As long as conditional is stateless with respect to other keywords, I don't have a problem. But, it isn't important that if/then/else are stateless with respect to each other. They aren't really validation keywords by themselves. They are just part of a larger structure.

@handrews
Copy link
Contributor

handrews commented Sep 3, 2017

It was my understanding that these would be sub-keywords, not keywords that operate independently.

You clearly have not followed the issue or PR in the last 8+ months. I'm not getting into it here as this issue is not about those keywords.

@jdesrosiers
Copy link
Member Author

I did follow the issue and was part of the conversation. This was definitely what we discussed. I haven't had a chance to read the PR yet. I agree we should have this discussion there, not here.

@handrews
Copy link
Contributor

handrews commented Sep 3, 2017

There was much subsequent discussion and decision.

@levbishop
Copy link
Contributor

The suggested alternative:

{
  "anyOf": [
    {"properties": {"propertyOne": {"type": "integer"}}},
    {
      "not": {"propertyNames": {"enum": ["propertyOne"]}},
      "patternProperties": {".*": {"type": "string"}}
    }
  ]
}

is not a replacement for

{ "properties": {"propertyOne": {"type": "integer"}},
  "additionalProperties": {"type": "string"} }

since, for example, the former will validate but the latter fail on {"propertyOne": 1, "foo": 17}.

It could be done via patternProperties with a fancy regex:

{ "properties": {"propertyOne": {"type": "integer"}},
  "patternProperties": {"^(?!propertyOne$)": {"type": "string"}}}

Although it's a valid ECMA 262 regex, draft-wright-json-schema-validation-01 states it SHOULD be avoided due to inconsistent support for the (?! disjunction) form. Also, the regex gets really ugly for less-trivial cases, so I disagree with removal of additionalProperties.

@erayd
Copy link

erayd commented Sep 7, 2017

@levbishop

for example, the former will validate but the latter fail on {"propertyOne": 1, "foo": 17}.

I'm not seeing it - can you clarify please? They should be functionally identical, and both will fail validation because foo is not a string.

@erayd
Copy link

erayd commented Sep 7, 2017

Never mind, I see it. The first schema in anyOf doesn't constrain properties.

I don't think a fancy regex is needed though - wouldn't an enum do the trick?

@levbishop
Copy link
Contributor

I don't see a way to do it with enum. propertyNames allows to match the property name with a schema and patternProperties allows to match the property values with a schema, but there isn't a primitive for matching both:

{ "propertyNameValues": [
  [{"const": "propertyOne"}, {"type": "integer"}],
  [{"not": {"enum": ["propertyOne"]}}, {"type": "string"}]
] }

In the absence of such a propertyNameValues primitive, we have to do it with patternProperties and regexes.

@handrews
Copy link
Contributor

handrews commented Sep 7, 2017

@erayd @levbishop this proposal (removal of additionalProperties) has attracted much more and better articulated votes against than it has votes for. Following up elsewhere, the one smiley vote is not actually all that strong.

My expectation is to close this as non-viable, but I'm leaving it open at least to the weekend so that a few more people get a chance to vote.

@dlax
Copy link
Member

dlax commented Sep 8, 2017

Following up elsewhere, the one smiley vote is not actually all that strong.

Yes, it's mine. I confirm it's probably more of 👍 or a 👎, but not something I'd fight for.

@handrews
Copy link
Contributor

As noted above, and elaborated on in an email to the mailing list, despite getting some support, there is not enough, and the opposition is too strong, for this to be considered a viable proposal given its drastic nature. Closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants