$recursiveRef part 2 of 3: $recursiveRef and $recursiveAnchor #654

Merged
merged 2 commits into from
Nov 13, 2018
281 changes: 249 additions & 32 deletions jsonschema-core.xml
@@ -205,11 +205,11 @@
The schemas to be applied may be present as subschemas comprising all or
part of the keyword's value. Alternatively, an applicator may refer to
a schema elsewhere in the same schema document, or in a different one.
The mechanism for identifying such referred schemas is defined by the
The mechanism for identifying such referenced schemas is defined by the
keyword.
</t>
<t>
Applicator keywords also define how subschema or referred schema
Applicator keywords also define how subschema or referenced schema
boolean <xref target="assertions">assertion</xref>
results are modified and/or combined to produce the boolean result
of the applicator. Applicators may apply any boolean logic operation
@@ -626,20 +626,19 @@
and used with caution when defining additional keywords.
</t>
</section>
<section title="Referred and Referring Schemas" anchor="referred">
<section title="Referenced and Referencing Schemas" anchor="referenced">
<t>
As noted in <xref target="applicators" />, an applicator keyword may
refer to a schema to be applied, rather than including it as a
subschema in the applicator's value. In such situations, the
schema being applied is known as the referred (or referenced) schema,
while the schema containing the applicator keyword is the referring
(or referencing) schema.
schema being applied is known as the referenced schema, while
the schema containing the applicator keyword is the referencing schema.
</t>
<t>
While root schemas and subschemas are static concepts based on a
schema's position within a schema document, referred and referring
schema's position within a schema document, referenced and referencing
schemas are dynamic. Different pairs of schemas may find themselves
in various referred and referring arrangements during the evaluation
in various referenced and referencing arrangements during the evaluation
of an instance against a schema.
</t>
<t>
@@ -1006,35 +1005,253 @@
</section>
</section>

<section title='Schema References With "$ref"' anchor="ref">
<section title="Schema References">
<t>
The "$ref" keyword can be used to reference a schema which is to be applied to the
current instance location. "$ref" is an applicator key word, applying the referred
schema to the instance.
Several keywords can be used to reference a schema which is to be applied to the
current instance location. "$ref" and "$recursiveRef" are applicator
keywords, applying the referenced schema to the instance. "$recursiveAnchor"
is a helper keyword that controls how the referenced schema of "$recursiveRef"
is determined.
</t>
<t>
The value of the "$ref" property MUST be a string which is a URI Reference.
Resolved against the current URI base, it identifies the URI of a schema to use.
Because the values of "$ref" and "$recursiveRef" are URI References,
a schema can be externalised or divided across multiple files,
and recursive structures can be validated through
self-reference.
</t>
<t>
As the value of "$ref" is a URI Reference, this allows the possibility to externalise or
divide a schema across multiple files, and provides the ability to validate recursive structures
through self-reference.
</t>
<t>
The URI is not a network locator, only an identifier. A schema need not be
downloadable from the address if it is a network-addressable URL, and
implementations SHOULD NOT assume they should perform a network operation when they
encounter a network-addressable URI.
</t>
<t>
A schema MUST NOT be run into an infinite loop against a schema. For example, if two
schemas "#alice" and "#bob" both have an "allOf" property that refers to the other,
a naive validator might get stuck in an infinite recursive loop trying to validate
the instance.
Schemas SHOULD NOT make use of infinite recursive nesting like this; the behavior is
undefined.
The resolved URI produced by these keywords is not necessarily a network
locator, only an identifier. A schema need not be downloadable from the
address if it is a network-addressable URL, and implementations SHOULD NOT
assume they should perform a network operation when they encounter
a network-addressable URI.
</t>

<section title='Direct References with "$ref"' anchor="ref">
<t>
The "$ref" keyword is used to reference a statically identified schema.
</t>
<t>
The value of the "$ref" property MUST be a string which is a URI Reference.
Resolved against the current URI base, it identifies the URI of a schema
to use.
</t>
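<figure>
<preamble>
As an illustrative sketch only (the property names "a" and "b" are
hypothetical, and the example.com URIs are placeholders), the "$ref"
values below are resolved against the base URI
"https://example.com/extension" using the reference resolution rules
of RFC 3986. The relative reference "original" would identify
"https://example.com/original", while "#" would identify
"https://example.com/extension#".
</preamble>
<artwork>
<![CDATA[
{
    "$id": "https://example.com/extension",

    "properties": {
        "a": {
            "$ref": "original"
        },
        "b": {
            "$ref": "#"
        }
    }
}
]]>
</artwork>
</figure>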
</section>

<section title='Recursive References with "$recursiveRef" and "$recursiveAnchor"'>
<t>
The "$recursiveRef" and "$recursiveAnchor" keywords are used to construct
extensible recursive schemas. A recursive schema is one that has
a reference to its own root, identified by the empty fragment
URI reference ("#").
</t>
<t>
Extending a recursive schema with "$ref" alone involves redefining all
recursive references in the source schema to point to the root of the
extension. This produces the correct recursive behavior in the extension,
which is that all recursion should reference the root of the extension.
</t>
<figure>
<preamble>
Consider the following two schemas. The first schema, identified
as "original" as it is the schema to be extended, describes
an object with one string property and one recursive reference
property, "r". The second schema, identified as "extension",
references the first, and describes an additional "things" property,
which is an array of recursive references.
It also repeats the description of "r" from the original schema.
</preamble>
<artwork>
<![CDATA[
{
    "$schema": "http://json-schema.org/draft-08/schema#",
    "$id": "https://example.com/original",

    "properties": {
        "name": {
            "type": "string"
        },
        "r": {
            "$ref": "#"
        }
    }
}

{
    "$schema": "http://json-schema.org/draft-08/schema#",
    "$id": "https://example.com/extension",

    "$ref": "original",
    "properties": {
        "r": {
            "$ref": "#"
        },
        "things": {
            "type": "array",
            "items": {
                "$ref": "#"
            }
        }
    }
}
]]>
</artwork>
<postamble>
This apparent duplication is important because the "$ref" of "#" in
the extension resolves to "https://example.com/extension#". For an
instance validated against the extension schema, the value
of "r" must therefore be valid according to the extension, and not
just according to the original schema, where "r" was first described.
</postamble>
Member

Checking my understanding:
The problem is that you want the $ref in the base schema to reference not just itself statically, but the root schema, whatever the root schema is when referenced by another schema, as if it was transcluded / included. Right?

Contributor Author

Um... maybe? The whole transcluded/included thing does not really help me.

</figure>
<t>
This approach is fine for a single recursive field, but the more
complicated the original schema, the more redefinitions are necessary
in the extension. This leads to a verbose and error-prone extension,
which must be kept synchronized with the original schema if the
original changes its recursive fields.
This approach can be seen in the meta-schema for JSON Hyper-Schema
in all prior drafts.
</t>
<section title='Enabling Recursion with "$recursiveAnchor"'>
<t>
The desired behavior is for the recursive reference, "r", in the
original schema to resolve to the original schema when that
is the only schema being used, but to resolve to the extension
schema when using the extension. Then there would be no need
to redefine the "r" property, or others like it, in the extension.
</t>
<t>
In order to create a recursive reference, we must do three things:
<list>
<t>
In our original schema, indicate that the schema author
intends for it to be extensible recursively.
</t>
<t>
In our extension schema, indicate that it is intended
to be a recursive extension.
</t>
<t>
Use a reference keyword that explicitly activates the
recursive behavior at the point of reference.
</t>
</list>
These three things together ensure that all schema authors
are intentionally constructing a recursive extension, which in
turn lets authors using the regular "$ref" keyword be confident
that it behaves only as it appears to, using lexical scoping.
</t>
<t>
The "$recursiveAnchor" keyword is how schema authors indicate
that a schema can be extended recursively, and be a recursive
schema. This keyword MAY appear in the root schema of a
schema document, and MUST NOT appear in any subschema.
</t>
<t>
The value of "$recursiveAnchor" MUST be of type boolean, and
MUST be true. The value false is reserved for possible future use.
</t>
</section>
<section title='Dynamically recursive references with "$recursiveRef"'>
<t>
The "$recursiveRef" keyword behaves identically to "$ref", except
that if the referenced schema has "$recursiveAnchor" set to true,
then the implementation MUST examine the dynamic scope for the
outermost (first seen) schema document with "$recursiveAnchor"
set to true. If such a schema document exists, then the target
of the "$recursiveRef" MUST be set to that document's URI, in
place of the URI produced by the rules for "$ref".
</t>
<t>
Note that if the schema referenced by "$recursiveRef" does not
contain "$recursiveAnchor" set to true, or if there are no other
"$recursiveAnchor" keywords set to true anywhere further back in
the dynamic scope, then "$recursiveRef"'s behavior is identical
to that of "$ref".
</t>
<figure>
<preamble>
With this in mind, we can rewrite the previous example:
</preamble>
<artwork>
<![CDATA[
{
    "$schema": "http://json-schema.org/draft-08/schema#",
    "$id": "https://example.com/original",
    "$recursiveAnchor": true,

    "properties": {
        "name": {
            "type": "string"
        },
        "r": {
            "$recursiveRef": "#"
        }
    }
}

{
    "$schema": "http://json-schema.org/draft-08/schema#",
    "$id": "https://example.com/extension",
    "$recursiveAnchor": true,

    "$ref": "original",
    "properties": {
        "things": {
            "type": "array",
            "items": {
                "$recursiveRef": "#"
            }
        }
    }
}
]]>
</artwork>
<postamble>
Note that the "r" property no longer appears in the
extension schema. Instead, all recursive "$ref"s (those with
the value "#") have been changed to "$recursiveRef"s, and both
schemas have "$recursiveAnchor" set to true in their root schema.
</postamble>
</figure>
<t>
When using the original schema on its own, there is no change
in behavior. The "$recursiveRef" does lead to a schema where
"$recursiveAnchor" is set to true, but since the original schema
is the only schema document in the dynamic scope (it references
itself, and does not reference any other schema documents), the
behavior is effectively the same as "$ref".
</t>
<t>
When using the extension schema, the "$recursiveRef" within
that schema (for the array items within "things") also effectively
behaves like "$ref". The extension schema is the outermost
dynamic scope, so the reference target is not changed.
</t>
<t>
In contrast, when using the extension schema, the "$recursiveRef"
for "r" in the original schema now behaves differently. Its
initial target is the root schema of the original schema document,
which has "$recursiveAnchor" set to true. In this case, the
outermost dynamic scope that also has "$recursiveAnchor" set to
true is the extension schema. So when using the extension schema,
"r"'s reference in the original schema will resolve to
"https://example.com/extension#", not "https://example.com/original#".
</t>
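<figure>
<preamble>
As an illustrative sketch (this instance is hypothetical and is
not taken from the schemas above), consider how the following
instance would be expected to evaluate under the rules just
described:
</preamble>
<artwork>
<![CDATA[
{
    "name": "example",
    "r": {
        "things": "not an array"
    }
}
]]>
</artwork>
<postamble>
Validated against the original schema on its own, this instance
would be expected to pass, because "r" resolves to the original
schema, which places no constraint on "things". Validated against
the extension schema, it would be expected to fail, because "r"
resolves to "https://example.com/extension#", which requires
"things" to be an array.
</postamble>
</figure>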
</section>
</section>

<section title="Guarding Against Infinite Recursion">
<t>
A schema MUST NOT be run into an infinite loop against an instance. For
Member

"A schema MUST NOT be run into an infinite loop against an instance."

This doesn't really make sense. The explanation afterward does, but I don't get this sentence.

Contributor Author

I believe I just copy-pasted that from wherever it was before, so if we want to consider reworking it please file a new issue on it. I'm fine with discussing it, it's just not changed here so I don't want to add to the PR.

example, if two schemas "#alice" and "#bob" both have an "allOf" property
Member

Can we remove the allOf usage from this example now that $ref can be used alongside other keywords?

Contributor Author

I think it's still valid as written, so as with the previous comment, if you want to figure out a better way to do this whole paragraph let's discuss in an issue or separate PR.

that refers to the other, a naive validator might get stuck in an infinite
recursive loop trying to validate the instance. Schemas SHOULD NOT make
use of infinite recursive nesting like this; the behavior is undefined.
</t>
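<figure>
<preamble>
As a hypothetical sketch of the situation described above, the
following schema contains two subschemas whose "allOf" keywords
refer to each other. Declaring the "#alice" and "#bob" identifiers
with "$id" under "definitions" is an assumption carried over from
earlier drafts; the names and layout are illustrative only.
</preamble>
<artwork>
<![CDATA[
{
    "$ref": "#alice",

    "definitions": {
        "alice": {
            "$id": "#alice",
            "allOf": [{ "$ref": "#bob" }]
        },
        "bob": {
            "$id": "#bob",
            "allOf": [{ "$ref": "#alice" }]
        }
    }
}
]]>
</artwork>
<postamble>
A validator that follows each reference without tracking which
schema and instance location pairs it is already evaluating
could recurse between "#alice" and "#bob" indefinitely.
</postamble>
</figure>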
</section>

<section title="Loading a referenced schema">
<t>
The use of URIs to identify remote schemas does not necessarily mean anything is downloaded,
@@ -1313,7 +1530,7 @@
The application can use the schema location path to determine which
values are which. The values in the feature's immediate "enabled"
property schema are more specific, while the values under the re-usable
schema that is referred to with "$ref" are more generic. The schema
schema that is referenced with "$ref" are more generic. The schema
location path will show whether each value was found by crossing a
"$ref" or not.
</t>