Commit a93b605

Canonical URIs and avoiding shadowed JSON Pointers
This implements the recent decision to strongly discourage using fragments that cross a base URI change (e.g. an "$id" appearance). This is done by describing schemas identified by absolute URIs as resources (per the literal definition of Uniform Resource Identifier), which was done in the previous commit, and declaring the URI resulting from the "$id" to be the resource's canonical URI. While URIs for subschemas with "$id" and their children can be constructed using the base URI from a parent schema, these URIs are non-canonical, and their behavior is undefined. This is on the grounds that switching between embedding and referencing schema resources should behave essentially identically. A difficulty with how annotation collection works in the event of such a switch is noted in a CREF.
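
To illustrate the distinction drawn above, the following is a minimal Python sketch, not part of the commit. The schema document, its URIs, and the helper names (`escape`, `describe`) are hypothetical. It assumes the root schema carries an absolute "$id", treats any object with a string "$id" along the path as an embedded resource, and omits percent-encoding of the fragment. For a given location it computes the canonical URI (relative to the nearest enclosing "$id") and the non-canonical URI built from the root resource's URI, whose behavior this commit leaves undefined.

```python
from urllib.parse import urljoin

# Hypothetical schema document; URIs and property names are illustrative only.
schema = {
    "$id": "https://example.com/root.json",
    "properties": {
        "item": {
            "$id": "item.json",  # relative: resolved against the parent resource
            "properties": {"name": {"type": "string"}},
        }
    },
}


def escape(token):
    # JSON Pointer escaping (RFC 6901): "~" becomes "~0", "/" becomes "~1".
    return token.replace("~", "~0").replace("/", "~1")


def describe(document, tokens):
    """Return (canonical_uri, uri_via_root) for the location at `tokens`."""
    root = document["$id"]  # assumption: the root has an absolute "$id"
    base, relative_tokens = root, []
    node = document
    for token in tokens:
        node = node[token]
        relative_tokens.append(token)
        if isinstance(node, dict) and isinstance(node.get("$id"), str):
            # A new schema resource starts here: its "$id" becomes the base
            # URI, and JSON Pointers restart from this object.
            base = urljoin(base, node["$id"])
            relative_tokens = []

    def with_pointer(uri, toks):
        if not toks:
            return uri
        return uri + "#/" + "/".join(escape(t) for t in toks)

    return with_pointer(base, relative_tokens), with_pointer(root, tokens)


canonical, via_root = describe(schema, ["properties", "item", "properties", "name"])
print(canonical)  # https://example.com/item.json#/properties/name
print(via_root)   # https://example.com/root.json#/properties/item/properties/name
```

If the "item" subschema were moved to its own document, the first URI would still identify the same location, while the second would no longer resolve to anything. That is the motivation given above for treating only the first as canonical.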
1 parent eeec848 commit a93b605

1 file changed, +72 -43 lines changed

jsonschema-core.xml

Lines changed: 72 additions & 43 deletions
@@ -1467,6 +1467,11 @@
 identified by <xref target="RFC3986">URI</xref>, and can embed references
 to other schemas by specifying their URI.
 </t>
+<t>
+Several keywords can accept a relative <xref target="RFC3986">URI-reference</xref>,
+or a value used to construct a relative URI-reference. For these keywords,
+it is necessary to establish a base URI in order to resolve the reference.
+</t>
 
 <section title="Initial Base URI" anchor="initial-base">
 <t>
@@ -1498,63 +1503,87 @@
 
 <section title='The "$id" Keyword' anchor="id-keyword">
 <t>
-The "$id" keyword defines a URI for the schema, and the base URI that
-other URI references within the schema are resolved against.
-A subschema's "$id" is resolved against the base URI of its parent schema.
-If no parent schema defines an explicit base URI with "$id", the base URI
-is that of the entire document, as determined per
-<xref target="RFC3986">RFC 3986 section 5</xref>.
+The "$id" keyword identifies a schema resource with its
+<xref target="RFC6596">canonical</xref> URI.
+</t>
+<t>
+Note that this URI is an identifier and not necessarily a network locator.
+In the case of a network-addressable URL, a schema need not be downloadable
+from its canonical URI.
 </t>
 <t>
 If present, the value for this keyword MUST be a string, and MUST represent a
-valid <xref target="RFC3986">URI-reference</xref>.
-This value SHOULD be normalized, and SHOULD NOT be an empty fragment &lt;#&gt;
-or an empty string &lt;&gt;.
+valid <xref target="RFC3986">URI-reference</xref>. This URI-reference
+SHOULD be normalized, and MUST resolve to an
+<xref target="RFC3986">absolute-URI</xref> (without a fragment).
+</t>
+<t>
+This URI also serves as the base URI for relative URI-references in keywords
+within the schema resource, in accordance with
+<xref target="RFC3986">RFC 3986 section 5.1.1</xref> regarding base URIs
+embedded in content.
+</t>
+<t>
+The presence of "$id" in a subschema indicates that the subschema constitutes
+a distinct schema resource within a single schema document. Furthermore,
+in accordance with <xref target="RFC3986">RFC 3986 section 5.1.2</xref>
+regarding encapsulating entities, if an "$id" in a subschema is a relative
+URI-reference, the base URI for resolving that reference is the URI of
+the parent schema resource.
+</t>
+<t>
+If no parent schema object explicitly identifies itself as a resource
+with "$id", the base URI is that of the entire document, as established
+by the steps given in the <xref target="initial-base">previous section.</xref>
 </t>
 <section title="Identifying the root schema">
 <t>
-The root schema of a JSON Schema document SHOULD contain an "$id" keyword with
-an <xref target="RFC3986">absolute-URI</xref> (containing a scheme, but no fragment),
-or this absolute URI but with an empty fragment.
-<!-- All of the standard meta-schemas use an empty fragment in their id/$id values. -->
+The root schema of a JSON Schema document SHOULD contain an "$id" keyword
+with an <xref target="RFC3986">absolute-URI</xref> (containing a scheme,
+but no fragment).
 </t>
 </section>
-<section title="Changing the base URI within a schema file">
+<section title="JSON Pointer fragments and embedded schema resources">
 <t>
-When an "$id" sets the base URI, the object containing that "$id" and all of
-its subschemas can be identified by using a JSON Pointer fragment starting
-from that location. This is true even of subschemas that further change the
-base URI. Therefore, a single subschema may be accessible by multiple URIs,
-each consisting of base URI declared in the subschema or a parent, along with
-a JSON Pointer fragment identifying the path from the schema object that
-declares the base to the subschema being identified. Examples of this are
-shown in section <xref target="idExamples" format="counter"></xref>.
+Since JSON Pointer URI fragments are constructed based on the structure
+of the schema document, an embedded schema resource and its subschemas
+can be identified by JSON Pointer fragments relative to either its own
+canonical URI, or relative to the containing resource's URI.
 </t>
-</section>
-<section title="Location-independent identifiers">
 <t>
-Using JSON Pointer fragments requires knowledge of the structure of the schema.
-When writing schema documents with the intention to provide re-usable
-schemas, it may be preferable to use a plain name fragment that is not tied to
-any particular structural location. This allows a subschema to be relocated
-without requiring JSON Pointer references to be updated.
+Conceptually, a set of linked schemas should behave identically whether
+each schema is a separate document connected with
+<xref target="references">schema references</xref>, or is structured as
+a single document with one or more schema resources embedded as
+subschemas.
+<cref>
+Note that when using schema references, the reference keyword
+appears in the runtime path in the standard output format for
+errors and annotations. This means that while the validation
+outcome is unchanged when switching between an embedded schema
+resource and a referenced one, the runtime paths of annotations
+do change. A future draft may allow directly replacing the value
+of the reference keyword with its target while leaving the keyword
+itself in place in order to make embedding vs referencing transparent
+to annotation collection. This would allow replacing
+{"$ref": "/foo"} with {"$ref": {"type": "string"}}, assuming
+the schema at "verb">"/foo" consists of just a "type" keyword with
+value "string". Feedback on this topic is highly encouraged.
+</cref>
 </t>
 <t>
-To specify such a subschema identifier,
-the "$id" keyword is set to a URI reference with a plain name fragment (not a JSON Pointer fragment).
-This value MUST begin with the number sign that specifies a fragment ("#"),
-then a letter ([A-Za-z]),
-followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"),
-colons (":"), or periods (".").
+Since URIs involving JSON Pointer fragments relative to the parent
+schema resource's URI cease to be valid when the embedded schema
+is moved to a separate document and referenced, applications and schemas
+SHOULD NOT use such URIs to identify embedded schema resources or
+locations within them. The effect of using such URIs is undefined.
+Implementations MAY produce an error requiring that the canonical
+URI for the embedded resource be used.
 </t>
 <t>
-The effect of using a fragment in "$id" that isn't blank or doesn't follow the
-plain name syntax is undefined.
-<cref>
-How should an "$id" URI reference containing a fragment with other components
-be interpreted? There are two cases: when the other components match
-the current base URI and when they change the base URI.
-</cref>
+Examples of such URIs with undefined behavior, as well as the appropriate
+canonical URIs to use instead, are provided in section
+<xref target="idExamples" format="counter"></xref>.
 </t>
 </section>
 <section title="Schema identification examples" anchor="idExamples">
@@ -1639,7 +1668,7 @@
 </section>
 </section>
 
-<section title="Schema References">
+<section title="Schema References" anchor="references">
 <t>
 Several keywords can be used to reference a schema which is to be applied to the
 current instance location. "$ref" and "$recursiveRef" are applicator
