Commit 464b603

add post on updating vocabs
1 parent edebbe0 commit 464b603

5 files changed

+128
-9
lines changed

.jekyll-metadata

16.3 KB
Binary file not shown.
Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
1+
---
2+
title: "Why I'm Updating My JSON Schema Vocabularies"
3+
date: 2023-12-08 09:00:00 +1200
4+
tags: [json-schema, vocab, vocabulary]
5+
toc: true
6+
pin: false
7+
---
8+
9+
Both of the vocabularies defined by `json-everything` are getting a facelift.
10+
11+
- The data vocabulary is getting some new functionality.
12+
- The UniqueKeys vocabulary is being deprecated in favor of the new Array Extensions vocabulary.
13+
14+
I'm also doing a bit of reorganization with the meta-schemas, which I'll get into.
15+
16+
## Data vocabulary updates
17+
18+
The data vocabulary is actually in its second version already. I don't keep a link to the first version on the documentation site, but the [file](https://github.com/gregsdennis/json-everything-docs/blob/main/_docs/schema/vocabs/data.md) is still in the GitHub repo.
19+
20+
The second version (2022) clarified some things around how URIs were supposed to be resolved, improved how different data sources could be referenced more explicitly, and added support for Relative JSON Pointers. Most importantly, it disallowed the use of Core vocabulary keywords, which had previously allowed the formed schema to behave differently from its host, introducing some security risks.
21+
22+
This [new version](https://docs.json-everything.net/schema/vocabs/data-2023/) (2023) merely builds on the 2022 version by adding:
23+
24+
- the `optionalData` keyword, which functions the same as `data` except that, if a reference fails to resolve, the keyword is ignored rather than validation halting.
25+
- JSON Path references, which can collect data spread over multiple locations within the instance. I think this is really powerful; there's an example in the spec.
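
As a rough illustration of the first point, here is a sketch of a schema that uses `optionalData`. It assumes the keyword takes the same form as `data` (an object mapping keyword names to references) and that a plain JSON Pointer such as `/minValue` is resolved against the instance. The property names `foo` and `minValue` are made up for the example, so check the spec for the exact rules.

```jsonc
{
  // the Data 2020-12 extension meta-schema described later in this post
  "$schema": "https://json-everything.net/schema/meta/data-2023",
  "type": "object",
  "properties": {
    "minValue": { "type": "integer" },
    "foo": {
      "type": "integer",
      // assumed behavior: if /minValue is missing from the instance,
      // `minimum` is simply not applied
      "optionalData": {
        "minimum": "/minValue"
      }
    }
  }
}
```

Under those assumptions, an instance like `{ "foo": 5 }` would only have `foo` checked against `type`, whereas the same schema written with `data` would halt validation because `/minValue` cannot be resolved.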
26+
27+
## Introducing the Array Extensions vocabulary
28+
29+
The `uniqueKeys` keyword needed some updates anyway. It was the first vocabulary extension I wrote, and some of the language updates that I made to the data vocabulary in its second edition never made it over here. But I didn't just want to update language or URIs; I wanted a functional change.
30+
31+
However, the keyword itself doesn't really need to be changed. I think it's good as it is. So instead, I'm adding a new keyword, which means it can't just be the "unique keys" vocabulary anymore. It needs a new name that better reflects all of the defined functionality.
32+
33+
So I'm deprecating it and replacing it with the new [Array Extensions vocabulary](https://docs.json-everything.net/schema/vocabs/array-ext/), which does two things:
34+
35+
- cleans up some language around `uniqueKeys` without changing the functionality.
36+
- adds the `ordering` keyword to validate that items in an array are in an increasing or decreasing sequence based on one or more values within each item.
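
To give a feel for how the two keywords might sit side by side, here is a hedged sketch. It assumes `uniqueKeys` takes an array of JSON Pointers and that `ordering` takes an array of specifiers with `by` and `direction` members; the `ordering` shape in particular is my reading and should be confirmed against the vocabulary spec. The `$schema` URI and the `id` and `date` properties are hypothetical.

```jsonc
{
  // hypothetical extension meta-schema $id; see the vocabulary docs for the real one
  "$schema": "https://example.com/meta/array-ext",
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "id": { "type": "integer" },
      "date": { "type": "string" }
    }
  },
  // no two items may share the same value at /id
  "uniqueKeys": [ "/id" ],
  // assumed shape: a list of specifiers, each pointing at a value within the item
  "ordering": [
    { "by": "/date", "direction": "asc" }
  ]
}
```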
37+
38+
## Meta-schema rework
39+
40+
I've recently had a few discussions ([here](https://github.com/orgs/json-schema-org/discussions/510) and [here](https://github.com/orgs/json-schema-org/discussions/511)) with some JSON Schema colleagues regarding the "proper" way to make a meta-schema for a vocabulary, and it seems my original approach was a little shortsighted.
41+
42+
When I created my meta-schemas, I simply created a 2020-12 extension meta-schema. It's straightforward and gets the job done, but it's not very useful if you want to extend 2020-12 with multiple vocabularies, e.g. if you want to use both Data and UniqueKeys.
43+
44+
```jsonc
{
  "$id": "https://json-everything.net/meta/data-2022",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$vocabulary": {
    // <core vocabs>
    "https://json-everything.net/vocabs-data-2022": true
  },
  "$dynamicAnchor": "meta",
  "title": "Referenced data meta-schema",
  "allOf": [
    // reference the 2020-12 meta-schema
    { "$ref": "https://json-schema.org/draft/2020-12/schema" }
  ],
  "properties": {
    "data": {
      // data keyword definition
    },
    "optionalData": {
      // optionalData keyword definition (it's the same as data)
    }
  }
}
```
68+
69+
This isn't _wrong_, but it could be done better.
70+
71+
Instead of having a single meta-schema that both validates the keywords and extends 2020-12 to use the vocabulary, we separate those purposes. (Feels a lot like SRP to me.)
72+
73+
So now we have a vocabulary meta-schema, which only serves to validate that the keyword values are syntactically correct, and a separate draft meta-schema extension which references it.
74+
75+
The new Data vocabulary meta-schema looks like this:
76+
77+
```jsonc
{
  "$id": "https://json-everything.net/schema/meta/vocab/data-2023",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$defs": {
    "formedSchema": {
      // data keyword definition
    }
  },
  "title": "Referenced data meta-schema",
  "properties": {
    "data": { "$ref": "#/$defs/formedSchema" },
    "optionalData": { "$ref": "#/$defs/formedSchema" }
  }
}
```
93+
94+
The `$vocabulary`, `$dynamicAnchor`, and reference to the 2020-12 meta-schema have all been removed as they're not necessary to validate the syntax of the vocabulary's keywords.
95+
96+
And the new Data 2020-12 extension meta-schema is this:
97+
98+
```jsonc
{
  "$id": "https://json-everything.net/schema/meta/data-2023",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$vocabulary": {
    // <core vocabs>
    "https://docs.json-everything.net/schema/vocabs/data-2023": true
  },
  "$dynamicAnchor": "meta",
  "title": "Data 2020-12 meta-schema",
  "allOf": [
    { "$ref": "https://json-schema.org/draft/2020-12/schema" },
    { "$ref": "https://json-everything.net/schema/meta/vocab/data-2023" }
  ]
}
```
114+
115+
The keyword definition is removed and the vocab meta-schema is referenced. [That's how the 2020-12 meta-schemas did it](https://www.youtube.com/watch?v=9UzxfhRznpU), and it's much more reusable this way.
116+
117+
> The Array Extensions vocabulary meta-schemas are also built this new way.
118+
{: .prompt-info}
119+
120+
Now, if you want to create a 2020-12 meta-schema that also includes the new Array Extensions vocabulary, you can take the above, change the `$id`, and add a reference to the Array Extensions vocabulary meta-schema. This approach allows schema authors to more easily mix and match vocabularies as they need for their application.
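
A combined meta-schema along those lines might look something like the sketch below. The `$id` is a placeholder for a URI you control, and the two Array Extensions URIs are assumptions modeled on the Data ones, so take the real values from the vocabulary documentation.

```jsonc
{
  // hypothetical $id for your own combined meta-schema
  "$id": "https://example.com/meta/data-and-array-ext",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$vocabulary": {
    // <core vocabs>
    "https://docs.json-everything.net/schema/vocabs/data-2023": true,
    // assumed Array Extensions vocabulary URI
    "https://docs.json-everything.net/schema/vocabs/array-ext": true
  },
  "$dynamicAnchor": "meta",
  "title": "Data and Array Extensions 2020-12 meta-schema",
  "allOf": [
    { "$ref": "https://json-schema.org/draft/2020-12/schema" },
    { "$ref": "https://json-everything.net/schema/meta/vocab/data-2023" },
    // assumed Array Extensions vocab meta-schema $id
    { "$ref": "https://json-everything.net/schema/meta/vocab/array-ext" }
  ]
}
```

Because each vocabulary meta-schema only validates its own keywords, adding another vocabulary is just one more `$vocabulary` entry and one more `$ref` in `allOf`.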
121+
122+
## I need validation
123+
124+
The new vocabularies are still a work-in-progress, but they're mostly complete for these versions. I don't think the Data vocabulary will evolve much more, but I do hope to continue adding to the Array Extensions vocabulary as new functionality is conceived and requested. (There's actually a really neat [concept](https://github.com/json-schema-org/json-schema-spec/issues/1323) from Austin Wright, one of the spec authors, regarding patterned item sequence validation.)
125+
126+
Questions and comments are welcome in the `json-everything` GitHub repository, or leave a comment down below.

_posts/2023/2030-11-09-updating-vocabs.md

Lines changed: 0 additions & 7 deletions
This file was deleted.

run.bat

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-bundle exec jekyll serve
+bundle exec jekyll serve --incremental

run.sh

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 #!/bin/bash
 
-bundle exec jekyll serve
+bundle exec jekyll serve --incremental
