Commit 1695d47

Add support for passing an explicit base URI
Include an acceptance test for --base-uri.
Update docs and add `--base-uri` to the end of usage docs.
1 parent e5b9f68 commit 1695d47

5 files changed: +152 -70 lines changed

docs/usage.rst

Lines changed: 74 additions & 64 deletions
@@ -40,8 +40,22 @@ Detailed helptext is always available interactively via
 the error and exit. Use ``--traceback-mode full`` to request the full traceback
 be printed, for debugging and troubleshooting.

-Other Schema Options
---------------------
+Environment Variables
+---------------------
+
+The following environment variables are supported.
+
+.. list-table:: Environment Variables
+   :widths: 15 30
+   :header-rows: 1
+
+   * - Name
+     - Description
+   * - ``NO_COLOR``
+     - Set ``NO_COLOR=1`` to explicitly turn off colorized output.
+
+Schema Selection Options
+------------------------

 No matter what usage form is used, a schema must be specified.


@@ -113,68 +127,6 @@ The following options control caching behaviors.
      - The name to use for caching a remote schema.
        Defaults to using the last slash-delimited part of the URI.

-Environment Variables
----------------------
-
-The following environment variables are supported.
-
-.. list-table:: Environment Variables
-   :widths: 15 30
-   :header-rows: 1
-
-   * - Name
-     - Description
-   * - ``NO_COLOR``
-     - Set ``NO_COLOR=1`` to explicitly turn off colorized output.
-
-Parsing Options
----------------
-
-``--default-filetype``
-~~~~~~~~~~~~~~~~~~~~~~
-
-The default filetype to assume on instance files when they are detected neither
-as JSON nor as YAML.
-
-For example, pass ``--default-filetype yaml`` to instruct that files which have
-no extension should be treated as YAML.
-
-By default, this is not set and files without a detected type of JSON or YAML
-will fail.
-
-``--data-transform``
-~~~~~~~~~~~~~~~~~~~~
-
-``--data-transform`` applies a transformation to instancefiles before they are
-checked. The following transforms are supported:
-
-- ``azure-pipelines``:
-    "Unpack" compile-time expressions for Azure Pipelines files, skipping them
-    for the purposes of validation. This transformation is based on Microsoft's
-    language-server for VSCode and how it handles expressions
-
-- ``gitlab-ci``:
-    Handle ``!reference`` tags in YAML data for gitlab-ci files. This transform
-    has no effect if the data is not being loaded from YAML, and it does not
-    interpret ``!reference`` usages -- it only expands them to lists of strings
-    to pass schema validation
-
-``--fill-defaults``
--------------------
-
-JSON Schema specifies the ``"default"`` keyword as potentially meaningful for
-consumers of schemas, but not for validators. Therefore, the default behavior
-for ``check-jsonschema`` is to ignore ``"default"``.
-
-``--fill-defaults`` changes this behavior, filling in ``"default"`` values
-whenever they are encountered prior to validation.
-
-.. warning::
-
-    There are many schemas which make the meaning of ``"default"`` unclear.
-    In particular, the behavior of ``check-jsonschema`` is undefined when multiple
-    defaults are specified via ``anyOf``, ``oneOf``, or other forms of polymorphism.
-
 "format" Validation Options
 ---------------------------

@@ -253,3 +205,61 @@ follows:
        always passes. Otherwise, check validity in the python engine.
    * - python
      - Require the regex to be valid in python regex syntax.
+
+Other Options
+--------------
+
+``--default-filetype``
+~~~~~~~~~~~~~~~~~~~~~~
+
+The default filetype to assume on instance files when they are detected neither
+as JSON nor as YAML.
+
+For example, pass ``--default-filetype yaml`` to instruct that files which have
+no extension should be treated as YAML.
+
+By default, this is not set and files without a detected type of JSON or YAML
+will fail.
+
+``--data-transform``
+~~~~~~~~~~~~~~~~~~~~
+
+``--data-transform`` applies a transformation to instancefiles before they are
+checked. The following transforms are supported:
+
+- ``azure-pipelines``:
+    "Unpack" compile-time expressions for Azure Pipelines files, skipping them
+    for the purposes of validation. This transformation is based on Microsoft's
+    language-server for VSCode and how it handles expressions
+
+- ``gitlab-ci``:
+    Handle ``!reference`` tags in YAML data for gitlab-ci files. This transform
+    has no effect if the data is not being loaded from YAML, and it does not
+    interpret ``!reference`` usages -- it only expands them to lists of strings
+    to pass schema validation
+
+``--fill-defaults``
+~~~~~~~~~~~~~~~~~~~
+
+JSON Schema specifies the ``"default"`` keyword as potentially meaningful for
+consumers of schemas, but not for validators. Therefore, the default behavior
+for ``check-jsonschema`` is to ignore ``"default"``.
+
+``--fill-defaults`` changes this behavior, filling in ``"default"`` values
+whenever they are encountered prior to validation.
+
+.. warning::
+
+    There are many schemas which make the meaning of ``"default"`` unclear.
+    In particular, the behavior of ``check-jsonschema`` is undefined when multiple
+    defaults are specified via ``anyOf``, ``oneOf``, or other forms of polymorphism.
+
+``--base-uri``
+~~~~~~~~~~~~~~
+
+``check-jsonschema`` defaults to using the ``"$id"`` of the schema as the base
+URI for ``$ref`` resolution, falling back to the retrieval URI if ``"$id"`` is
+not set.
+
+``--base-uri`` overrides this behavior, setting a custom base URI for ``$ref``
+resolution.
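
The resolution order described in the ``--base-uri`` docs can be illustrated with a minimal sketch that uses only the Python standard library. This is not the code ``check-jsonschema`` runs: ``effective_base_uri`` is a hypothetical helper introduced here for illustration, and the URLs are borrowed from the acceptance test in this commit.

```python
from __future__ import annotations

from urllib.parse import urljoin


def effective_base_uri(
    schema: dict, retrieval_uri: str | None, override: str | None = None
) -> str | None:
    """Pick a base URI: explicit override first, then "$id", then retrieval URI."""
    if override is not None:
        return override
    return schema.get("$id", retrieval_uri)


retrieval_uri = "https://example.org/retrieval-and-in-schema-only/schemas/main"
schema = {
    "$id": retrieval_uri,
    "properties": {"title": {"$ref": "./title_schema.json"}},
}
ref = schema["properties"]["title"]["$ref"]

# Without an override, the relative $ref resolves against "$id".
default_target = urljoin(effective_base_uri(schema, retrieval_uri), ref)

# With an override, the same $ref resolves against the custom base URI.
override_target = urljoin(
    effective_base_uri(schema, retrieval_uri, "https://example.org/schemas/main"),
    ref,
)

print(default_target)
print(override_target)
```

In the sketch, the overridden base URI moves the relative ``$ref`` to ``https://example.org/schemas/title_schema.json``, which is the URL the acceptance test registers a mocked response for.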

src/check_jsonschema/cli/main_command.py

Lines changed: 18 additions & 3 deletions
@@ -94,6 +94,14 @@ def pretty_helptext_list(values: list[str] | tuple[str, ...]) -> str:
         "it will be downloaded and cached locally based on mtime."
     ),
 )
+@click.option(
+    "--base-uri",
+    help=(
+        "Override the base URI for the schema. The default behavior is to "
+        "follow the behavior specified by the JSON Schema spec, which is to "
+        "prefer an explicit '$id' and failover to the retrieval URI."
+    ),
+)
 @click.option(
     "--builtin-schema",
     help="The name of an internal schema to use for '--schemafile'",
@@ -212,6 +220,7 @@ def main(
     *,
     schemafile: str | None,
     builtin_schema: str | None,
+    base_uri: str | None,
     check_metaschema: bool,
     no_cache: bool,
     cache_filename: str | None,
@@ -230,6 +239,7 @@ def main(
     args = ParseResult()

     args.set_schema(schemafile, builtin_schema, check_metaschema)
+    args.base_uri = base_uri
     args.instancefiles = instancefiles

     normalized_disable_formats: tuple[str, ...] = tuple(
@@ -264,13 +274,18 @@ def main(

 def build_schema_loader(args: ParseResult) -> SchemaLoaderBase:
     if args.schema_mode == SchemaLoadingMode.metaschema:
-        return MetaSchemaLoader()
+        return MetaSchemaLoader(base_uri=args.base_uri)
     elif args.schema_mode == SchemaLoadingMode.builtin:
         assert args.schema_path is not None
-        return BuiltinSchemaLoader(args.schema_path)
+        return BuiltinSchemaLoader(args.schema_path, base_uri=args.base_uri)
     elif args.schema_mode == SchemaLoadingMode.filepath:
         assert args.schema_path is not None
-        return SchemaLoader(args.schema_path, args.cache_filename, args.disable_cache)
+        return SchemaLoader(
+            args.schema_path,
+            args.cache_filename,
+            args.disable_cache,
+            base_uri=args.base_uri,
+        )
     else:
         raise NotImplementedError("no valid schema option provided")

276291

src/check_jsonschema/cli/parse_result.py

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ def __init__(self) -> None:
         # primary options: schema + instances
         self.schema_mode: SchemaLoadingMode = SchemaLoadingMode.filepath
         self.schema_path: str | None = None
+        self.base_uri: str | None = None
         self.instancefiles: tuple[str, ...] = ()
         # cache controls
         self.disable_cache: bool = False

src/check_jsonschema/schema_loader/main.py

Lines changed: 19 additions & 3 deletions
@@ -61,11 +61,13 @@ def __init__(
         schemafile: str,
         cache_filename: str | None = None,
         disable_cache: bool = False,
+        base_uri: str | None = None,
     ) -> None:
         # record input parameters (these are not to be modified)
         self.schemafile = schemafile
         self.cache_filename = cache_filename
         self.disable_cache = disable_cache
+        self.base_uri = base_uri

         # if the schema location is a URL, which may include a file:// URL, parse it
         self.url_info = None
@@ -104,7 +106,10 @@ def get_schema_retrieval_uri(self) -> str | None:
         return self.reader.get_retrieval_uri()

     def get_schema(self) -> dict[str, t.Any]:
-        return self.reader.read_schema()
+        data = self.reader.read_schema()
+        if self.base_uri is not None:
+            data["$id"] = self.base_uri
+        return data

     def get_validator(
         self,
@@ -145,18 +150,29 @@ def get_validator(


 class BuiltinSchemaLoader(SchemaLoader):
-    def __init__(self, schema_name: str) -> None:
+    def __init__(self, schema_name: str, base_uri: str | None = None) -> None:
         self.schema_name = schema_name
+        self.base_uri = base_uri
         self._parsers = ParserSet()

     def get_schema_retrieval_uri(self) -> str | None:
         return None

     def get_schema(self) -> dict[str, t.Any]:
-        return get_builtin_schema(self.schema_name)
+        data = get_builtin_schema(self.schema_name)
+        if self.base_uri is not None:
+            data["$id"] = self.base_uri
+        return data


 class MetaSchemaLoader(SchemaLoaderBase):
+    def __init__(self, base_uri: str | None = None) -> None:
+        if base_uri is not None:
+            raise NotImplementedError(
+                "'--base-uri' was used with '--metaschema'. "
+                "This combination is not supported."
+            )
+
     def get_validator(
         self,
         path: pathlib.Path,
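
Both `get_schema` changes in this file assign `"$id"` directly on the dictionary they were handed. Whether that dictionary is ever shared or cached between calls is not visible in this diff, so the following is only a sketch of a copy-based alternative under that assumption. `with_base_uri` is a hypothetical helper and is not part of `check-jsonschema`.

```python
from __future__ import annotations

import typing as t


def with_base_uri(
    schema: dict[str, t.Any], base_uri: str | None
) -> dict[str, t.Any]:
    """Return the schema with "$id" set to base_uri, leaving the input unmodified."""
    if base_uri is None:
        # No override requested: hand back the schema unchanged.
        return schema
    # A shallow copy is enough because only the top-level "$id" key changes.
    return {**schema, "$id": base_uri}


original = {"$id": "https://example.org/old", "type": "object"}
updated = with_base_uri(original, "https://example.org/schemas/main")

print(updated["$id"])
print(original["$id"])
```

The trade-off is one extra top-level dictionary per load; nested subschemas are still shared with the input.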

tests/acceptance/test_remote_ref_resolution.py

Lines changed: 40 additions & 0 deletions
@@ -141,3 +141,43 @@ def test_ref_resolution_does_not_callout_for_absolute_ref_to_retrieval_uri(
         assert result.exit_code == 0, output
     else:
         assert result.exit_code == 1, output
+
+
+# this test ensures that `$id` is overwritten when `--base-uri` is used
+@pytest.mark.parametrize("check_passes", (True, False))
+def test_ref_resolution_with_custom_base_uri(run_line, tmp_path, check_passes):
+    retrieval_uri = "https://example.org/retrieval-and-in-schema-only/schemas/main"
+    explicit_base_uri = "https://example.org/schemas/main"
+    main_schema = {
+        "$id": retrieval_uri,
+        "$schema": "http://json-schema.org/draft-07/schema",
+        "properties": {
+            "title": {"$ref": "./title_schema.json"},
+        },
+        "additionalProperties": False,
+    }
+    title_schema = {"type": "string"}
+
+    responses.add("GET", retrieval_uri, json=main_schema)
+    responses.add(
+        "GET", "https://example.org/schemas/title_schema.json", json=title_schema
+    )
+
+    instance_path = tmp_path / "instance.json"
+    instance_path.write_text(json.dumps({"title": "doc one" if check_passes else 2}))
+
+    result = run_line(
+        [
+            "check-jsonschema",
+            "--schemafile",
+            retrieval_uri,
+            "--base-uri",
+            explicit_base_uri,
+            str(instance_path),
+        ]
+    )
+    output = f"\nstdout:\n{result.stdout}\n\nstderr:\n{result.stderr}"
+    if check_passes:
+        assert result.exit_code == 0, output
+    else:
+        assert result.exit_code == 1, output
