Commit 1af1f23

committed
New Security Considerations
1 parent 27dc557 commit 1af1f23

1 file changed

+123
-26
lines changed

specs/jsonschema-core.md

Lines changed: 123 additions & 26 deletions
@@ -2029,32 +2029,129 @@ SHOULD use the terms defined by this document to do so.
20292029

20302030
## Security Considerations {#security}
20312031

2032-
Both schemas and instances are JSON values. As such, all security considerations
2033-
defined in [RFC 8259][rfc8259] apply.
2034-
2035-
Instances and schemas are both frequently written by untrusted third parties, to
2036-
be deployed on public Internet servers. Implementations should take care that
2037-
the parsing and evaluating against schemas does not consume excessive system
2038-
resources. Implementations MUST NOT fall into an infinite loop.
2039-
2040-
A malicious party could cause an implementation to repeatedly collect a copy of
2041-
a very large value as an annotation. Implementations SHOULD guard against
2042-
excessive consumption of system resources in such a scenario.
2043-
2044-
Servers MUST ensure that malicious parties cannot change the functionality of
2045-
existing schemas by uploading a schema with a pre-existing or very similar
2046-
`$id`.
2047-
2048-
Individual JSON Schema extensions are liable to also have their own security
2049-
considerations. Consult the respective specifications for more information.
2050-
2051-
Schema authors should take care with `$comment` contents, as a malicious
2052-
implementation can display them to end-users in violation of a spec, or fail to
2053-
strip them if such behavior is expected.
2054-
2055-
A malicious schema author could place executable code or other dangerous
2056-
material within a `$comment`. Implementations MUST NOT parse or otherwise take
2057-
action based on `$comment` contents.
2032+
While schemas and instances are not always represented as JSON text, they are
2033+
defined in terms of the JSON data model. As such, the security considerations
2034+
defined in [RFC 8259][rfc8259] may still apply in environments where text-based
2035+
representations are used, particularly those considerations related to parsing,
2036+
number precision, and structural limitations.
2037+
2038+
Schemas and instances are frequently authored by untrusted parties.
2039+
Implementations that accept or evaluate such inputs may be exposed to several
2040+
classes of attack, particularly denial-of-service (DoS) by means of resource
2041+
exhaustion.
2042+
2043+
### Nested `anyOf`/`oneOf`
2044+
2045+
One risk for resource exhaustion in JSON Schema arises from the nested use of
2046+
`anyOf` and `oneOf`. While a single combinator keyword with multiple subschemas
2047+
is typically manageable, nesting them causes the number of evaluation paths to
2048+
grow exponentially.
2049+
2050+
For example, a `oneOf` with 5 subschemas, each containing another `oneOf` with 5
2051+
options, results in 25 evaluation paths. Adding a third level increases this to
2052+
125, and so on. Attackers can exploit this by crafting schemas that force
2053+
validators to explore a large number of branches.
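
The growth described above can be checked with a small counting sketch. The helpers `nested_one_of` and `count_paths` are hypothetical names introduced only for illustration, and the count considers only `anyOf`/`oneOf` at the top of each subschema; a real validator's work also depends on short-circuiting and on the other keywords present.

```python
def nested_one_of(breadth, depth):
    """Build a schema with `depth` levels of oneOf, each with `breadth` options."""
    if depth == 0:
        return {"type": "string"}
    return {"oneOf": [nested_one_of(breadth, depth - 1) for _ in range(breadth)]}


def count_paths(schema):
    """Count leaf branches reachable through nested anyOf/oneOf keywords."""
    if not isinstance(schema, dict):
        return 1  # boolean schemas have no branches
    branches = []
    for keyword in ("anyOf", "oneOf"):
        branches.extend(schema.get(keyword, []))
    if not branches:
        return 1  # a leaf subschema is a single evaluation path
    return sum(count_paths(branch) for branch in branches)


for depth in (1, 2, 3):
    print(depth, count_paths(nested_one_of(5, depth)))
```

With a breadth of 5, the count is 5 raised to the nesting depth, which matches the 25 and 125 figures in the text.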
2054+
2055+
This evaluation explosion is particularly dangerous when each path involves
2056+
expensive work such as collecting large annotations or evaluating complex
2057+
regular expressions. These effects multiply across paths and can result in
2058+
excessive CPU or memory consumption, leading to denial-of-service.
2059+
2060+
Implementations that evaluate untrusted schemas are encouraged to take steps to
2061+
mitigate these threats with measures such as bounding combinator keyword depth
2062+
and breadth, limiting memory used for annotation collection, and guarding
2063+
against resource-intensive validations such as pathological regexes.
2064+
2065+
### Dynamic References
2066+
2067+
The paper ["Validation of Modern JSON Schema: Formalization and
2068+
Complexity" (Attouche et al., 2024)](https://doi.org/10.1145/3632891) has shown
2069+
that validation in the presence of dynamic references is PSPACE-complete. The
2070+
paper describes a method for replacing dynamic references with static ones, but
2071+
doing so can cause the size of the schema to grow exponentially. Implementations
2072+
should be aware of this risk and may wish to implement the method described in
2073+
the paper or impose limits on dynamic reference resolution.
2074+
2075+
### Infinite Loops and Cycles
2076+
2077+
Infinite loops can occur when evaluating schemas that produce cycles during
2078+
reference resolution. These cycles may involve multiple schemas. Not all
2079+
recursive schemas create loops, but implementations are advised to detect and
2080+
break these cycles when they are encountered.
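
One way such a cycle can be detected is to follow a chain of `$ref` keywords and stop when a location repeats. The sketch below is a simplification under stated assumptions: it handles only same-document references written as JSON Pointer fragments (`#` or `#/...`), and the helper names `resolve_pointer` and `has_ref_loop` are hypothetical. It also shows why a recursive schema is not necessarily a loop: recursion through `properties` descends into the instance, while a pure `$ref` chain does not.

```python
def resolve_pointer(root, ref):
    """Resolve a same-document reference such as '#' or '#/$defs/a'."""
    if not ref.startswith("#"):
        raise ValueError("only same-document references are handled in this sketch")
    node = root
    for token in ref[1:].split("/")[1:]:
        # Undo JSON Pointer escaping: '~1' is '/', '~0' is '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def has_ref_loop(root, start="#"):
    """Return True if following $ref from `start` revisits a location."""
    seen = set()
    current = start
    while True:
        if current in seen:
            return True  # the chain returned to a location already visited
        seen.add(current)
        schema = resolve_pointer(root, current)
        ref = schema.get("$ref") if isinstance(schema, dict) else None
        if ref is None:
            return False  # the chain ended at a schema without $ref
        current = ref


looping = {
    "$ref": "#/$defs/a",
    "$defs": {"a": {"$ref": "#/$defs/b"}, "b": {"$ref": "#/$defs/a"}},
}
tree = {"type": "object", "properties": {"child": {"$ref": "#"}}}
print(has_ref_loop(looping), has_ref_loop(tree))
```

A full implementation would more likely track pairs of schema location and instance location during evaluation, so that cycles passing through other applicator keywords are also caught.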
2081+
2082+
### Schema Identity and Collisions
2083+
2084+
Schemas may declare an `$id` to identify themselves or have embedded schemas
2085+
that declare an `$id`. An attacker may attempt to register a schema with an
2086+
`$id` that collides with a previously registered schema, or that differs only by
2087+
case, encoding, or other URI normalization quirks. Such collisions could result
2088+
in overwriting or shadowing of trusted schemas.
2089+
2090+
Implementations should consider rejecting schemas that have identifiers
2091+
(including embedded schema identifiers) that conflict with registered schemas
2092+
and should apply consistent URI normalization and comparison logic to detect and
2093+
prevent conflicts.
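
As a rough illustration of consistent comparison logic, the sketch below normalizes identifiers before checking a registry for conflicts. The names `normalize_id` and `SchemaRegistry` are hypothetical, and the normalization is deliberately partial: it lowercases the scheme and host, drops default ports, and uppercases percent-encoding hex digits, but it ignores userinfo, IPv6 literals, and dot-segment removal, which a production implementation would also need to consider.

```python
import re
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_id(uri):
    """Return a partially normalized form of a schema identifier."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = parts.hostname or ""  # urlsplit lowercases the hostname
    netloc = host
    if parts.port is not None and DEFAULT_PORTS.get(scheme) != parts.port:
        netloc = f"{host}:{parts.port}"  # keep only non-default ports
    # Percent-encoding hex digits are case-insensitive; use uppercase.
    path = re.sub(r"%[0-9a-fA-F]{2}", lambda m: m.group(0).upper(), parts.path)
    if netloc and not path:
        path = "/"  # an empty path is equivalent to "/" when an authority is present
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


class SchemaRegistry:
    """Registry that rejects identifiers colliding after normalization."""

    def __init__(self):
        self._schemas = {}

    def register(self, schema_id, schema):
        key = normalize_id(schema_id)
        if key in self._schemas:
            raise ValueError(f"identifier conflicts with a registered schema: {key}")
        self._schemas[key] = schema
```

For example, registering `https://example.com/s` and then `HTTPS://EXAMPLE.com:443/s` would raise `ValueError` in this sketch, because both normalize to the same key.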
2094+
2095+
### External Schema Resolution
2096+
2097+
JSON Schema implementations are expected to resolve external references using a
2098+
local registry. Although the specification allows for dynamic retrieval
2099+
(`https:` to fetch schemas over HTTP, or `file:` to read schemas from disk),
2100+
this behavior is discouraged unless it's intrinsic to the use case, such as with
2101+
JSON Hyper-Schema.
2102+
2103+
Resolving schemas dynamically introduces several security concerns, each of
2104+
which can be mitigated by limiting or controlling resolution behavior. A tightly
2105+
scoped schema resolution policy significantly reduces the attack surface,
2106+
especially when validating untrusted data.
2107+
2108+
Implementations are advised to disable dynamic retrieval by default and limit
2109+
external schema resolution to the local registry unless dynamic retrieval is
2110+
explicitly enabled. If enabled, they should consider limiting the number of
2111+
dynamic retrievals a validation can perform and defining timeouts on dynamic
2112+
retrievals to reduce the risk of resource exhaustion.
2113+
2114+
#### HTTP(S) Specific Threats
2115+
2116+
Allowing schema references to resolve over HTTP or HTTPS introduces several
2117+
threats:
2118+
2119+
* **Denial of Service (DoS)**: Validation may hang or become slow if a
2120+
referenced schema URL is slow to respond or never returns.
2121+
* **Server-Side Request Forgery (SSRF)**: Malicious schemas can reference
2122+
internal-only services using hostnames such as `localhost` or private IP addresses.
2123+
Implementations are advised to restrict HTTP schema retrieval to a
2124+
configurable allowlist of trusted domains.
2125+
* **Lack of Integrity Guarantees**: Retrieved schemas may be altered in transit
2126+
or change between validations. If network retrieval is allowed,
2127+
implementations are advised to only allow retrieval over HTTPS unless
2128+
specifically configured to allow unsecured transport.
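
The allowlist and HTTPS-only advice above could be enforced with a check like the following before any retrieval is attempted. This is a minimal sketch: `is_retrieval_allowed` and its parameters are hypothetical, the host names are placeholders, and timeouts and retrieval counts would be handled separately by whatever HTTP client the implementation uses.

```python
from urllib.parse import urlsplit


def is_retrieval_allowed(url, allowed_hosts, allow_insecure=False):
    """Decide whether a schema URL may be fetched over the network.

    allowed_hosts: set of lowercase host names trusted for schema retrieval.
    allow_insecure: opt-in for plain HTTP, off by default.
    """
    parts = urlsplit(url)
    allowed_schemes = {"https", "http"} if allow_insecure else {"https"}
    if parts.scheme not in allowed_schemes:
        return False
    host = parts.hostname  # lowercased by urlsplit; None if missing
    # Anything not explicitly trusted is refused, which also covers
    # localhost and private addresses unless someone allowlists them.
    return host is not None and host in allowed_hosts


trusted = {"schemas.example.com"}
print(is_retrieval_allowed("https://schemas.example.com/a.json", trusted))
print(is_retrieval_allowed("https://localhost/admin", trusted))
```

Checking the host name alone does not protect against an allowlisted name that resolves to an internal address, so some deployments may also want to validate the resolved IP address.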
2129+
2130+
#### File System Specific Threats
2131+
2132+
Allowing resolution from the local filesystem (`file:` URIs) raises different
2133+
issues:
2134+
2135+
* **Information Disclosure**: Malicious schemas may access sensitive files on
2136+
the system. Implementations should consider restricting filesystem access to
2137+
a specific schema directory tree.
2138+
* **Cross-Context Access**: A schema fetched over HTTP may try to reference a
2139+
schema on the filesystem. Implementations are advised to allow resolving
2140+
`file:` references only when the referencing schema was itself loaded from the
2141+
filesystem, similar to same-origin policies in web browsers.
2142+
* **Exposing Internal Paths**: Schemas that use `file:` URIs may reveal
2143+
host-specific filesystem details in two ways: through the `$id` itself or
2144+
through schema locations in validation output. Implementations are advised to
2145+
reject `$id` values that use the `file:` scheme. If `file:` URIs are permitted
2146+
internally, implementations are advised to sanitize them (for example, by
2147+
converting them to relative URIs) to avoid exposing host filesystem structure
2148+
to users.
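
The first two points in the list above (a restricted schema directory and a same-origin style rule for `file:` references) could be combined in a single check. The sketch below assumes POSIX-style paths and Python 3.9 or later for `Path.is_relative_to`; the function name `is_file_ref_allowed` and the directory `/srv/schemas` are hypothetical.

```python
from pathlib import Path
from urllib.parse import unquote, urlsplit


def is_file_ref_allowed(target_uri, referrer_uri, schema_root):
    """Allow a file: reference only from a file: schema and only inside schema_root."""
    target = urlsplit(target_uri)
    if target.scheme != "file":
        return False
    # Cross-context rule: only schemas loaded from disk may reference disk.
    if urlsplit(referrer_uri).scheme != "file":
        return False
    # Resolve symlinks and ".." before comparing, so traversal cannot escape the root.
    resolved = Path(unquote(target.path)).resolve()
    return resolved.is_relative_to(Path(schema_root).resolve())


root = Path("/srv/schemas")
print(is_file_ref_allowed("file:///srv/schemas/common/a.json",
                          "file:///srv/schemas/root.json", root))
print(is_file_ref_allowed("file:///srv/schemas/../../etc/passwd",
                          "file:///srv/schemas/root.json", root))
```

Resolving the path before the containment check is the important design choice here; comparing raw strings would let `..` segments or symbolic links point outside the schema directory.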
2149+
2150+
### Vocabulary-Specific Risks
2151+
2152+
Third-party JSON Schema vocabularies may introduce additional risks.
2153+
Implementers are advised to consult the specifications of any extensions they
2154+
support and take into account their security considerations as well.
20582155

20592156
## IANA Considerations
20602157
