@@ -2029,32 +2029,129 @@ SHOULD use the terms defined by this document to do so.
## Security Considerations {#security}

- Both schemas and instances are JSON values. As such, all security considerations
- defined in [RFC 8259][rfc8259] apply.
-
- Instances and schemas are both frequently written by untrusted third parties, to
- be deployed on public Internet servers. Implementations should take care that
- the parsing and evaluating against schemas does not consume excessive system
- resources. Implementations MUST NOT fall into an infinite loop.
-
- A malicious party could cause an implementation to repeatedly collect a copy of
- a very large value as an annotation. Implementations SHOULD guard against
- excessive consumption of system resources in such a scenario.
-
- Servers MUST ensure that malicious parties cannot change the functionality of
- existing schemas by uploading a schema with a pre-existing or very similar
- `$id`.
-
- Individual JSON Schema extensions are liable to also have their own security
- considerations. Consult the respective specifications for more information.
-
- Schema authors should take care with `$comment` contents, as a malicious
- implementation can display them to end-users in violation of a spec, or fail to
- strip them if such behavior is expected.
-
- A malicious schema author could place executable code or other dangerous
- material within a `$comment`. Implementations MUST NOT parse or otherwise take
- action based on `$comment` contents.
+ While schemas and instances are not always represented as JSON text, they are
+ defined in terms of the JSON data model. As such, the security considerations
+ defined in [RFC 8259][rfc8259] may still apply in environments where text-based
+ representations are used, particularly those considerations related to parsing,
+ number precision, and structural limitations.
+
+ Schemas and instances are frequently authored by untrusted parties.
+ Implementations that accept or evaluate such inputs may be exposed to several
+ classes of attack, particularly denial-of-service (DoS) by means of resource
+ exhaustion.
+
+ ### Nested `anyOf`/`oneOf`
+
+ One risk for resource exhaustion in JSON Schema arises from the nested use of
+ `anyOf` and `oneOf`. While a single combinator keyword with multiple subschemas
+ is typically manageable, nesting them causes the number of evaluation paths to
+ grow exponentially with the nesting depth.
+
+ For example, a `oneOf` with 5 subschemas, each containing another `oneOf` with 5
+ options, results in 25 evaluation paths. Adding a third level increases this to
+ 125, and so on. Attackers can exploit this by crafting schemas that force
+ validators to explore a large number of branches.
+
+ This evaluation explosion is particularly dangerous when each path involves
+ expensive work such as collecting large annotations or evaluating complex
+ regular expressions. These effects multiply across paths and can result in
+ excessive CPU or memory consumption, leading to denial of service.
+
+ Implementations that evaluate untrusted schemas are encouraged to take steps to
+ mitigate these threats with measures such as bounding combinator keyword depth
+ and breadth, limiting memory used for annotation collection, and guarding
+ against resource-intensive validations such as pathological regexes.
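
The growth described above (5, 25, 125, ...) can be illustrated with a minimal Python sketch. The names `count_paths`, `nested_one_of`, and `within_budget`, as well as the budget of 1000 paths, are hypothetical and not part of JSON Schema or of any particular implementation. The sketch only follows `anyOf`/`oneOf` nested directly inside one another and ignores references and other applicators.

```python
from typing import Any

COMBINATORS = ("anyOf", "oneOf")


def count_paths(schema: Any) -> int:
    """Upper bound on evaluation paths through directly nested anyOf/oneOf.

    Simplification (assumption of this sketch): references and all other
    applicator keywords are ignored.
    """
    if not isinstance(schema, dict):
        return 1  # boolean schemas contribute a single path
    total = 1
    for keyword in COMBINATORS:
        branches = schema.get(keyword)
        if isinstance(branches, list) and branches:
            # Each branch may itself fan out, so branch counts are summed,
            # and independent combinators on one schema multiply.
            total *= sum(count_paths(branch) for branch in branches)
    return total


def nested_one_of(breadth: int, depth: int) -> dict:
    """Build a oneOf tree with the given breadth and nesting depth."""
    if depth == 0:
        return {"type": "string"}
    return {"oneOf": [nested_one_of(breadth, depth - 1) for _ in range(breadth)]}


def within_budget(schema: Any, max_paths: int = 1000) -> bool:
    """Return True if the schema stays within a hypothetical path budget."""
    return count_paths(schema) <= max_paths
```

Under these assumptions, `count_paths(nested_one_of(5, 2))` gives 25 and `count_paths(nested_one_of(5, 3))` gives 125, matching the figures above. An implementation might run a check like `within_budget` before evaluating an untrusted schema, although a static count of this kind is only one of several possible bounding strategies.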
+
+ ### Dynamic References
+
+ The paper ["The Complexity of JSON Schema: Undecidable, Expensive, Yet
+ Tractable" (Caroni et al., 2024)](https://doi.org/10.1145/3632891) has shown
+ that validation in the presence of dynamic references is PSPACE-complete. The
+ paper describes a method for replacing dynamic references with static ones, but
+ doing so can cause the size of the schema to grow exponentially. Implementations
+ should be aware of this risk and may wish to implement the method described in
+ the paper or impose limits on dynamic reference resolution.
+
+ ### Infinite Loops and Cycles
+
+ Infinite loops can occur when evaluating schemas that produce cycles during
+ reference resolution. These cycles may involve multiple schemas. Not all
+ recursive schemas create loops, but implementations are advised to detect and
+ break these cycles when they are encountered.
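
One way to detect the simplest kind of cycle is to follow a chain of `$ref` keywords and stop when a reference is seen twice. The Python sketch below is illustrative only: `resolve_pointer` and `has_ref_loop` are hypothetical helpers, only same-document references of the form `#` or `#/...` are supported, and percent-decoding of the fragment is omitted. A recursive schema whose reference sits under a keyword such as `items` is not reported, because evaluation then moves to a different instance location.

```python
from typing import Any


def resolve_pointer(root: Any, ref: str) -> Any:
    """Resolve a same-document reference such as '#/$defs/a' (simplified)."""
    if ref == "#":
        return root
    if not ref.startswith("#/"):
        raise ValueError(f"unsupported reference in this sketch: {ref}")
    node = root
    for token in ref[2:].split("/"):
        # Undo JSON Pointer escaping (~1 is '/', ~0 is '~').
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def has_ref_loop(root: Any, start: str = "#") -> bool:
    """Return True if following $ref from `start` revisits a reference."""
    seen = set()
    ref = start
    while True:
        if ref in seen:
            return True  # the chain came back to a schema already visited
        seen.add(ref)
        node = resolve_pointer(root, ref)
        if not isinstance(node, dict) or "$ref" not in node:
            return False  # the chain ends at a schema without $ref
        ref = node["$ref"]
```

For example, a schema whose `$defs/a` refers to `$defs/b` and whose `$defs/b` refers back to `$defs/a` is reported as a loop, while a tree-shaped schema that refers to `#` from within `items` is not. A full evaluator would more likely track pairs of schema location and instance location during evaluation; this sketch only covers chains of adjacent references.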
+
+ ### Schema Identity and Collisions
+
+ Schemas may declare an `$id` to identify themselves or have embedded schemas
+ that declare an `$id`. An attacker may attempt to register a schema with an
+ `$id` that collides with a previously registered schema, or that differs only by
+ case, encoding, or other URI normalization quirks. Such collisions could result
+ in overwriting or shadowing of trusted schemas.
+
+ Implementations should consider rejecting schemas that have identifiers
+ (including embedded schema identifiers) that conflict with registered schemas
+ and should apply consistent URI normalization and comparison logic to detect and
+ prevent conflicts.
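
A minimal sketch of such normalization in Python follows, using only the standard library. `normalize_id` and `SchemaRegistry` are hypothetical names, and the normalization covers only a subset of what RFC 3986 describes: scheme and host case, default ports, empty fragments, and the case of percent-encoding hex digits. Dot-segment removal, user information, IPv6 literals, and embedded `$id` values are not handled here.

```python
import re
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_id(uri: str) -> str:
    """Normalize an $id for comparison (partial, illustrative only)."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = parts.hostname or ""  # urlsplit lowercases the hostname
    netloc = host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"  # keep only non-default ports
    # Uppercase the hex digits of percent-encoded octets (%2f becomes %2F).
    path = re.sub(r"%[0-9a-fA-F]{2}", lambda m: m.group(0).upper(), parts.path)
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


class SchemaRegistry:
    """Registry that rejects schemas whose normalized $id is already taken."""

    def __init__(self) -> None:
        self._schemas: dict[str, dict] = {}

    def register(self, schema: dict) -> None:
        key = normalize_id(schema["$id"])
        if key in self._schemas:
            raise ValueError(f"$id collides with a registered schema: {key}")
        self._schemas[key] = schema
```

With this sketch, `HTTPS://Example.COM:443/schemas/a%2fb.json#` and `https://example.com/schemas/a%2Fb.json` map to the same key, so registering the second after the first raises `ValueError` instead of shadowing the existing schema.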
+
+ ### External Schema Resolution
+
+ JSON Schema implementations are expected to resolve external references using a
+ local registry. Although the specification allows for dynamic retrieval
+ (for example, `https:` to fetch schemas over HTTP, or `file:` to read schemas
+ from disk), this behavior is discouraged unless it is intrinsic to the use case,
+ such as with JSON Hyper-Schema.
+
+ Resolving schemas dynamically introduces several security concerns, each of
+ which can be mitigated by limiting or controlling resolution behavior. A tightly
+ scoped schema resolution policy significantly reduces the attack surface,
+ especially when validating untrusted data.
+
+ Implementations are advised to disable dynamic retrieval by default and limit
+ external schema resolution to the local registry unless dynamic retrieval is
+ explicitly enabled. If enabled, they should consider limiting the number of
+ dynamic retrievals a validation can perform and defining timeouts on dynamic
+ retrievals to reduce the risk of resource exhaustion.
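
The policy described above could be sketched in Python as follows. `SchemaResolver` and its parameters (`allow_retrieval`, `max_retrievals`, `timeout_seconds`) are hypothetical names chosen for illustration, and the default values are arbitrary. The sketch uses `urllib.request.urlopen` from the standard library and does not filter URI schemes or hosts, which a real policy would also need to do.

```python
import json
from urllib.request import urlopen


class SchemaResolver:
    """Resolve schemas from a local registry, with opt-in dynamic retrieval."""

    def __init__(
        self,
        registry: dict,
        allow_retrieval: bool = False,  # dynamic retrieval is off by default
        max_retrievals: int = 10,       # hypothetical per-validation budget
        timeout_seconds: float = 5.0,   # hypothetical per-request timeout
    ) -> None:
        self.registry = dict(registry)
        self.allow_retrieval = allow_retrieval
        self.max_retrievals = max_retrievals
        self.timeout_seconds = timeout_seconds
        self.retrievals = 0

    def resolve(self, uri: str) -> dict:
        if uri in self.registry:
            return self.registry[uri]
        if not self.allow_retrieval:
            raise LookupError(f"schema not in local registry: {uri}")
        if self.retrievals >= self.max_retrievals:
            raise RuntimeError("dynamic retrieval budget exhausted")
        self.retrievals += 1
        # Assumption: no scheme or host filtering is applied in this sketch.
        with urlopen(uri, timeout=self.timeout_seconds) as response:
            schema = json.load(response)
        self.registry[uri] = schema  # cache so repeated refs cost one fetch
        return schema
```

With the defaults, a reference to a schema that is not in the registry fails with `LookupError` and no network access takes place. Counting retrievals on the resolver instance assumes that one resolver is created per validation; other designs might track the budget elsewhere.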
+
+ #### HTTP(S) Specific Threats
+
+ Allowing schema references to resolve over HTTP or HTTPS introduces several
+ threats:
+
+ * **Denial of Service (DoS)**: Validation may hang or become slow if a
+   referenced schema URL is slow to respond or never returns.
+ * **Server-Side Request Forgery (SSRF)**: Malicious schemas can reference
+   internal-only services using hostnames like localhost or private IPs.
+   Implementations are advised to restrict HTTP schema retrieval to a
+   configurable allowlist of trusted domains.
+ * **Lack of Integrity Guarantees**: Retrieved schemas may be altered in transit
+   or change between validations. If network retrieval is allowed,
+   implementations are advised to only allow retrieval over HTTPS unless
+   specifically configured to allow unsecured transport.
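
The allowlist and HTTPS-only advice in the list above can be sketched as a small Python check. The function name `is_retrieval_allowed` and its parameters are hypothetical. The check looks only at the URL as written, so it does not address redirects or DNS answers that point an allowed hostname at an internal address.

```python
from urllib.parse import urlsplit


def is_retrieval_allowed(
    url: str,
    allowed_hosts: set,
    allow_insecure: bool = False,
) -> bool:
    """Return True if `url` passes a simple scheme and host allowlist."""
    parts = urlsplit(url)
    # Plain HTTP is accepted only when explicitly configured.
    allowed_schemes = {"https", "http"} if allow_insecure else {"https"}
    if parts.scheme not in allowed_schemes:
        return False
    # urlsplit lowercases the hostname; entries in allowed_hosts are assumed
    # to be lowercase as well.
    return parts.hostname is not None and parts.hostname in allowed_hosts
```

With an allowlist of `{"json-schema.org"}`, a reference to `https://json-schema.org/draft/2020-12/schema` is accepted, while `http://localhost:8080/admin` and `https://169.254.169.254/latest/meta-data` are rejected because their hosts are not listed.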
+
+ #### File System Specific Threats
+
+ Allowing resolution from the local filesystem (`file:` URIs) raises different
+ issues:
+
+ * **Information Disclosure**: Malicious schemas may access sensitive files on
+   the system. Implementations should consider restricting filesystem access to
+   a specific schema directory tree.
+ * **Cross-Context Access**: A schema fetched from HTTP may try to reference a
+   schema on the filesystem. Implementations are advised to allow resolving
+   `file:` references only when the referencing schema was itself loaded from the
+   file system, similar to same-origin policies in web browsers.
+ * **Exposing Internal Paths**: Schemas that use `file:` URIs may reveal
+   host-specific filesystem details in two ways: through the `$id` itself or
+   through schema locations in validation output. Implementations are advised to
+   reject `$id` values that use the `file:` scheme. If `file:` URIs are permitted
+   internally, implementations are advised to sanitize them (for example, by
+   converting them to relative URIs) to avoid exposing host filesystem structure
+   to users.
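
The first two items in the list above (confining access to a schema directory tree and refusing `file:` references from schemas that were not themselves loaded from the file system) might be combined as in the Python sketch below. `is_file_reference_allowed` is a hypothetical helper, POSIX-style paths are assumed, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path
from urllib.parse import unquote, urlsplit


def is_file_reference_allowed(
    referrer_uri: str,
    target_uri: str,
    schema_root: Path,
) -> bool:
    """Decide whether a schema at `referrer_uri` may load `target_uri`."""
    target = urlsplit(target_uri)
    if target.scheme != "file":
        return True  # non-file targets are governed by other policies
    if urlsplit(referrer_uri).scheme != "file":
        return False  # e.g. a schema fetched over HTTP reaching into disk
    # Decode percent-escapes, then resolve '..' segments and symlinks before
    # checking containment, so encoded traversal is also caught.
    candidate = Path(unquote(target.path)).resolve()
    return candidate.is_relative_to(schema_root.resolve())
```

Assuming a schema root of `/srv/schemas`, a reference from `file:///srv/schemas/a.json` to `file:///srv/schemas/defs/b.json` is allowed, whereas a reference to `file:///srv/schemas/../../etc/passwd`, or any `file:` reference from an `https:` schema, is refused.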
+
+ ### Vocabulary-Specific Risks
+
+ Third-party JSON Schema vocabularies may introduce additional risks.
+ Implementers are advised to consult the specifications of any extensions they
+ support and take into account their security considerations as well.
## IANA Considerations