Commit daabb88

Merge pull request Tencent#1068 from yurikhan/violationDetails
Schema violation details
2 parents 7641af6 + b1e556d commit daabb88

4 files changed: +1576 -207 lines changed

doc/schema.md

Lines changed: 274 additions & 6 deletions
@@ -8,7 +8,7 @@ RapidJSON implemented a JSON Schema validator for [JSON Schema Draft v4](http://

 [TOC]

-# Basic Usage {#BasicUsage}
+# Basic Usage {#Basic}

 First of all, you need to parse a JSON Schema into `Document`, and then compile the `Document` into a `SchemaDocument`.

@@ -52,11 +52,11 @@ Some notes:
 * One `SchemaDocument` can be referenced by multiple `SchemaValidator`s. It will not be modified by `SchemaValidator`s.
 * A `SchemaValidator` may be reused to validate multiple documents. To run it for other documents, call `validator.Reset()` first.

-# Validation during parsing/serialization {#ParsingSerialization}
+# Validation during parsing/serialization {#Fused}

 Unlike most JSON Schema validator implementations, RapidJSON provides a SAX-based schema validator. Therefore, you can parse JSON from a stream while validating it on the fly. If the validator encounters a JSON value that violates the supplied schema, parsing is terminated immediately. This design is especially useful for parsing large JSON files.

-## DOM parsing {#DomParsing}
+## DOM parsing {#DOM}

 To use DOM parsing, `Document` needs some preparation and finalizing tasks in addition to receiving SAX events, so some work is needed to route events between the reader, the validator and the document. `SchemaValidatingReader` is a helper class that does this work.

@@ -97,7 +97,7 @@ if (!reader.GetParseResult()) {
 }
 ~~~

-## SAX parsing {#SaxParsing}
+## SAX parsing {#SAX}

 SAX parsing is much simpler. If the application only needs to validate the JSON without further processing, it is simply:

@@ -144,7 +144,7 @@ if (!d.Accept(validator)) {

 Of course, if your application only needs SAX-style serialization, it can simply send SAX events to `SchemaValidator` instead of `Writer`.

-# Remote Schema {#RemoteSchema}
+# Remote Schema {#Remote}

 JSON Schema supports the [`$ref` keyword](http://spacetelescope.github.io/understanding-json-schema/structuring.html), which is a [JSON pointer](doc/pointer.md) referencing a local or remote schema. A local pointer is prefixed with `#`, while a remote pointer is a relative or absolute URI. For example:

@@ -176,7 +176,7 @@ The failed test is "changed scope ref invalid" of "change resolution scope" in `

 Besides, the `format` schema keyword for string values is ignored, since it is not required by the specification.

-## Regular Expression {#RegEx}
+## Regular Expression {#Regex}

 The schema keywords `pattern` and `patternProperties` use regular expressions to match the required pattern.

@@ -235,3 +235,271 @@ On a Mac Book Pro (2.8 GHz Intel Core i7), the following results are collected.
 |[`jayschema`](https://github.com/natesilva/jayschema)|0.1%|21 (± 1.14%)|

 That is, RapidJSON is about 1.5x faster than the fastest JavaScript library (ajv), and 1400x faster than the slowest one.
+
+# Schema violation reporting {#Reporting}
+
+(Unreleased as of 2017-09-20)
+
+When validating an instance against a JSON Schema,
+it is often desirable to report not only whether the instance is valid,
+but also the ways in which it violates the schema.
+
+The `SchemaValidator` class
+collects errors encountered during validation
+into a JSON `Value`.
+This error object can then be accessed as `validator.GetError()`.
+
+The structure of the error object is subject to change
+in future versions of RapidJSON,
+as there is no standard schema for violations.
+The details below this point are provisional only.
+
+## General provisions {#ReportingGeneral}
+
+Validation of an instance value against a schema
+produces an error value.
+The error value is always an object.
+An empty object `{}` indicates the instance is valid.
+
+* The name of each member
+  corresponds to the JSON Schema keyword that is violated.
+* The value is either an object describing a single violation,
+  or an array of such objects.
+
+Each violation object contains two string-valued members
+named `instanceRef` and `schemaRef`.
+`instanceRef` contains the URI fragment serialization
+of a JSON Pointer to the instance subobject
+in which the violation was detected.
+`schemaRef` contains the URI of the schema
+and the fragment serialization of a JSON Pointer
+to the subschema that was violated.
+
+Individual violation objects can contain other keyword-specific members.
+These are detailed in the sections below.
+
+For example, validating this instance:
+
+~~~json
+{"numbers": [1, 2, "3", 4, 5]}
+~~~
+
+against this schema:
+
+~~~json
+{
+  "type": "object",
+  "properties": {
+    "numbers": {"$ref": "numbers.schema.json"}
+  }
+}
+~~~
+
+where `numbers.schema.json` refers
+(via a suitable `IRemoteSchemaDocumentProvider`)
+to this schema:
+
+~~~json
+{
+  "type": "array",
+  "items": {"type": "number"}
+}
+~~~
+
+produces the following error object:
+
+~~~json
+{
+  "type": {
+    "instanceRef": "#/numbers/2",
+    "schemaRef": "numbers.schema.json#/items",
+    "expected": ["number"],
+    "actual": "string"
+  }
+}
+~~~
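
To make the `instanceRef` format concrete, the sketch below builds the URI fragment form of a JSON Pointer from its reference tokens. It applies the RFC 6901 escapes (`~0` for `~`, `~1` for `/`) and percent-encodes every byte outside the RFC 3986 unreserved set. The helper name `toUriFragment` is hypothetical and the code does not use RapidJSON; the library's own serializer may make different choices about which characters to percent-encode.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// True for characters left unescaped in the fragment (RFC 3986 "unreserved").
// Assumption: everything else is percent-encoded.
static bool isUnreserved(unsigned char c) {
    return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z') || c == '-' || c == '.' || c == '_' || c == '~';
}

// Hypothetical helper: serializes JSON Pointer reference tokens as a URI fragment.
std::string toUriFragment(const std::vector<std::string>& tokens) {
    static const char kHex[] = "0123456789ABCDEF";
    std::string out = "#";
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        out += '/';
        for (std::size_t j = 0; j < tokens[i].size(); ++j) {
            unsigned char c = static_cast<unsigned char>(tokens[i][j]);
            if (c == '~') {
                out += "~0";                    // RFC 6901 escape for '~'
            } else if (c == '/') {
                out += "~1";                    // RFC 6901 escape for '/'
            } else if (isUnreserved(c)) {
                out += static_cast<char>(c);
            } else {
                out += '%';                     // percent-encode the raw byte
                out += kHex[c >> 4];
                out += kHex[c & 15];
            }
        }
    }
    return out;
}
```

With the tokens `numbers` and `2`, this yields `#/numbers/2`, the same form as the `instanceRef` in the error object above.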
+
+## Validation keywords for numbers {#Numbers}
+
+### multipleOf {#multipleof}
+
+* `expected`: required number strictly greater than 0.
+  The value of the `multipleOf` keyword specified in the schema.
+* `actual`: required number.
+  The instance value.
+
+### maximum {#maximum}
+
+* `expected`: required number.
+  The value of the `maximum` keyword specified in the schema.
+* `exclusiveMaximum`: optional boolean.
+  This will be true if the schema specified `"exclusiveMaximum": true`,
+  and will be omitted otherwise.
+* `actual`: required number.
+  The instance value.
+
+### minimum {#minimum}
+
+* `expected`: required number.
+  The value of the `minimum` keyword specified in the schema.
+* `exclusiveMinimum`: optional boolean.
+  This will be true if the schema specified `"exclusiveMinimum": true`,
+  and will be omitted otherwise.
+* `actual`: required number.
+  The instance value.
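
The `maximum` and `minimum` reports carry everything needed to re-derive the failed comparison. A minimal sketch of the Draft 04 rule, with `actual`, `expected` and the exclusive flag passed as plain arguments, might look like this. The function names are hypothetical, and the code compares doubles only, which may differ from how RapidJSON handles large integers.

```cpp
// Draft 04 rule: with "exclusiveMaximum": true the instance must be strictly
// less than `maximum`; otherwise it may also be equal to it.
bool violatesMaximum(double actual, double maximum, bool exclusiveMaximum) {
    return exclusiveMaximum ? actual >= maximum : actual > maximum;
}

// Mirror image for `minimum` and `exclusiveMinimum`.
bool violatesMinimum(double actual, double minimum, bool exclusiveMinimum) {
    return exclusiveMinimum ? actual <= minimum : actual < minimum;
}
```

When the `exclusiveMaximum` (or `exclusiveMinimum`) member is absent from a violation object, the flag would be passed as `false`.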
+
+## Validation keywords for strings {#Strings}
+
+### maxLength {#maxLength}
+
+* `expected`: required number greater than or equal to 0.
+  The value of the `maxLength` keyword specified in the schema.
+* `actual`: required string.
+  The instance value.
+
+### minLength {#minLength}
+
+* `expected`: required number greater than or equal to 0.
+  The value of the `minLength` keyword specified in the schema.
+* `actual`: required string.
+  The instance value.
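
Because `actual` is the string itself rather than its length, a consumer that wants to show "length N, expected at most M" has to measure it. JSON Schema defines string length in characters rather than bytes, so the sketch below counts Unicode code points in a UTF-8 string. The helper name `countCodePoints` is hypothetical, and the code assumes the input is valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 string by skipping continuation
// bytes (those of the form 10xxxxxx). Assumes the input is valid UTF-8.
std::size_t countCodePoints(const std::string& utf8) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        if ((static_cast<unsigned char>(utf8[i]) & 0xC0) != 0x80) ++n;
    }
    return n;
}
```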
+
+### pattern {#pattern}
+
+* `actual`: required string.
+  The instance value.
+
+(The expected pattern is not reported
+because the internal representation in `SchemaDocument`
+does not store the pattern in original string form.)
+
+## Validation keywords for arrays {#Arrays}
+
+### additionalItems {#additionalItems}
+
+This keyword is reported
+when the value of `items` schema keyword is an array,
+the value of `additionalItems` is `false`,
+and the instance is an array
+with more items than specified in the `items` array.
+
+* `disallowed`: required integer greater than or equal to 0.
+  The index of the first item that has no corresponding schema.
+
+### maxItems and minItems {#maxItems-minItems}
+
+* `expected`: required integer greater than or equal to 0.
+  The value of `maxItems` (respectively, `minItems`)
+  specified in the schema.
+* `actual`: required integer greater than or equal to 0.
+  Number of items in the instance array.
+
+### uniqueItems {#uniqueItems}
+
+* `duplicates`: required array
+  whose items are integers greater than or equal to 0.
+  Indices of items of the instance that are equal.
+
+(RapidJSON only reports the first two equal items,
+for performance reasons.)
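
As an illustration of what a two-element `duplicates` array represents, the sketch below scans an array and stops at the first pair of equal items. It is not RapidJSON's implementation: the function name is hypothetical, items are plain `int`s for brevity, and the scan order (each item compared against all earlier ones) is an assumption.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Returns the indices (i, j), i < j, of the first pair of equal items found,
// or (-1, -1) if all items are distinct.
std::pair<long, long> firstDuplicate(const std::vector<int>& items) {
    for (std::size_t j = 1; j < items.size(); ++j) {
        for (std::size_t i = 0; i < j; ++i) {
            if (items[i] == items[j]) {
                return std::make_pair(static_cast<long>(i), static_cast<long>(j));
            }
        }
    }
    return std::make_pair(-1L, -1L);
}
```

Stopping at the first match keeps the common case cheap, at the cost of not listing every repeated item.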
+
+## Validation keywords for objects
+
+### maxProperties and minProperties {#maxProperties-minProperties}
+
+* `expected`: required integer greater than or equal to 0.
+  The value of `maxProperties` (respectively, `minProperties`)
+  specified in the schema.
+* `actual`: required integer greater than or equal to 0.
+  Number of properties in the instance object.
+
+### required {#required}
+
+* `missing`: required array of one or more unique strings.
+  The names of properties
+  that are listed in the value of the `required` schema keyword
+  but not present in the instance object.
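
The `missing` array is a set difference between the `required` list and the instance's property names. A minimal sketch, using standard containers instead of RapidJSON values and a hypothetical function name:

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Lists the names from `required` that do not occur in `present`,
// preserving the order in which the schema listed them.
std::vector<std::string> missingRequired(const std::vector<std::string>& required,
                                         const std::set<std::string>& present) {
    std::vector<std::string> missing;
    for (std::size_t i = 0; i < required.size(); ++i) {
        if (present.count(required[i]) == 0) missing.push_back(required[i]);
    }
    return missing;
}
```

An empty result corresponds to the case where the `required` keyword is not reported at all.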
+
+### additionalProperties {#additionalProperties}
+
+This keyword is reported
+when the schema specifies `"additionalProperties": false`
+and the name of a property of the instance is
+neither listed in the `properties` keyword
+nor matched by any regular expression in the `patternProperties` keyword.
+
+* `disallowed`: required string.
+  Name of the offending property of the instance.
+
+(For performance reasons,
+RapidJSON only reports the first such property encountered.)
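
The rule above can be sketched as a loop over the instance's property names that stops at the first offender. This version uses `std::regex` with its default ECMAScript grammar as a stand-in for the schema's regular expressions; RapidJSON's own regex engine may accept a different syntax, and the function name is hypothetical.

```cpp
#include <cstddef>
#include <regex>
#include <set>
#include <string>
#include <vector>

// Finds the first instance property name that is neither declared in
// `properties` nor matched by any `patternProperties` regex.
// Returns true and stores the name in *disallowed if one is found.
bool findDisallowed(const std::vector<std::string>& instanceNames,
                    const std::set<std::string>& properties,
                    const std::vector<std::string>& patterns,
                    std::string* disallowed) {
    for (std::size_t i = 0; i < instanceNames.size(); ++i) {
        const std::string& name = instanceNames[i];
        if (properties.count(name) != 0) continue;   // declared in `properties`
        bool matched = false;
        for (std::size_t p = 0; p < patterns.size() && !matched; ++p) {
            // Schema patterns are unanchored, hence regex_search.
            matched = std::regex_search(name, std::regex(patterns[p]));
        }
        if (!matched) {
            *disallowed = name;
            return true;
        }
    }
    return false;
}
```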
+
+### dependencies {#dependencies}
+
+* `errors`: required object with one or more properties.
+  Names and values of its properties are described below.
+
+Recall that JSON Schema Draft 04 supports
+*schema dependencies*,
+where presence of a named *controlling* property
+requires the instance object to be valid against a subschema,
+and *property dependencies*,
+where presence of a controlling property
+requires other *dependent* properties to also be present.
+
+For a violated schema dependency,
+`errors` will contain a property
+with the name of the controlling property
+and its value will be the error object
+produced by validating the instance object
+against the dependent schema.
+
+For a violated property dependency,
+`errors` will contain a property
+with the name of the controlling property
+and its value will be an array of one or more unique strings
+listing the missing dependent properties.
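
For property dependencies, the shape of `errors` can be modeled as a map from each controlling property to its missing dependents. The sketch below covers only that case (schema dependencies are left out), uses standard containers, and introduces the hypothetical names `Dependencies` and `missingDependents`.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

typedef std::map<std::string, std::vector<std::string> > Dependencies;

// For every controlling property that is present, lists the dependent
// properties that are absent. Controlling properties with nothing missing
// are left out, so an empty result means no property dependency is violated.
Dependencies missingDependents(const Dependencies& dependencies,
                               const std::set<std::string>& present) {
    Dependencies errors;
    for (Dependencies::const_iterator it = dependencies.begin();
         it != dependencies.end(); ++it) {
        if (present.count(it->first) == 0) continue;  // controlling property absent
        for (std::size_t i = 0; i < it->second.size(); ++i) {
            if (present.count(it->second[i]) == 0) {
                errors[it->first].push_back(it->second[i]);
            }
        }
    }
    return errors;
}
```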
+
+## Validation keywords for any instance type {#AnyTypes}
+
+### enum {#enum}
+
+This keyword has no additional properties
+beyond `instanceRef` and `schemaRef`.
+
+* The allowed values are not listed
+  because `SchemaDocument` does not store them in original form.
+* The violating value is not reported
+  because it might be unwieldy.
+
+If you need to report these details to your users,
+you can access the necessary information
+by following `instanceRef` and `schemaRef`.
+
+### type {#type}
+
+* `expected`: required array of one or more unique strings,
+  each of which is one of the seven primitive types
+  defined by the JSON Schema Draft 04 Core specification.
+  Lists the types allowed by the `type` schema keyword.
+* `actual`: required string, also one of the seven primitive types.
+  The primitive type of the instance.
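
When interpreting a `type` violation, note that `expected` and `actual` cannot always be compared by string equality alone: in Draft 04 an integer instance also satisfies `"number"`. The sketch below encodes that rule. It assumes `actual` may be reported as `"integer"` for integral values, and the function name is hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// True if an instance whose primitive type is `actual` satisfies a `type`
// keyword listing `expected`. An "integer" instance also satisfies "number".
bool typeAllowed(const std::vector<std::string>& expected, const std::string& actual) {
    if (std::find(expected.begin(), expected.end(), actual) != expected.end()) {
        return true;
    }
    return actual == "integer" &&
           std::find(expected.begin(), expected.end(), std::string("number")) != expected.end();
}
```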
+
+### allOf, anyOf, and oneOf {#allOf-anyOf-oneOf}
+
+* `errors`: required array of at least one object.
+  There will be as many items as there are subschemas
+  in the `allOf`, `anyOf` or `oneOf` schema keyword, respectively.
+  Each item will be the error value
+  produced by validating the instance
+  against the corresponding subschema.
+
+For `allOf`, at least one error value will be non-empty.
+For `anyOf`, all error values will be non-empty.
+For `oneOf`, either all error values will be non-empty,
+or more than one will be empty.
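
These three conditions follow from counting how many entries of `errors` are the empty object. A minimal sketch, where each entry is reduced to a boolean "is empty" flag and the function name is hypothetical:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// `isEmpty[i]` is true when the i-th entry of `errors` is the empty object {},
// i.e. the instance is valid against the i-th subschema.
// Any keyword other than "allOf" or "anyOf" is treated as "oneOf".
bool combinatorViolated(const std::string& keyword, const std::vector<bool>& isEmpty) {
    std::size_t valid = 0;
    for (std::size_t i = 0; i < isEmpty.size(); ++i) {
        if (isEmpty[i]) ++valid;
    }
    if (keyword == "allOf") return valid != isEmpty.size();  // some subschema failed
    if (keyword == "anyOf") return valid == 0;               // every subschema failed
    return valid != 1;                                       // oneOf: exactly one must pass
}
```

For `oneOf`, `valid != 1` covers both reported cases: no subschema matched, or more than one did.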
+
+### not {#not}
+
+This keyword has no additional properties
+apart from `instanceRef` and `schemaRef`.

example/schemavalidator/schemavalidator.cpp

Lines changed: 6 additions & 0 deletions
@@ -6,6 +6,7 @@
 #include "rapidjson/filereadstream.h"
 #include "rapidjson/schema.h"
 #include "rapidjson/stringbuffer.h"
+#include "rapidjson/prettywriter.h"

 using namespace rapidjson;

@@ -67,6 +68,11 @@ int main(int argc, char *argv[]) {
         sb.Clear();
         validator.GetInvalidDocumentPointer().StringifyUriFragment(sb);
         fprintf(stderr, "Invalid document: %s\n", sb.GetString());
+        // Detailed violation report is available as a JSON value
+        sb.Clear();
+        PrettyWriter<StringBuffer> w(sb);
+        validator.GetError().Accept(w);
+        fprintf(stderr, "Error report:\n%s\n", sb.GetString());
         return EXIT_FAILURE;
     }
 }
