Commit 9574916

feat(parse): introduce advanced parsing (#20)
BREAKING CHANGE: ParseResult shape has changed
1 parent d81cbee commit 9574916

File tree

23 files changed

+518
-265
lines changed

README.md

Lines changed: 73 additions & 84 deletions
Original file line number | Diff line number | Diff line change
@@ -29,6 +29,12 @@
2929
- [Installation](#installation)
3030
- [Usage](#usage)
3131
- [Parsing](#parsing)
32+
- [Translators](#translators)
33+
- [CST](#cst-translator)
34+
- [AST](#ast-translator)
35+
- [XML](#xml-translator)
36+
- [Statistics](#statistics)
37+
- [Tracing](#tracing)
3238
- [Validation](#validation)
3339
- [Escaping](#escaping)
3440
- [Evaluation](#evaluation)
@@ -81,95 +87,96 @@ const parseResult = parse('/foo/bar');
8187

8288
```
8389
{
84-
result: {
85-
success: true,
86-
state: 101,
87-
stateName: 'MATCH',
88-
length: 8,
89-
matched: 8,
90-
maxMatched: 8,
91-
maxTreeDepth: 8,
92-
nodeHits: 49
93-
},
94-
ast: fnast {
95-
callbacks: [
96-
'json-pointer': [Function: jsonPointer],
97-
'reference-token': [Function: referenceToken]
98-
],
99-
init: [Function (anonymous)],
100-
ruleDefined: [Function (anonymous)],
101-
udtDefined: [Function (anonymous)],
102-
down: [Function (anonymous)],
103-
up: [Function (anonymous)],
104-
translate: [Function (anonymous)],
105-
setLength: [Function (anonymous)],
106-
getLength: [Function (anonymous)],
107-
toXml: [Function (anonymous)]
108-
},
109-
computed: [ 'foo', 'bar' ]
90+
result: <ParseResult['result']>,
91+
tree: <ParseResult['tree']>,
92+
stats: <ParseResult['stats']>,
93+
trace: <ParseResult['trace']>,
11094
}
11195
```
11296

113-
###### Evaluating AST as list of unescaped reference tokens
97+
[TypeScript typings](https://github.com/swaggerexpert/json-pointer/blob/main/types/index.d.ts) are available for all fields attached to the parse result object returned by the `parse` function.
98+
99+
##### Translators
100+
101+
`@swaggerexpert/json-pointer` provides several translators to convert the parse result into different tree representations.
114102

115-
One of the ways to interpret the parsed JSON Pointer is to evaluate it as a list of unescaped reference tokens.
103+
###### CST translator
104+
105+
[Concrete Syntax Tree](https://en.wikipedia.org/wiki/Parse_tree) (parse tree) representation is available on the parse result
106+
when an instance of `CSTTranslator` is provided via the `translator` option of the `parse` function.
107+
CST is suitable for consumption by other tools such as IDEs and editors.
116108

117109
```js
118-
import { parse } from '@swaggerexpert/json-parse';
110+
import { parse, CSTTranslator } from '@swaggerexpert/json-pointer';
119111

120-
const { computed } = parse('/foo/bar'); // computed = ['foo', 'bar']
112+
const { tree: CST } = parse('/foo/bar', { translator: new CSTTranslator() });
121113
```
122114

123-
###### Interpreting AST as list of entries
115+
CST tree has a shape documented by [TypeScript typings (CSTTree)](https://github.com/swaggerexpert/json-pointer/blob/main/types/index.d.ts).
116+
117+
###### AST translator
118+
119+
**Default translator**. [Abstract Syntax Tree](https://en.wikipedia.org/wiki/Abstract_syntax_tree) representation is available on the parse result
120+
by default, or when an instance of `ASTTranslator` is provided via the `translator` option of the `parse` function.
121+
AST is suitable to be consumed by implementations that need to analyze the structure of the JSON Pointer
122+
or for building a custom JSON Pointer evaluation engine.
123+
124+
The AST of a parsed JSON Pointer is a list of unescaped reference tokens.
124125

125126
```js
126-
import { parse } from '@swaggerexpert/json-parse';
127+
import { parse } from '@swaggerexpert/json-pointer';
127128

128-
const parseResult = parse('/foo/bar');
129-
const parts = [];
129+
const { tree: AST } = parse('/foo/bar'); // AST = ['foo', 'bar']
130+
```
131+
132+
or
133+
134+
```js
135+
import { parse, ASTTranslator } from '@swaggerexpert/json-pointer';
130136

131-
parseResult.ast.translate(parts);
137+
const { tree: AST } = parse('/foo/bar', { translator: new ASTTranslator() }); // AST = ['foo', 'bar']
132138
```
133139

134-
After running the above code, **parts** variable has the following shape:
140+
###### XML translator
135141

136142
```js
137-
[
138-
['json-pointer', '/foo/bar'],
139-
['reference-token', 'foo'],
140-
['reference-token', 'bar'],
141-
]
143+
import { parse, XMLTranslator } from '@swaggerexpert/json-pointer';
144+
145+
const { tree: XML } = parse('/foo/bar', { translator: new XMLTranslator() });
142146
```
143147

144-
###### Interpreting AST as XML
148+
##### Statistics
149+
150+
The `parse` function can return additional statistical information about the parsing process.
151+
Statistics collection is enabled by setting the `stats` option to `true`.
145152

146153
```js
147154
import { parse } from '@swaggerexpert/json-pointer';
148155

149-
const parseResult = parse('/foo/bar');
150-
const xml = parseResult.ast.toXml();
156+
const { stats } = parse('/foo/bar', { stats: true });
157+
158+
stats.displayStats(); // returns operator stats
159+
stats.displayHits(); // returns rules grouped by hit count
151160
```
152161

153-
After running the above code, **xml** variable has the following content:
162+
##### Tracing
163+
164+
The `parse` function can return additional tracing information about the parsing process.
165+
Tracing is enabled by setting the `trace` option to `true`. Tracing is essential
166+
for debugging failed matches or analyzing rule execution flow.
167+
168+
```js
169+
import { parse } from '@swaggerexpert/json-pointer';
154170

155-
```xml
156-
<?xml version="1.0" encoding="utf-8"?>
157-
<root nodes="3" characters="8">
158-
<!-- input string -->
159-
/foo/bar
160-
<node name="json-pointer" index="0" length="8">
161-
/foo/bar
162-
<node name="reference-token" index="1" length="3">
163-
foo
164-
</node><!-- name="reference-token" -->
165-
<node name="reference-token" index="5" length="3">
166-
bar
167-
</node><!-- name="reference-token" -->
168-
</node><!-- name="json-pointer" -->
169-
</root>
171+
const { result, trace } = parse('1', { trace: true });
172+
173+
result.success; // returns false
174+
trace.displayTrace(); // returns trace information
170175
```
171176

172-
> NOTE: AST can also be traversed in classical way using [depth first traversal](https://www.tutorialspoint.com/data_structures_algorithms/depth_first_traversal.htm). For more information about this option please refer to [apg-js](https://github.com/ldthomas/apg-js) and [apg-js-examples](https://github.com/ldthomas/apg-js-examples).
177+
By combining information from `result` and `trace`, it is possible to analyze the parsing process in detail
178+
and generate messages like this: `'Syntax error at position 0, expected "/"'`. Please see this
179+
[test file](https://github.com/swaggerexpert/json-pointer/blob/main/test/parse/trace.js) for more information on how to achieve that.
173180

174181
#### Validation
175182

@@ -409,43 +416,22 @@ Before using the ApiDOM Evaluation Realm, you need to install the `@swagger-api/
409416

410417
```js
411418
import { ObjectElement } from '@swagger-api/apidom-core';
412-
import { InfoElement } from '@swagger-api/apidom-ns-openapi-3-0'
413419
import { evaluate } from '@swaggerexpert/json-pointer';
414420
import ApiDOMEvaluationRealm from '@swaggerexpert/json-pointer/evaluate/realms/apidom';
415421

416422
const objectElement = new ObjectElement({
417423
a: ['b', 'c']
418424
});
419-
const infoElement = InfoElement.refract({
420-
contact: {
421-
name: 'SwaggerExpert',
422-
email: 'contact@swaggerexpert.com'
423-
}
424-
})
425-
426425

427426
evaluate(objectElement, '/a/1', { realm: new ApiDOMEvaluationRealm() }); // => StringElement('c')
428-
evaluate(infoElement, '/contact/name', { realm: new ApiDOMEvaluationRealm() }); // => StringElement('SwaggerExpert')
429427
```
430428

431429
###### Custom Evaluation Realms
432430

433431
The evaluation is designed to support **custom evaluation realms**,
434432
enabling JSON Pointer evaluation for **non-standard data structures**.
435433

436-
A valid custom evaluation realm must match the structure of the `EvaluationRealm` interface.
437-
438-
```ts
439-
interface EvaluationRealm {
440-
readonly name: string;
441-
442-
isArray(node: unknown): boolean;
443-
isObject(node: unknown): boolean;
444-
sizeOf(node: unknown): number;
445-
has(node: unknown, referenceToken: string): boolean;
446-
evaluate(node: unknown, referenceToken: string): unknown;
447-
}
448-
```
434+
A valid custom evaluation realm must match the structure of the [EvaluationRealm interface](https://github.com/swaggerexpert/json-pointer/blob/main/types/index.d.ts).
449435

450436
One way to create a custom realm is to extend the `EvaluationRealm` class and implement the required methods.
451437

@@ -594,7 +580,7 @@ JSON Pointer is defined by the following [ABNF](https://tools.ietf.org/html/rfc5
594580
```abnf
595581
; JavaScript Object Notation (JSON) Pointer ABNF syntax
596582
; https://datatracker.ietf.org/doc/html/rfc6901
597-
json-pointer = *( "/" reference-token )
583+
json-pointer = *( slash reference-token ) ; MODIFICATION: surrogate text rule used
598584
reference-token = *( unescaped / escaped )
599585
unescaped = %x00-2E / %x30-7D / %x7F-10FFFF
600586
; %x2F ('/') and %x7E ('~') are excluded from 'unescaped'
@@ -606,6 +592,9 @@ array-location = array-index / array-dash
606592
array-index = %x30 / ( %x31-39 *(%x30-39) )
607593
; "0", or digits without a leading "0"
608594
array-dash = "-"
595+
596+
; Surrogate named rules
597+
slash = "/"
609598
```
610599

611600
## License

src/evaluate/index.js

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -10,10 +10,9 @@ import JSONPointerKeyError from '../errors/JSONPointerKeyError.js';
1010
const evaluate = (
1111
value,
1212
jsonPointer,
13-
{ strictArrays = true, strictObjects = true, evaluator = null, realm = new JSONRealm() } = {},
13+
{ strictArrays = true, strictObjects = true, realm = new JSONRealm() } = {},
1414
) => {
15-
const parseOptions = typeof evaluator === 'function' ? { evaluator } : undefined;
16-
const { result, computed: referenceTokens } = parse(jsonPointer, parseOptions);
15+
const { result, tree: referenceTokens } = parse(jsonPointer);
1716

1817
if (!result.success) {
1918
throw new JSONPointerEvaluateError(`Invalid JSON Pointer: ${jsonPointer}`, {
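
Reviewer sketch (not part of the commit): after this change `evaluate` consumes the parser's `tree` directly as a list of unescaped reference tokens. The snippet below illustrates, under simplifying assumptions, how such a token list could be walked over plain JSON data. The helper name `evaluateTokens` and its error types are hypothetical; the library's own realms, strictness options and error classes are not reproduced here.

```javascript
// Hypothetical helper: walks plain JSON data using a list of unescaped
// reference tokens (the shape the diff destructures as `tree`).
const evaluateTokens = (value, referenceTokens) => {
  let current = value;

  for (const token of referenceTokens) {
    if (Array.isArray(current)) {
      // array-index from the grammar: "0", or digits without a leading "0"
      if (!/^(?:0|[1-9][0-9]*)$/.test(token)) {
        throw new TypeError(`Invalid array index: ${token}`);
      }
      const index = Number(token);
      if (index >= current.length) {
        throw new RangeError(`Array index out of bounds: ${token}`);
      }
      current = current[index];
    } else if (typeof current === 'object' && current !== null) {
      if (!Object.prototype.hasOwnProperty.call(current, token)) {
        throw new Error(`Key not found: ${token}`);
      }
      current = current[token];
    } else {
      throw new TypeError(`Cannot descend into a primitive with token: ${token}`);
    }
  }

  return current;
};

// Tokens ['a', '1'] correspond to the JSON Pointer '/a/1'.
console.log(evaluateTokens({ a: ['b', 'c'] }, ['a', '1'])); // prints "c"
```

An empty token list returns the input value unchanged, which matches the RFC 6901 rule that the empty pointer references the whole document.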

src/grammar.bnf

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
; JavaScript Object Notation (JSON) Pointer ABNF syntax
22
; https://datatracker.ietf.org/doc/html/rfc6901
3-
json-pointer = *( "/" reference-token )
3+
json-pointer = *( slash reference-token ) ; MODIFICATION: surrogate text rule used
44
reference-token = *( unescaped / escaped )
55
unescaped = %x00-2E / %x30-7D / %x7F-10FFFF
66
; %x2F ('/') and %x7E ('~') are excluded from 'unescaped'
@@ -13,4 +13,5 @@ array-index = %x30 / ( %x31-39 *(%x30-39) )
1313
; "0", or digits without a leading "0"
1414
array-dash = "-"
1515

16-
16+
; Surrogate named rules
17+
slash = "/"
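
Reviewer sketch (not part of the commit): extracting `slash` into a named rule does not change the language the grammar accepts. As a rough illustration, the `json-pointer`, `reference-token`, `unescaped` and `escaped` rules can be approximated with a regular expression, and the unescaped reference tokens recovered with the RFC 6901 replacements (`~1` to `/` first, then `~0` to `~`). The function name `parseReferenceTokens` is hypothetical, and this is not how the apg-lite based parser works internally.

```javascript
// json-pointer = *( slash reference-token )
// reference-token = *( unescaped / escaped ), where escaped is "~0" or "~1"
const JSON_POINTER_PATTERN = /^(?:\/(?:[^\/~]|~[01])*)*$/u;

// Hypothetical helper: returns the list of unescaped reference tokens,
// or null when the input does not match the grammar.
const parseReferenceTokens = (jsonPointer) => {
  if (!JSON_POINTER_PATTERN.test(jsonPointer)) {
    return null;
  }

  return jsonPointer
    .split('/')
    .slice(1) // drop the empty string before the leading slash
    .map((token) => token.replace(/~1/g, '/').replace(/~0/g, '~'));
};

console.log(parseReferenceTokens('/foo/bar')); // [ 'foo', 'bar' ]
console.log(parseReferenceTokens('/a~1b/m~0n')); // [ 'a/b', 'm~n' ]
console.log(parseReferenceTokens('1')); // null
```

The last call mirrors the README tracing example, where the input `'1'` fails to parse because a non-empty JSON Pointer must start with `/`.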

src/grammar.js

Lines changed: 12 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -5,14 +5,14 @@
55
export default function grammar(){
66
// ```
77
// SUMMARY
8-
// rules = 7
8+
// rules = 8
99
// udts = 0
10-
// opcodes = 27
10+
// opcodes = 28
1111
// --- ABNF original opcodes
1212
// ALT = 5
1313
// CAT = 3
1414
// REP = 3
15-
// RNM = 5
15+
// RNM = 6
1616
// TLS = 5
1717
// TBS = 1
1818
// TRG = 5
@@ -34,6 +34,7 @@ export default function grammar(){
3434
this.rules[4] = { name: 'array-location', lower: 'array-location', index: 4, isBkr: false };
3535
this.rules[5] = { name: 'array-index', lower: 'array-index', index: 5, isBkr: false };
3636
this.rules[6] = { name: 'array-dash', lower: 'array-dash', index: 6, isBkr: false };
37+
this.rules[7] = { name: 'slash', lower: 'slash', index: 7, isBkr: false };
3738

3839
/* UDTS */
3940
this.udts = [];
@@ -43,7 +44,7 @@ export default function grammar(){
4344
this.rules[0].opcodes = [];
4445
this.rules[0].opcodes[0] = { type: 3, min: 0, max: Infinity };// REP
4546
this.rules[0].opcodes[1] = { type: 2, children: [2,3] };// CAT
46-
this.rules[0].opcodes[2] = { type: 7, string: [47] };// TLS
47+
this.rules[0].opcodes[2] = { type: 4, index: 7 };// RNM(slash)
4748
this.rules[0].opcodes[3] = { type: 4, index: 1 };// RNM(reference-token)
4849

4950
/* reference-token */
@@ -87,12 +88,16 @@ export default function grammar(){
8788
this.rules[6].opcodes = [];
8889
this.rules[6].opcodes[0] = { type: 7, string: [45] };// TLS
8990

91+
/* slash */
92+
this.rules[7].opcodes = [];
93+
this.rules[7].opcodes[0] = { type: 7, string: [47] };// TLS
94+
9095
// The `toString()` function will display the original grammar file(s) that produced these opcodes.
9196
this.toString = function toString(){
9297
let str = "";
9398
str += "; JavaScript Object Notation (JSON) Pointer ABNF syntax\n";
9499
str += "; https://datatracker.ietf.org/doc/html/rfc6901\n";
95-
str += "json-pointer = *( \"/\" reference-token )\n";
100+
str += "json-pointer = *( slash reference-token ) ; MODIFICATION: surrogate text rule used\n";
96101
str += "reference-token = *( unescaped / escaped )\n";
97102
str += "unescaped = %x00-2E / %x30-7D / %x7F-10FFFF\n";
98103
str += " ; %x2F ('/') and %x7E ('~') are excluded from 'unescaped'\n";
@@ -105,7 +110,8 @@ export default function grammar(){
105110
str += " ; \"0\", or digits without a leading \"0\"\n";
106111
str += "array-dash = \"-\"\n";
107112
str += "\n";
108-
str += "\n";
113+
str += "; Surrogate named rules\n";
114+
str += "slash = \"/\"\n";
109115
return str;
110116
}
111117
}

src/index.js

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,9 @@ export { JSONString, URIFragmentIdentifier };
44

55
export { default as Grammar } from './grammar.js';
66
export { default as parse } from './parse/index.js';
7-
export { default as referenceTokenListEvaluator } from './parse/evaluators/reference-token-list.js';
7+
export { default as CSTTranslator } from './parse/translators/CSTTranslator.js';
8+
export { default as ASTTranslator } from './parse/translators/ASTTranslator.js';
9+
export { default as XMLTranslator } from './parse/translators/XMLTranslator.js';
810

911
export { default as testJSONPointer } from './test/json-pointer.js';
1012
export { default as testReferenceToken } from './test/reference-token.js';

src/parse/callbacks/cst.js

Lines changed: 36 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,36 @@
1+
import { utilities, identifiers } from 'apg-lite';
2+
3+
import JSONPointerParseError from '../../errors/JSONPointerParseError.js';
4+
5+
const cst = (ruleName) => {
6+
return (state, chars, phraseIndex, phraseLength, data) => {
7+
if (!(typeof data === 'object' && data !== null && !Array.isArray(data))) {
8+
throw new JSONPointerParseError("parser's user data must be an object");
9+
}
10+
11+
if (state === identifiers.SEM_PRE) {
12+
const node = {
13+
type: ruleName,
14+
text: utilities.charsToString(chars, phraseIndex, phraseLength),
15+
start: phraseIndex,
16+
length: phraseLength,
17+
children: [],
18+
};
19+
20+
if (data.stack.length > 0) {
21+
const parent = data.stack[data.stack.length - 1];
22+
parent.children.push(node);
23+
} else {
24+
data.root = node;
25+
}
26+
27+
data.stack.push(node);
28+
}
29+
30+
if (state === identifiers.SEM_POST) {
31+
data.stack.pop();
32+
}
33+
};
34+
};
35+
36+
export default cst;
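
Reviewer sketch (not part of the commit): the callback above builds the tree with a stack. On the pre-order visit it creates a node, attaches it to the node on top of the stack (or makes it the root), and pushes it; on the post-order visit it pops. The self-contained simulation below replays a hand-written event sequence for the input `'/foo'` to show the resulting shape. `SEM_PRE` and `SEM_POST` are stand-in constants rather than the real `apg-lite` identifiers, the callback takes the input string instead of a character-code array, and the event order is an assumption about what a depth-first traversal would emit.

```javascript
// Stand-ins for identifiers.SEM_PRE / identifiers.SEM_POST (assumed values).
const SEM_PRE = 'SEM_PRE';
const SEM_POST = 'SEM_POST';

// Simplified version of the callback factory from the diff.
const cstCallback = (ruleName) => (state, input, phraseIndex, phraseLength, data) => {
  if (state === SEM_PRE) {
    const node = {
      type: ruleName,
      text: input.slice(phraseIndex, phraseIndex + phraseLength),
      start: phraseIndex,
      length: phraseLength,
      children: [],
    };

    if (data.stack.length > 0) {
      // Attach to the innermost node that is still open.
      data.stack[data.stack.length - 1].children.push(node);
    } else {
      data.root = node;
    }
    data.stack.push(node);
  }

  if (state === SEM_POST) {
    data.stack.pop();
  }
};

const input = '/foo';
const data = { stack: [], root: null };
const jsonPointer = cstCallback('json-pointer');
const slash = cstCallback('slash');
const referenceToken = cstCallback('reference-token');

// Hand-written event sequence for '/foo' (assumed traversal order).
jsonPointer(SEM_PRE, input, 0, 4, data);
slash(SEM_PRE, input, 0, 1, data);
slash(SEM_POST, input, 0, 1, data);
referenceToken(SEM_PRE, input, 1, 3, data);
referenceToken(SEM_POST, input, 1, 3, data);
jsonPointer(SEM_POST, input, 0, 4, data);

console.log(JSON.stringify(data.root, null, 2));
```

This also suggests why the grammar gained the surrogate `slash` rule: only named rules trigger callbacks, so a literal `"/"` inside `json-pointer` would not appear as its own node in the tree.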

src/parse/callbacks/json-pointer.js

Lines changed: 0 additions & 16 deletions
This file was deleted.
