-
Notifications
You must be signed in to change notification settings - Fork 141
Fix #377: Implemented support for adding the tokens to parsed nodes. #388
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 2 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -12,11 +12,12 @@ let _blankNodeCounter = 0; | |||||||||||||||||||
const escapedLiteral = /^"(.*".*)(?="[^"]*$)/; | ||||||||||||||||||||
|
||||||||||||||||||||
// ## DataFactory singleton | ||||||||||||||||||||
// Note: The default data factory does not set the token field of terms. | ||||||||||||||||||||
const DataFactory = { | ||||||||||||||||||||
namedNode, | ||||||||||||||||||||
blankNode, | ||||||||||||||||||||
variable, | ||||||||||||||||||||
literal, | ||||||||||||||||||||
namedNode: iri => namedNode(iri), | ||||||||||||||||||||
blankNode: name => blankNode(name), | ||||||||||||||||||||
variable: name => variable(name), | ||||||||||||||||||||
literal: (name, datatype) => literal(name, datatype), | ||||||||||||||||||||
defaultGraph, | ||||||||||||||||||||
quad, | ||||||||||||||||||||
triple: quad, | ||||||||||||||||||||
|
@@ -63,6 +64,12 @@ export class Term { | |||||||||||||||||||
|
||||||||||||||||||||
// ## NamedNode constructor | ||||||||||||||||||||
export class NamedNode extends Term { | ||||||||||||||||||||
constructor(id, token) { | ||||||||||||||||||||
super(id); | ||||||||||||||||||||
|
||||||||||||||||||||
this.token = token; | ||||||||||||||||||||
} | ||||||||||||||||||||
|
||||||||||||||||||||
|
constructor(id, token) { | |
super(id); | |
this.token = token; | |
} | |
constructor(id, token = null) { | |
super(id); | |
this.token = token; | |
} |
I'm nitting here, but I want things to be explicit. No undefined
.
However, even better would be if the Term
constructor simply took token
; then we don't need any of this.
And let's not make it token, let's make it context
, with let { token } = context
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's OK. So will the default be context = { token = null } or context = null?
I am personally hesitant to initialize token
with an empty string as this is a different datataype as the Token
object and it might indicate that one should expect a non-empty string when it is initalized which is misleading.
But I'll do as you wish. Just let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Saw the answer in the comment below. Will go with DEFAULT_CONTEXT.
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm hesitant to pass the token along here. I don't think we should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never. Either null
or ''
, but never explicitly undefined
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, no problem.
However, I'm just curious to learn why. Am I right that you want the variable to be initialized with something so that it is visible on an API level? I have to admit that I am not that deep into JavaScript to know all the consequences. It was my reasoning that when passing undefined
the variable might not occupy any memory as interpreters might just delete it and thus having the API behave as before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was my reasoning that when passing
undefined
the variable might not occupy any memory
That's not how JavaScript engines work today; the will still explicitly assign the field.
However, I'm just curious to learn why.
Mainly for explicitness. undefined
is what happens when we don't pass anything, forget stuff, uninitialized, etc. null
signifies a choice. Choices are good.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -159,19 +159,19 @@ export default class N3Parser { | |
const iri = this._resolveIRI(token.value); | ||
if (iri === null) | ||
return this._error('Invalid IRI', token); | ||
value = this._namedNode(iri); | ||
value = this._namedNode(iri, token); | ||
break; | ||
// Read a prefixed name | ||
case 'type': | ||
case 'prefixed': | ||
const prefix = this._prefixes[token.prefix]; | ||
if (prefix === undefined) | ||
return this._error(`Undefined prefix "${token.prefix}:"`, token); | ||
value = this._namedNode(prefix + token.value); | ||
value = this._namedNode(prefix + token.value, token); | ||
break; | ||
// Read a blank node | ||
case 'blank': | ||
value = this._blankNode(this._prefixes[token.prefix] + token.value); | ||
value = this._blankNode(this._prefixes[token.prefix] + token.value, token); | ||
break; | ||
// Read a variable | ||
case 'var': | ||
|
@@ -194,7 +194,7 @@ export default class N3Parser { | |
case '[': | ||
// Start a new quad with a new blank node as subject | ||
this._saveContext('blank', this._graph, | ||
this._subject = this._blankNode(), null, null); | ||
this._subject = this._blankNode(undefined, token), null, null); | ||
|
||
return this._readBlankNodeHead; | ||
case '(': | ||
// Start a new list | ||
|
@@ -206,7 +206,7 @@ export default class N3Parser { | |
if (!this._n3Mode) | ||
return this._error('Unexpected graph', token); | ||
this._saveContext('formula', this._graph, | ||
this._graph = this._blankNode(), null, null); | ||
this._graph = this._blankNode(undefined, token), null, null); | ||
|
||
return this._readSubject; | ||
case '}': | ||
// No subject; the graph in which we are reading is closed instead | ||
|
@@ -234,7 +234,7 @@ export default class N3Parser { | |
return this._completeSubjectLiteral; | ||
} | ||
else | ||
this._subject = this._literal(token.value, this._namedNode(token.prefix)); | ||
this._subject = this._literal(token.value, this._namedNode(token.prefix, token), token); | ||
|
||
break; | ||
case '<<': | ||
|
@@ -282,7 +282,7 @@ export default class N3Parser { | |
if (this._n3Mode) { | ||
// Start a new quad with a new blank node as subject | ||
this._saveContext('blank', this._graph, this._subject, | ||
this._subject = this._blankNode(), null); | ||
this._subject = this._blankNode(undefined, token), null); | ||
return this._readBlankNodeHead; | ||
} | ||
case 'blank': | ||
|
@@ -307,12 +307,12 @@ export default class N3Parser { | |
} | ||
// Pre-datatyped string literal (prefix stores the datatype) | ||
else | ||
this._object = this._literal(token.value, this._namedNode(token.prefix)); | ||
this._object = this._literal(token.value, this._namedNode(token.prefix, token), token); | ||
break; | ||
case '[': | ||
// Start a new quad with a new blank node as subject | ||
this._saveContext('blank', this._graph, this._subject, this._predicate, | ||
this._subject = this._blankNode()); | ||
this._subject = this._blankNode(undefined, token)); | ||
return this._readBlankNodeHead; | ||
case '(': | ||
// Start a new list | ||
|
@@ -324,7 +324,7 @@ export default class N3Parser { | |
if (!this._n3Mode) | ||
return this._error('Unexpected graph', token); | ||
this._saveContext('formula', this._graph, this._subject, this._predicate, | ||
this._graph = this._blankNode()); | ||
this._graph = this._blankNode(undefined, token)); | ||
return this._readSubject; | ||
case '<<': | ||
if (!this._supportsRDFStar) | ||
|
@@ -419,14 +419,14 @@ export default class N3Parser { | |
case '[': | ||
// Stack the current list quad and start a new quad with a blank node as subject | ||
this._saveContext('blank', this._graph, | ||
list = this._blankNode(), this.RDF_FIRST, | ||
this._subject = item = this._blankNode()); | ||
list = this._blankNode(undefined, token), this.RDF_FIRST, | ||
this._subject = item = this._blankNode(undefined, token)); | ||
next = this._readBlankNodeHead; | ||
break; | ||
case '(': | ||
// Stack the current list quad and start a new list | ||
this._saveContext('list', this._graph, | ||
list = this._blankNode(), this.RDF_FIRST, this.RDF_NIL); | ||
list = this._blankNode(undefined, token), this.RDF_FIRST, this.RDF_NIL); | ||
this._subject = null; | ||
break; | ||
case ')': | ||
|
@@ -462,7 +462,7 @@ export default class N3Parser { | |
} | ||
// Pre-datatyped string literal (prefix stores the datatype) | ||
else { | ||
item = this._literal(token.value, this._namedNode(token.prefix)); | ||
item = this._literal(token.value, this._namedNode(token.prefix, token), token); | ||
next = this._getContextEndReader(); | ||
} | ||
break; | ||
|
@@ -471,7 +471,7 @@ export default class N3Parser { | |
if (!this._n3Mode) | ||
return this._error('Unexpected graph', token); | ||
this._saveContext('formula', this._graph, this._subject, this._predicate, | ||
this._graph = this._blankNode()); | ||
this._graph = this._blankNode(undefined, token)); | ||
return this._readSubject; | ||
default: | ||
if ((item = this._readEntity(token)) === undefined) | ||
|
@@ -480,7 +480,7 @@ export default class N3Parser { | |
|
||
// Create a new blank node if no item head was assigned yet | ||
if (list === null) | ||
this._subject = list = this._blankNode(); | ||
this._subject = list = this._blankNode(undefined, token); | ||
|
||
// Is this the first element of the list? | ||
if (previousList === null) { | ||
|
@@ -524,20 +524,20 @@ export default class N3Parser { | |
// ### `_completeLiteral` completes a literal with an optional datatype or language | ||
_completeLiteral(token) { | ||
// Create a simple string literal by default | ||
let literal = this._literal(this._literalValue); | ||
let literal = this._literal(this._literalValue, undefined, token); | ||
|
||
switch (token.type) { | ||
// Create a datatyped literal | ||
case 'type': | ||
case 'typeIRI': | ||
const datatype = this._readEntity(token); | ||
if (datatype === undefined) return; // No datatype means an error occurred | ||
literal = this._literal(this._literalValue, datatype); | ||
literal = this._literal(this._literalValue, datatype, token); | ||
token = null; | ||
break; | ||
// Create a language-tagged string | ||
case 'langcode': | ||
literal = this._literal(this._literalValue, token.value); | ||
literal = this._literal(this._literalValue, token.value, token); | ||
token = null; | ||
break; | ||
} | ||
|
@@ -721,7 +721,7 @@ export default class N3Parser { | |
_readNamedGraphBlankLabel(token) { | ||
if (token.type !== ']') | ||
return this._error('Invalid graph label', token); | ||
this._subject = this._blankNode(); | ||
this._subject = this._blankNode(undefined, token); | ||
return this._readGraph; | ||
} | ||
|
||
|
@@ -751,17 +751,17 @@ export default class N3Parser { | |
} | ||
// Without explicit quantifiers, map entities to a quantified entity | ||
if (!this._explicitQuantifiers) | ||
this._quantified[entity.id] = this._quantifier(this._blankNode().value); | ||
this._quantified[entity.id] = this._quantifier(this._blankNode(undefined, token).value); | ||
// With explicit quantifiers, output the reified quantifier | ||
else { | ||
// If this is the first item, start a new quantifier list | ||
if (this._subject === null) | ||
this._emit(this._graph || this.DEFAULTGRAPH, this._predicate, | ||
this._subject = this._blankNode(), this.QUANTIFIERS_GRAPH); | ||
this._subject = this._blankNode(undefined, token), this.QUANTIFIERS_GRAPH); | ||
// Otherwise, continue the previous list | ||
else | ||
this._emit(this._subject, this.RDF_REST, | ||
this._subject = this._blankNode(), this.QUANTIFIERS_GRAPH); | ||
this._subject = this._blankNode(undefined, token), this.QUANTIFIERS_GRAPH); | ||
// Output the list item | ||
this._emit(this._subject, this.RDF_FIRST, entity, this.QUANTIFIERS_GRAPH); | ||
} | ||
|
@@ -818,7 +818,7 @@ export default class N3Parser { | |
// ### `_readForwardPath` reads a '!' path | ||
_readForwardPath(token) { | ||
let subject, predicate; | ||
const object = this._blankNode(); | ||
const object = this._blankNode(undefined, token); | ||
// The next token is the predicate | ||
if ((predicate = this._readEntity(token)) === undefined) | ||
return; | ||
|
@@ -835,7 +835,7 @@ export default class N3Parser { | |
|
||
// ### `_readBackwardPath` reads a '^' path | ||
_readBackwardPath(token) { | ||
const subject = this._blankNode(); | ||
const subject = this._blankNode(undefined, token); | ||
let predicate, object; | ||
// The next token is the predicate | ||
if ((predicate = this._readEntity(token)) === undefined) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be happy leaving these as they are actually. It doesn't hurt to have the second optional argument, and that way, we can expose it to other parsers.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand you correctly, you want to pass the default constructor functions to the DataFactory as they are. This would mean, however, that the default behavior of the DataFactory changes to always creating Terms with an initialized { token } context. Which in turn will break some tests and is probably not needed in standard situations.
I was hesitant to put a condition into the Parser to check wether to pass the token/context argument to the factory or not because this would have an impact on the performance. The solution to simply ignore the second argument in the DataFactory seemed like a good compromise and allows the API users to define their own behavior regarding the context variable of a node, such as offsetting positions or adding additional data if needed.
So should I revert this to the default functions and have the context always initialized with a token?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine; objects can always have optional fields.
Those tests might be too strict; we can loosen them.
Indeed; let's definitely not.
Aah okay. This actually shows me that we need 2 DataFactory classes then. One that discards the context, and one that keeps it. Let's make both first-class citizens.