Add syntax tests for codepoint escaping. #151
If you create the branch for the PR in the rdf-tests repo, the automatic report generation should work properly. It's conceivable that there is a different package that allows pushing the changes to a remote repo, or some filter that would prevent running that action if the repo is not local.
Turtle handles Unicode escape sequences differently - it has

The fact that obfuscated queries can be written in SPARQL is not good. It is also bad for streaming (SPARQL Update more so than SPARQL Query).

The text of 19.2 Codepoint Escape Sequences isn't precise about how replacement happens. These are errata that need to be addressed in the spec.

We could split the tests into two: "what we want", that is, good practice (to be agreed), and "full spec".

Surveying existing systems:
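To make the obfuscation concern concrete, here is a minimal Python sketch of codepoint-escape replacement done as a naive pass over the raw query text before any tokenizing. The function name `replace_codepoint_escapes` is hypothetical, and the choice to replace every match regardless of context (for example, after a doubled backslash) is an assumption made for illustration. It is not a statement of what the spec requires, which is the open question here.

```python
import re

# Matches \uXXXX (exactly 4 hex digits) or \UXXXXXXXX (exactly 8 hex digits).
_CODEPOINT_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def replace_codepoint_escapes(query: str) -> str:
    """Naively replace every well-formed codepoint escape in the raw text.

    Assumption: the pass ignores context entirely, so an escape is replaced
    whether it sits in an IRI, a string literal, or in place of a keyword
    or punctuation character.
    """
    def _sub(match: "re.Match[str]") -> str:
        hex_digits = match.group(1) or match.group(2)
        # chr() raises ValueError for values above 0x10FFFF; how such
        # sequences should be treated is left open in this sketch.
        return chr(int(hex_digits, 16))

    return _CODEPOINT_ESCAPE.sub(_sub, query)


if __name__ == "__main__":
    # An escape standing in for syntax, not just for data.
    print(replace_codepoint_escapes(r"SELECT \u002A WHERE { ?s ?p ?o }"))
    # prints: SELECT * WHERE { ?s ?p ?o }
```

Because the pass runs over the whole character stream, a reader (or a streaming parser) cannot know what a query means until the replacement has been applied.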
f8013f6 to b55db9e
I've updated the PR so that it only contains tests that I think align with agreed-upon spec text. Despite what I think is agreement that we shouldn't support things like the obfuscated
@prefix dawgt: <http://www.w3.org/2001/sw/DataAccess/tests/test-dawg#> .

:manifest rdf:type mf:Manifest ;
    rdfs:label "SPARQL Codepoint-Escaping Tests" ;
We should indicate that these are related to the character stream processing. Calling them "escaping" is too general.
Suggestion:
- Rename syntax-escaping/ as syntax-char-stream-processing/
rdfs:label "Character Stream Processing Tests" ;
@afs I can rename as requested, but I'm unsure about the exact naming (and rdfs:label) here. "Codepoint Escape" is a term used in the spec, but I don't think we use anything similar to "Character Stream Processing". I'm obviously biased as the author here, but without context I think I'd be confused about what "Character Stream Processing Tests" were, whereas I'd have a pretty good idea about "Codepoint-Escaping Tests". Thoughts?
Adds new tests for some interesting cases of Unicode codepoint escaping, addressing w3c/sparql-query#164. Two tests (codepoint-esc-01 and codepoint-esc-10) are marked in the manifest with TODO markers as being dependent on decisions on how systems should handle invalid escape sequences. I believe the others accurately test the existing spec text of SPARQL 1.1.

I think many of these cases should also be turned into evaluation tests, to ensure the unescaping is being performed correctly, but I'll leave that for another PR (or a subsequent update to this PR).
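As an illustration of the kind of input the "invalid escape sequence" decisions would have to cover, the following Python sketch flags three candidate cases: too few hex digits, a surrogate code point, and a value beyond U+10FFFF. Which of these should count as invalid, and what a system should do on meeting one, is exactly what remains undecided. The categories and the helper name `find_invalid_escapes` are assumptions for illustration, not a description of what codepoint-esc-01 or codepoint-esc-10 actually test.

```python
import re
from typing import List, Tuple

# A backslash, then u with up to 4 hex digits or U with up to 8 hex digits.
_CANDIDATE = re.compile(r"\\u([0-9A-Fa-f]{0,4})|\\U([0-9A-Fa-f]{0,8})")


def find_invalid_escapes(text: str) -> List[Tuple[str, str]]:
    """Return (escape_text, reason) pairs for escapes this sketch treats as suspect."""
    problems = []
    for match in _CANDIDATE.finditer(text):
        if match.group(1) is not None:
            digits, needed = match.group(1), 4   # \u form
        else:
            digits, needed = match.group(2), 8   # \U form
        if len(digits) < needed:
            problems.append((match.group(0), "too few hex digits"))
            continue
        codepoint = int(digits, 16)
        if codepoint > 0x10FFFF:
            problems.append((match.group(0), "beyond U+10FFFF"))
        elif 0xD800 <= codepoint <= 0xDFFF:
            # Python's chr() accepts lone surrogates, so a naive
            # replacement pass would not reject these on its own.
            problems.append((match.group(0), "surrogate code point"))
    return problems


if __name__ == "__main__":
    for escape, reason in find_invalid_escapes(r"\u12 \uD800 \U00110000 \u00E9"):
        print(escape, "->", reason)
```

A check like this could back either syntax tests (reject the query) or evaluation tests (compare the unescaped result), depending on what is agreed.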