Configurable pattern matching semantics in response to #174 #175
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,79 @@ | ||
| = CIP2017-01-18 - Isomorphic Matching Semantics | ||
| :numbered: | ||
| :toc: | ||
| :toc-placement: macro | ||
| :source-highlighter: codemirror | ||
|
|
||
| *Author:* Stefan Plantikow <stefan.plantikow@neotechnology.com> | ||
|
|
||
| This proposal is a response to CIR-2017-174. | ||
|
|
||
| === Proposal: Add new uniqueness modes | ||
|
|
||
| It is proposed to add the capability to select one of three uniqueness modes for a uniqueness scope: | ||
|
|
||
| * `MATCH ALL`: Impose no uniqueness requirements on candidate matches | ||
| * `MATCH UNIQUE RELATIONSHIPS`: Only consider candidate matches that are relationship-unique | ||
| * `MATCH UNIQUE NODES`: Only consider candidate matches that are node-unique | ||
|
|
||
| The default uniqueness mode used by `MATCH` (without a further specification of the preferred uniqueness mode) is relationship-unique matching. | ||
|
|
||
| `MATCH ALL` does not reject any paths - not even paths containing cycles - and hence can lead to infinite result sets for the whole query. | ||
| It is recommended that implementations generate at least a warning when static analysis is not able to proof query termination due to the chosen uniqueness mode. | ||
|
|
||
| This approach to specifying uniqueness could later be extended with further ways to restrict uniqueness. | ||
|
|
||
| === Proposal: Specifying the uniqueness mode of a subquery | ||
|
|
||
| Changing the uniqueness mode of a subquery recursively changes the default uniqueness mode for all contained `MATCH` clauses unless it is overridden again. Examples: | ||
|
|
||
| * `MATCH <uniqueness-modes> { MATCH ... } ...` | ||
| * `DO <uniqueness-modes> { MATCH ... } ...` | ||
|
||
|
|
||
| === Proposal: Default uniqueness mode | ||
|
|
||
| Additionally, it is proposed that a conforming implementation should provide a pre-parser option for defining the default uniqueness mode used by regular pattern matching: | ||
|
||
|
|
||
| * `unique=nodes` for configuring node-uniqueness as the default for `MATCH` | ||
| * `unique=relationships` for configuring relationship-uniqueness as the default for `MATCH` | ||
|
|
||
| === Proposal: Path classes | ||
|
|
||
| Graph theory has defined various classes of paths. | ||
| Cypher so far only supports a single notion of path. | ||
|
|
||
| To improve expressivity and to help prevent the generation of infinite result sets when working with non-unique matches, it is proposed to introduce additional predicates for testing paths: | ||
|
|
||
| * `open(p)`: true if the start and the end node of `p` are not the same node | ||
| * `closed(p)`: true if the start and the end node of `p` are the same node | ||
| * `trail(p)`: true if `p` contains no duplicate relationships | ||
| * `simple(p)`: true if `p` contains no duplicate relationships and either no duplicate nodes at all or the start node and the end node are the same node | ||
| * `trek(p)`: true if `p` contains two identical consecutive relationships | ||
|
||
| * `repetetive(p)`: true if `p` contains any closed subpath `q` of `size > 1` that is immediately repeated after itself in `p` | ||
|
||
|
|
||
| Using `repetetive` makes it possible to ensure that variable-length path matching without uniqueness requirements yields a finite result set: | ||
|
|
||
| [source, Cypher] | ||
| ---- | ||
| MATCH ALL p=(a)-[*]->(b), (b)-[*2..4]->(c) WHERE NOT repetetive(p) | ||
| RETURN p | ||
| ---- | ||
|
|
||
| Note that these functions naturally extend to lists. | ||
|
||
|
|
||
| Path predicates may be used to further restrict which paths are enumerated by pattern matching. | ||
| All uniqueness modes naturally correspond to default path classes: | ||
|
|
||
| * Non-uniqueness implies no restrictions on the path class. | ||
| * Relationship-uniqueness implies that all matched paths are trails. | ||
| * Node-uniqueness implies that all matched paths are simple paths. | ||
|
|
||
| == Benefits to this proposal | ||
|
|
||
| Cypher is able to express more general classes of patterns. | ||
|
|
||
| == Caveats to this proposal | ||
|
|
||
| Non-uniqueness allows for non-terminating queries. | ||
|
|
||
| A moderate increase in language complexity. | ||
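
The path predicates listed in the diff can be sketched outside of Cypher to make their definitions concrete. The following is a minimal Python sketch, not part of the proposal. The `Path` type and the `is_*` function names are hypothetical. It assumes that a path is an alternating sequence of node and relationship ids, that the size of a subpath is its number of relationships, and that "immediately repeated" means the same relationship sequence follows directly. `is_simple` follows the wording of the diff literally.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Path:
    """Hypothetical path model: len(nodes) == len(rels) + 1."""
    nodes: Sequence[str]  # node ids in traversal order
    rels: Sequence[str]   # relationship ids in traversal order


def is_closed(p: Path) -> bool:
    # closed(p): start node and end node are the same node
    return p.nodes[0] == p.nodes[-1]


def is_open(p: Path) -> bool:
    # open(p): start node and end node differ
    return not is_closed(p)


def is_trail(p: Path) -> bool:
    # trail(p): no relationship occurs more than once
    return len(set(p.rels)) == len(p.rels)


def is_simple(p: Path) -> bool:
    # simple(p), read literally from the diff: a trail that either has no
    # duplicate nodes at all or starts and ends at the same node
    no_duplicate_nodes = len(set(p.nodes)) == len(p.nodes)
    return is_trail(p) and (no_duplicate_nodes or is_closed(p))


def is_repetitive(p: Path) -> bool:
    # repetetive(p): some closed subpath with more than one relationship
    # is directly followed by an identical copy of itself
    n = len(p.rels)
    for k in range(2, n // 2 + 1):          # subpath size, must be > 1
        for i in range(0, n - 2 * k + 1):   # subpath start offset
            closed_subpath = p.nodes[i] == p.nodes[i + k]
            repeated = list(p.rels[i:i + k]) == list(p.rels[i + k:i + 2 * k])
            if closed_subpath and repeated:
                return True
    return False


if __name__ == "__main__":
    triangle = Path(["a", "b", "c", "a"], ["r1", "r2", "r3"])
    twice = Path(["a", "b", "c", "a", "b", "c", "a"],
                 ["r1", "r2", "r3", "r1", "r2", "r3"])
    print(is_trail(triangle), is_repetitive(triangle))
    print(is_trail(twice), is_repetitive(twice))
```

Under these assumptions, walking a triangle once gives a closed trail that is not repetitive, while walking it twice in a row is neither a trail nor non-repetitive, which is the kind of path that `WHERE NOT repetetive(p)` is meant to exclude under `MATCH ALL`.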
proof -> prove (in "is not able to proof query termination")