@@ -30,9 +30,60 @@ While Cypher allows omitting path, node, and relationship variables in a pattern
30
30
31
31
== Proposal
32
32
33
+
This CIP has been submitted in the belief that *CIP2017-02-06 Path Pattern Queries* will be accepted and is aligned with it.
34
+
35
+
=== Deprecations
36
+
33
37
This CIP proposes to replace the notion of *uniqueness scope* and *cyphermorphism* and all associated rules by providing new, configurable pattern matching semantics for Cypher as outlined in this section.
34
38
35
-
This CIP has been submitted in the belief that *CIP2017-02-06 Path Pattern Queries* will be accepted and is aligned with it.
39
+
This CIP proposes to deprecate support for binding relationship list variables in variable length relationship patterns.
40
+
41
+
This CIP proposes to deprecate the existing syntax for both `shortestPath` and `allShortestPaths` matching of Cypher.
42
+
43
+
44
+
=== Basic pattern matching semantics
45
+
46
+
Each pattern consists of one or more top-level pattern parts that are given in a comma separated list.
47
+
48
+
[source=cypher]
49
+
----
50
+
MATCH (a)-->(b), (c)<--(d)
51
+
RETURN *
52
+
----
53
+
54
+
The solution (set of successful matches) of a pattern is the cross product over the solutions of all its top-level pattern parts, i.e. the above is the same as
55
+
56
+
[source=cypher]
57
+
----
58
+
MATCH (a)-->(b)
59
+
// sequence of matches acts like a cross product:
60
+
// for each incoming row with a and b, find all matches (c)<--(d)
61
+
MATCH (c)<--(d)
62
+
RETURN *
63
+
----
64
+
65
+
(ignoring uniqueness).
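The cross product reading can be sketched outside of Cypher. The following Python snippet is only an illustration under stated assumptions: the toy graph `EDGES` and the helpers `match_forward`, `match_backward`, and `match_pattern` are hypothetical names that do not appear in this CIP, and uniqueness is ignored, as in the text above.

```python
from itertools import product

# Hypothetical toy graph: directed relationships as (source, target) pairs.
EDGES = [("x", "y"), ("y", "z")]

def match_forward(edges):
    """Solutions of the pattern part (a)-->(b): one binding per relationship."""
    return [{"a": s, "b": t} for s, t in edges]

def match_backward(edges):
    """Solutions of the pattern part (c)<--(d): c is the target, d the source."""
    return [{"c": t, "d": s} for s, t in edges]

def match_pattern(*part_solutions):
    """Solution of a pattern: cross product over the solutions of its parts."""
    rows = []
    for combo in product(*part_solutions):
        row = {}
        for binding in combo:
            row.update(binding)  # parts bind distinct variables here
        rows.append(row)
    return rows

rows = match_pattern(match_forward(EDGES), match_backward(EDGES))
for row in rows:
    print(row)
```

With two relationships in the graph, each part has two solutions, so the pattern has four. Because uniqueness is ignored, both parts may match the same relationship in one row.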
66
+
67
+
Binding any two node patterns, relationship patterns, or path patterns that are contained in the same pattern to the same pattern variable describes an implicit join, i.e.
68
+
69
+
[source=cypher]
70
+
----
71
+
MATCH (a)-->()<--(a)-->(b)
72
+
RETURN a
73
+
----
74
+
75
+
is semantically the same as
76
+
77
+
[source=cypher]
78
+
----
79
+
MATCH (n1)-->(n2), (n3)<--(n4), (n4)-->(b) WHERE n1 = n4 AND n2 = n3
80
+
RETURN n1 AS a
81
+
----
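The implicit join can be read as a cross product followed by equality filters. The Python sketch below illustrates that reading for `(a)-->()<--(a)-->(b)`. The graph `EDGES` and the function `match_implicit_join` are hypothetical, and uniqueness is again ignored, so one relationship may be matched by several parts of the pattern.

```python
# Hypothetical toy graph: directed relationships as (source, target) pairs.
EDGES = [("x", "y"), ("z", "y"), ("x", "w")]

def match_implicit_join(edges):
    """Match (a)-->()<--(a)-->(b) as three independent parts joined by equality."""
    rows = []
    for n1, n2 in edges:            # part 1: (n1)-->(n2)
        for n4, n3 in edges:        # part 2: (n3)<--(n4)
            for n5, b in edges:     # part 3: (n4)-->(b), n5 is the source
                # Join conditions: both occurrences of `a` are one node,
                # the anonymous middle node is shared, and part 3 starts at `a`.
                if n1 == n4 and n2 == n3 and n5 == n4:
                    rows.append({"a": n1, "b": b})
    return rows

rows = match_implicit_join(EDGES)
for row in rows:
    print(row)
```

A real implementation would push the equality conditions into the search instead of filtering a full cross product. The nested loops are only meant to make the semantics visible.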
82
+
83
+
=== Pattern binders
84
+
85
+
This CIP proposes to call the path variable that occurs before a pattern element of a pattern part a *pattern binder* in the grammar.
86
+
Note that such variables are always bound to a linear sequence of node, relationship, and path patterns of its pattern element.
36
87
37
88
=== Walks
38
89
@@ -46,27 +97,28 @@ Note that every `PATH` is a `TRAIL` and that every `TRAIL` is a `WALK`.
46
97
47
98
This CIP proposes to rename the cypher type `PATH` to `WALK`.
48
99
49
-
=== Pattern binders
50
-
51
-
This CIP proposes to name the path variable that occurs before a pattern element of a pattern part to *pattern binder* in the grammar.
52
-
Note that such variables are always bound to a linear sequence of node, relationship, and path query patterns of its pattern element.
100
+
=== Pattern binder class
53
101
54
102
This CIP proposes introducing the notion of a *pattern binder class* that may be written before a pattern binder in a read-only pattern (i.e. a pattern that is not used as an argument to an updating clause) and restricts the set of valid pattern matches for the following pattern element.
55
-
The proposed pattern binder classes are:
103
+
The proposed pattern binder classes in both singular and plural form are:
104
+
105
+
* `WALK` (plural: `WALKS`) This pattern binder should only be bound to a `WALK` that matches all node, relationship, and path patterns given in the following pattern element.
106
+
* `TRAIL` (plural: `TRAILS`) This pattern binder should only be bound to a `TRAIL` that matches all node, relationship, and path patterns given in the following pattern element
107
+
* `PATH` (plural: `PATHS`) This pattern binder should only be bound to a simple `PATH` that matches all node, relationship, and path patterns given in the following pattern element
56
108
57
-
* `WALK` This pattern binder should only be bound to a `WALK` that matches all node, relationship, and path query patterns given in the following pattern element
58
-
* `TRAIL` This pattern binder should only be bound to a `TRAIL` that matches all node, relationship, and path query patterns given in the following pattern element
59
-
* `PATH` This pattern binder should only be bound to a simple `PATH` that matches all node, relationship, and path query patterns given in the following pattern element
109
+
This CIP proposes the default pattern binder class to be `WALK`.
60
110
61
111
The pattern binder class may be further qualified with one of the following prefixes:
62
112
63
-
* `OPEN WALK|TRAIL|PATH` This pattern binder should only be bound to walks (or trails, or paths respectively) whose start and end nodes are _not the same node_
64
-
* `CLOSED WALK|TRAIL|PATH` This pattern binder should only be bound to walks (or trails, or paths respectively) whose start and end nodes are _the same node_
113
+
* `OPEN WALK[S]|TRAIL[S]|PATH[S]` This pattern binder should only be bound to walks (or trails, or paths respectively) whose start and end nodes are _not the same node_
114
+
* `CLOSED WALK[S]|TRAIL[S]|PATH[S]` This pattern binder should only be bound to walks (or trails, or paths respectively) whose start and end nodes are _the same node_
65
115
66
116
The following additional pattern binder classes are proposed to accommodate existing terminology that is commonly used in graph theory:
67
117
68
118
* `CIRCUIT` is a synonym for `CLOSED TRAIL`
69
119
* `CYCLE` is a synonym for `CLOSED PATH`
120
+
* `CIRCUITS` is a synonym for `CLOSED TRAILS`
121
+
* `CYCLES` is a synonym for `CLOSED PATHS`
70
122
71
123
Implementations are advised to signal a warning for every use of an `OPEN` pattern binder class if the two endpoints of the pattern element are both unbound and both use the same variable name.
72
124
@@ -78,148 +130,136 @@ This CIP proposes introducing the notion of a *pattern match mode* that may be w
78
130
79
131
A pattern match mode is always written before any pattern binder class that has been explicitly given for the same pattern binder.
80
132
81
-
==== MATCH EVERY mode
82
-
83
-
This CIP proposes the new `MATCH EVERY` pattern match mode that matches every walk (or trail, or path respectively) as described by all node, relationship, and path query patterns given in the following pattern elements.
84
-
This may return an infinite or at least a very large result for some graphs.
85
-
86
-
Implementations are advised to signal a warning for every use of `MATCH EVERY (OPEN|CLOSED) WALK` that may lead to the generation of an infinite result set.
87
-
88
-
==== MATCH SHORTEST mode
89
-
90
-
This CIP proposes the new `MATCH SHORTEST` pattern match mode that matches every _shortest_ walk (or trail, or path respectively) as described by all node, relationship, and path query patterns in the following pattern elements.
91
-
92
-
This CIP proposes to deprecate the existing syntax for both `shortestPath` and `allShortestPaths` matching of Cypher.
93
-
94
-
==== Weight declarations
95
-
96
-
This CIP proposes that pattern elements may optionally be followed by weight declarations of one of the following forms:
97
-
98
-
* `WEIGHT <numerical-aggregation> OVER <rel> AS <weight>` Calculates a weight `<weight>` by evaluating the given `<numerical-aggregation>` for each relationship `<rel>` in the associated match
99
-
* `WEIGHT |<expr>| AS <weight>` Calculates a weight `<weight>` by summing the results of evaluating `abs(<expr>)` for each relationship `r` in the associated match in a special scope that only contains all properties of `r` as variables
133
+
==== Matching node patterns
100
134
101
-
Multiple weight declarations may be given as long as they do not define the same `<weight>` variable.
135
+
A node pattern always matches all described nodes from the graph.
102
136
103
-
==== MATCH CHEAPEST mode
137
+
Different pattern match modes do not influence the set of matched nodes.
104
138
105
-
This CIP proposes the new `MATCH CHEAPEST` pattern match mode that matches every cheapest walk (or trail, or path respectively) as described by all node, relationship, and path query patterns given in the following pattern element and according to the pattern element's concluding first _mandatory_ weight declaration.
139
+
==== MATCH ALL mode
106
140
107
-
==== Mandatory weight declarations
141
+
This CIP proposes the new `MATCH ALL` pattern match mode that matches every walk (or trail, or path respectively) as described by all node, relationship, and path patterns given in the following pattern elements.
108
142
109
-
A mandatory weight declaration is prefixed with `BY`, may omit specifying a variable name for the computed weight, and its aggregation must be monotone (i.e. the sequence of intermediary results obtained by computing the aggregation incrementally over all input values in any order is always monotonically increasing).
143
+
`MATCH ALL` may only be used in conjunction with a binder class in plural form (i.e. `WALKS`, `TRAILS`, `PATHS`).
110
144
111
-
A conforming implementation is expected to raise a runtime error when the monotonicity of a mandatory weight declaration is violated at runtime.
145
+
This CIP proposes that an error should be raised for any use of `MATCH ALL` without an explicit binder class in combination with variable length relationship or path patterns.
112
146
113
-
A conforming implementation may raise a compile time error when it can statically prove that the monotonicity of a mandatory weight declaration may be violated at runtime.
147
+
Implementations are advised to signal a warning for any use of `MATCH ALL (OPEN|CLOSED) WALKS` that may return an infinite or prohibitively large result.
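To see why matching all walks may not terminate, consider a graph with a single cycle. The Python sketch below is an illustration under assumptions: `RELS`, `walks_from`, and `is_trail` are hypothetical names, walks are represented as tuples of relationship ids, and the zero-length walk is counted. It enumerates walks up to a length bound and compares their number with the number of trails among them.

```python
# Hypothetical toy graph: relationship id -> (source, target).
# r1 and r2 form a cycle between nodes a and b.
RELS = {"r1": ("a", "b"), "r2": ("b", "a")}

def walks_from(start, max_len):
    """Enumerate all walks from `start` with at most `max_len` relationships."""
    found = [()]              # the zero-length walk
    frontier = [((), start)]  # (relationships so far, current end node)
    for _ in range(max_len):
        next_frontier = []
        for rels, node in frontier:
            for rid, (src, dst) in RELS.items():
                if src == node:
                    next_frontier.append((rels + (rid,), dst))
        found.extend(rels for rels, _ in next_frontier)
        frontier = next_frontier
    return found

def is_trail(rels):
    """A trail is a walk that does not repeat any relationship."""
    return len(set(rels)) == len(rels)

bounds = (2, 4, 8)
walk_counts = [len(walks_from("a", n)) for n in bounds]
trail_counts = [sum(is_trail(w) for w in walks_from("a", n)) for n in bounds]
print(walk_counts, trail_counts)
```

The number of walks keeps growing with the bound because the cycle can be traversed again and again, whereas the number of trails stays fixed once every relationship has been used.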
114
148
115
-
Additional weight declarations may be given after a mandatory weight declaration as long as no two weight declarations define conflicting aliases.
149
+
==== MATCH ALL SHORTEST mode
116
150
117
-
==== Singular matches
151
+
This CIP proposes the new `MATCH ALL SHORTEST` pattern match mode that matches every _shortest_ walk (or trail, or path respectively) as described by all node, relationship, and path patterns in the following pattern elements.
118
152
119
-
This CIP proposes optionally prefixing pattern match modes and pattern binder classes with the `ONE [OF]` marker to support returning at most one match.
153
+
`MATCH ALL SHORTEST` may only be used in conjunction with a binder class in plural form (i.e. `WALKS`, `TRAILS`, `PATHS`).
120
154
121
-
=== Multiple pattern parts
122
-
123
-
If a pattern consists of multiple pattern parts, they are first solved independently before returning their cross product as the final result of the pattern.
155
+
==== MATCH SHORTEST mode
124
156
125
-
=== Default pattern matching semantics
157
+
This CIP proposes the new `MATCH SHORTEST` pattern match mode that matches one _shortest_ walk (or trail, or path respectively) as described by all node, relationship, and path patterns in the following pattern elements.
126
158
127
-
This CIP defines three classes of pattern parts:
159
+
`MATCH SHORTEST` may only be used in conjunction with a binder class in singular form (i.e. `WALK`, `TRAIL`, `PATH`).
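The difference between the three pattern match modes can be sketched as three selections over the same set of candidate matches. The Python snippet below is a simplified illustration, not a definition: it assumes that all candidates connect the same start and end node, represents each candidate as a tuple of relationship ids, and uses the hypothetical function names `match_all`, `match_all_shortest`, and `match_shortest`.

```python
def match_all(candidates):
    """MATCH ALL: every candidate is a match."""
    return list(candidates)

def match_all_shortest(candidates):
    """MATCH ALL SHORTEST: every candidate of minimal length."""
    if not candidates:
        return []
    shortest = min(len(c) for c in candidates)
    return [c for c in candidates if len(c) == shortest]

def match_shortest(candidates):
    """MATCH SHORTEST: at most one candidate of minimal length.

    Which shortest candidate is chosen is an assumption of this sketch;
    it simply takes the first one.
    """
    return match_all_shortest(candidates)[:1]

# Hypothetical candidates between the same two endpoints.
candidates = [("r1", "r2"), ("r3",), ("r4",)]
print(match_all(candidates))
print(match_all_shortest(candidates))
print(match_shortest(candidates))
```

This also suggests why the plural binder classes accompany `MATCH ALL` and `MATCH ALL SHORTEST` while `MATCH SHORTEST` uses the singular form: only the latter yields at most one match per set of candidates.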
128
160
129
-
* *Fixed length pattern parts* are top-level pattern parts that may consist of node patterns or single length relationship patterns only.
130
-
* *Variable length pattern parts* are top-level pattern parts that may consist of node patterns, single length relationship patterns, or path query patterns only.
131
-
* *Legacy variable length pattern parts* are top-level pattern parts that may consist of node patterns, single length relationship patterns, or path query patterns and contain at least one legacy variable length pattern (including chains of single length patterns expressed as bounded variable length patterns).
161
+
=== Default MATCH mode
132
162
133
-
Current Cypher pattern matching semantics correspond to using `MATCH EVERY TRAIL` by default for all top-level pattern parts (i.e. `MATCH` behaves like `MATCH EVERY TRAIL`)
163
+
This CIP proposes a new default pattern match mode that assigns a different pattern match mode to each type of pattern element:
134
164
135
-
This CIP proposes to adopt the following new default pattern match modes and default pattern binder classes:
165
+
* Simple relationship patterns (e.g. `()-[]->()`) are to be matched using `MATCH ALL` (which is identical to `MATCH ALL SHORTEST` for simple relationship patterns)
166
+
* Bounded variable length relationship patterns (e.g. `()-[*2..4]->()`) are to be matched using `MATCH ALL`
167
+
* Unbounded variable length relationship patterns (e.g. `()-[*]->()`) are to be matched using `MATCH ALL`
168
+
* Path patterns (e.g. `()-/../->()`) are to be matched using `MATCH ALL SHORTEST`
136
169
137
-
* `EVERY WALK` for fixed length pattern parts,
138
-
* `SHORTEST WALK` for variable length pattern parts, and
139
-
* `EVERY TRAIL` for legacy variable length pattern parts only.
170
+
This CIP proposes that an error should be raised for any use of the default pattern match mode without an explicit binder class in combination with variable length relationship patterns.
140
171
141
-
This CIP aligns with the introduction of path query patterns by proposing that existing bounded and unbounded variable length patterns are to be deprecated in favor of path query patterns.
172
+
The default pattern match mode may only be used in conjunction with a binder class in plural form (i.e. `WALKS`, `TRAILS`, `PATHS`).
142
173
143
-
This changes Cypher to use homomorphic matching for all non-deprecated pattern parts.
174
+
This changes Cypher to use homomorphic matching for simple relationship patterns.
144
175
145
176
=== Predicates and functions for working with walks
146
177
147
178
This CIP proposes to introduce additional predicates and functions for working with walks:
148
179
149
-
* `open(p)`: true if the start node and the end node of `p` are not the same node
150
-
* `closed(p)`: true if the start node and the end node of `p` are the same node
151
-
* `trail(p)`: `p` if `p` contains no duplicate relationships, `NULL` otherwise
152
-
* `path(p)`: `p` if `p` contains no duplicate relationships and either no duplicate nodes at all or the start node and the end node are the same node, `NULL` otherwise
153
-
* `circuit(p)`: `trail(p)`, if `closed(p)` is true, `NULL` otherwise
154
-
* `cycle(p)`: `path(p)`, if `closed(p)` is true, `NULL` otherwise
180
+
* `isOpen(p)`: true if the start node and the end node of `p` are not the same node
181
+
* `isClosed(p)`: true if the start node and the end node of `p` are the same node
182
+
* `toTrail(p)`: `p` if `p` contains no duplicate relationships, `NULL` otherwise
183
+
* `toPath(p)`: `p` if `p` contains no duplicate relationships and either no duplicate nodes at all or the start node and the end node are the same node, `NULL` otherwise
184
+
* `toCircuit(p)`: returns `toTrail(p)` if `isClosed(p)` is true, `NULL` otherwise
185
+
* `toCycle(p)`: returns `toPath(p)` if `isClosed(p)` is true, `NULL` otherwise
155
186
* `disjoint(list1, list2, ..., list_n)` is true if the lists do not share any elements
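The intended behaviour of these predicates and functions can be sketched in Python. This is a reading of the list above, not a reference implementation: the `Walk` class is a hypothetical representation, the function names are snake_case counterparts of the proposed Cypher names, `None` stands in for `NULL`, and `to_path` assumes that the only node repetition allowed in a path is the start node coinciding with the end node.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Walk:
    nodes: Tuple[str, ...]  # n + 1 node ids, in traversal order
    rels: Tuple[str, ...]   # n relationship ids, in traversal order

def is_closed(p: Walk) -> bool:
    return p.nodes[0] == p.nodes[-1]

def is_open(p: Walk) -> bool:
    return not is_closed(p)

def to_trail(p: Walk) -> Optional[Walk]:
    """p if it repeats no relationship, None otherwise."""
    return p if len(set(p.rels)) == len(p.rels) else None

def to_path(p: Walk) -> Optional[Walk]:
    """p if it is a trail whose nodes are distinct, except start == end."""
    if to_trail(p) is None:
        return None
    inner = p.nodes[:-1] if is_closed(p) and len(p.nodes) > 1 else p.nodes
    return p if len(set(inner)) == len(inner) else None

def to_circuit(p: Walk) -> Optional[Walk]:
    return to_trail(p) if is_closed(p) else None

def to_cycle(p: Walk) -> Optional[Walk]:
    return to_path(p) if is_closed(p) else None

def disjoint(*lists) -> bool:
    """True if no element occurs in more than one of the given lists."""
    seen = set()
    for items in lists:
        current = set(items)
        if seen & current:
            return False
        seen |= current
    return True

cycle = Walk(("a", "b", "c", "a"), ("r1", "r2", "r3"))
figure8 = Walk(("a", "b", "a", "c", "a"), ("r1", "r2", "r3", "r4"))
print(to_cycle(cycle) is cycle, to_circuit(figure8) is figure8, to_cycle(figure8))
```

The figure-eight walk is a circuit but not a cycle, because it revisits node `a` in the middle without repeating a relationship.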
156
187
157
-
To support a common family of weight calculations, this CIP proposes the introduction of a new aggregate function `product` for computing the product of a set of numbers.
158
-
159
-
Evaluating `product` for an empty set returns `1`.
160
-
161
-
== Examples
162
-
163
-
The following examples demonstrate various ways in which the newly proposed constructs may be used if this CIP is adopted.
188
+
=== Multiline patterns
164
189
165
-
=== Matching shortest paths
190
+
Finally, this CIP proposes additional syntax for splitting a pattern binding across multiple lines:
166
191
167
192
[source=cypher]
168
193
----
169
-
// shortestPath(...) today becomes:
170
-
MATCH ONE SHORTEST [TRAIL] p=(a)-[r*]->(b)
194
+
MATCH p=(a)-/~very_long_path_pattern/->(b)-/~another-long_path_pattern/->(c)
171
195
RETURN *
172
-
173
-
// allShortestPaths(...) today becomes:
174
-
MATCH SHORTEST [TRAIL] p=(a)-[r*]->(b)
175
-
RETURN p
176
196
----
177
197
178
-
=== Matching cheapest paths
198
+
may be split as:
179
199
180
200
[source=cypher]
181
201
----
182
-
MATCH CHEAPEST PATH p=(a)-/(:LOVES|:LIKES)*/->(b) BY WEIGHT |strength| AS w
183
-
RETURN p AS path, w AS weight
202
+
MATCH p=(a)-/~very_long_path_pattern/->(b)
203
+
+ (b)-/~another-long_path_pattern/->(c)
204
+
RETURN *
184
205
----
185
206
186
-
=== Matching one path and computing its weight
207
+
This additional syntax is necessary due to the changed uniqueness scoping rules for pattern binders.
208
+
Splitting the pattern using `,` instead of the proposed `+` would have changed the result by only binding the first part of the pattern to `p`.
209
+
210
+
== Examples
211
+
212
+
The following examples demonstrate various ways in which the newly proposed constructs may be used if this CIP is adopted.
213
+
214
+
=== Matching shortest paths
187
215
188
216
[source=cypher]
189
217
----
190
-
MATCH ONE PATH p=(a)-[*]->(b) WEIGHT product(r.score+r.handicap) OVER r AS w
191
-
RETURN p, w
218
+
// MATCH p=shortestPath((a)-[:X*]->()) today becomes:
219
+
MATCH SHORTEST TRAIL p=(a)-[:X*]->()
220
+
RETURN *
221
+
222
+
// MATCH p=shortestPath((a)-[:X*]->()) today may be approximated using path patterns:
223
+
MATCH SHORTEST p=(a)-/:X*/->()
224
+
RETURN *
225
+
226
+
// MATCH p=allShortestPaths((a)-[:X*]->()) today becomes:
227
+
MATCH ALL SHORTEST TRAILS p=(a)-[:X*]->()
228
+
RETURN *
229
+
230
+
// MATCH p=allShortestPaths((a)-[:X*]->()) today may be approximated using path patterns:
231
+
MATCH p=(a)-/:X*/->()
232
+
RETURN *
192
233
----
193
234
194
235
=== Matching with existing semantics
195
236
196
-
`overlap` may be used to precisely express Cypher's current pattern matching semantics.
237
+
`disjoint` may be used to precisely express Cypher's current pattern matching semantics.
197
238
198
239
[source=cypher]
199
240
----
200
241
// Today (using same uniqueness scope for pat1, pat2, and pat3)
201
242
MATCH pat1=..., pat2=..., pat3=...
202
243
203
244
// This CIP
204
-
MATCH EVERY TRAIL pat1=...
205
-
MATCH EVERY TRAIL pat2=...
206
-
MATCH EVERY TRAIL pat3=...
245
+
MATCH pat1=...
246
+
MATCH pat2=...
247
+
MATCH pat3=...
207
248
WHERE disjoint(rels(pat1), rels(pat2), rels(pat3))
208
249
----
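The effect of the `disjoint` filter can be illustrated with a small Python sketch. It assumes that each of the three `MATCH` clauses has already been solved independently and that each solution is represented only by its tuple of relationship ids. The candidate lists `pat1`, `pat2`, and `pat3` and the helper `disjoint` are hypothetical stand-ins for the constructs in the query above.

```python
from itertools import product

def disjoint(*lists):
    """True if no element occurs in more than one of the given lists."""
    seen = set()
    for items in lists:
        current = set(items)
        if seen & current:
            return False
        seen |= current
    return True

# Hypothetical solutions of the three pattern parts, as relationship ids.
pat1 = [("r1",), ("r2",)]
pat2 = [("r1",), ("r3",)]
pat3 = [("r2",), ("r4",)]

# Successive MATCH clauses act like a cross product; the WHERE clause then
# keeps only rows in which no relationship is shared between pattern parts.
rows = [combo for combo in product(pat1, pat2, pat3) if disjoint(*combo)]
for row in rows:
    print(row)
```

Of the eight rows in the cross product, four survive the filter. This mirrors the effect of a single uniqueness scope that spans all three pattern parts.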
209
250
210
-
== Per-parser options
251
+
== Pre-parser options
252
+
253
+
It is suggested that a conforming implementation should provide pre-parser options for defining the default pattern binder class as well as the default pattern match mode:
211
254
212
-
It is suggested that a conforming implementation should provide pre-parser options for defining the default pattern binder class for each pattern match mode as well as the default pattern match mode for each class of pattern parts:
213
256
214
-
* `match-every=walk|trail|path` for configuring the default pattern binder class for each use of the `MATCH EVERY` pattern match mode
215
-
* `match-shortest=walk|trail|path` for configuring the default pattern binder class for each use of the `MATCH SHORTEST` pattern match mode
216
-
* `match-cheapest=walk|trail|path` for configuring the default pattern binder class for each use of the `MATCH CHEAPEST` pattern match mode
217
-
* `fixlen-mode=every|shortest` for configuring the default pattern match mode of fixed length pattern parts
218
-
* `varlen-mode=every|shortest` for configuring the default pattern match mode of variable length pattern parts
257
+
* `binder-class=walk[s]|trail[s]|path[s]` for configuring a different default pattern binder class
258
+
* `match-mode=all|all-shortest|shortest` for configuring a different default pattern match mode
219
259
220
260
== Benefits to this proposal
221
261
222
-
This proposal adds a generic facility to Cypher for expressing desired pattern matching semantics.
262
+
This proposal adds a facility to Cypher for selecting from multiple desirable pattern matching semantics.
223
263
224
264
== Caveats to this proposal
225
265
@@ -228,4 +268,4 @@ A moderate increase in language complexity.
228
268
A substantial departure from current pattern matching semantics.
229
269
However, care has been taken to retain access to current semantics.
230
270
231
-
`MATCH EVERY [OPEN|CLOSED] WALK` allows for non-terminating queries.
271
+
`MATCH ALL [OPEN|CLOSED] WALKS` allows for non-terminating queries.