cip/1.accepted/CIP2017-01-18-configurable-pattern-matching-semantics.adoc
Lines changed: 75 additions & 15 deletions
@@ -11,7 +11,7 @@ This proposal is a response to link:https://github.com/opencypher/openCypher/iss
11
11
== Motivation
12
12
13
13
Currently Cypher uses pattern matching semantics that treats _all_ patterns that occur in a `MATCH` clause as a unit (called a *uniqueness scope*) and only considers pattern instances that bind different relationships to each fixed length relationship pattern variable and to each element of a variable length relationship pattern variable.
14
-
This has come to be called *cypermorphism* informally and is a form of edge isomorphism that is based on Cypher's notion of uniqueness scope.
14
+
This has come to be called *Cyphermorphism* informally and is a form of edge isomorphism that is based on Cypher's notion of uniqueness scope.
15
15
16
16
Cyphermorphism lies at the intersection of returning as many results as possible while still ruling out returning an infinite number of paths when matching graphs that contain cycles.
17
17
@@ -117,25 +117,25 @@ While Cypher allows omitting path, node, and relationship variables in a pattern
117
117
118
118
This CIP proposes to replace the notion of *uniqueness scope* and *Cyphermorphism* and all associated rules with new, configurable pattern matching semantics.
119
119
120
-
Likewise to what is proposed by *CIP2017-02-06 Path Pattern Queries*, support for binding relationship list variables in variable length patterns will be deprecated.
120
+
As proposed in *CIP2017-02-06 Path Pattern Queries*, support for binding relationship list variables in variable length patterns will be deprecated.
121
121
122
122
This CIP proposes to deprecate the existing syntax for both `shortestPath` and `allShortestPaths` matching of Cypher.
123
123
124
124
=== Basic pattern matching semantics
125
125
126
126
Each pattern consists of one or more top-level pattern parts that are given in a comma separated list.
127
127
128
-
.Query 3.2.1
128
+
.Query 3.3.1
129
129
[source,cypher]
130
130
----
131
131
MATCH (a)-->(b), (c)<--(d)
132
132
RETURN *
133
133
----
134
134
135
135
The solution (set of successful matches) of a pattern is the cross product over the solutions of all its top-level pattern parts.
136
-
Thus, if we ignore uniqueness, Query 3.2.1 is semantically identical to Query 3.2.2.
136
+
Thus, if we ignore uniqueness, Query 3.3.1 is semantically identical to Query 3.3.2.
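The cross-product rule can be illustrated outside Cypher with a minimal Python sketch. The two-relationship graph and the names `edges`, `part1`, `part2`, and `combined` are assumptions made for illustration only; they do not come from the CIP. Each top-level pattern part is solved independently, and the solution of the whole pattern is every combination of the two partial solutions.

```python
from itertools import product

# Hypothetical toy graph: directed relationships as (source, target) pairs.
edges = [("n1", "n2"), ("n2", "n3")]

# Solutions of the pattern part (a)-->(b): one binding per relationship.
part1 = [{"a": src, "b": dst} for src, dst in edges]

# Solutions of the pattern part (c)<--(d): the relationship points from d to c.
part2 = [{"c": dst, "d": src} for src, dst in edges]

# Ignoring uniqueness, the solution of the full pattern is the cross
# product of the solutions of its top-level pattern parts.
combined = [{**left, **right} for left, right in product(part1, part2)]

for row in combined:
    print(row)
```

With two relationships, each part has two solutions, so the combined pattern has four. Note that the same relationship may be bound in both parts, which is exactly what a uniqueness rule would later filter out.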
137
137
138
-
.Query 3.2.2
138
+
.Query 3.3.2
139
139
[source,cypher]
140
140
----
141
141
MATCH (a)-->(b)
@@ -146,16 +146,16 @@ RETURN *
146
146
----
147
147
148
148
Binding several nodes or relationships in a pattern to the same variable describes an implicit join.
149
-
Thus, queries 3.2.3 and 3.2.4 are semantically identical.
149
+
Thus, queries 3.3.3 and 3.3.4 are semantically equivalent.
150
150
151
-
.Query 3.2.3
151
+
.Query 3.3.3
152
152
[source,cypher]
153
153
----
154
154
MATCH (a)-->()<--(a)-->(b)
155
155
RETURN a
156
156
----
157
157
158
-
.Query 3.2.4
158
+
.Query 3.3.4
159
159
[source,cypher]
160
160
----
161
161
MATCH (n1)-->(n2), (n3)<--(n4), (n5)-->(b)
@@ -209,7 +209,7 @@ The following additional pattern variable classes are proposed to accommodate ex
209
209
* `CYCLE` is a synonym for `CLOSED PATH`
210
210
211
211
Additionally, this CIP proposes to allow for plural forms of all pattern variable classes, which will be synonymous with their respective singular forms.
212
-
The plural forms are as follows: `WALKS`, `TRAILS`, `PATHS`, `CIRCUITS`, `CYCLES`.
212
+
The plural forms are as follows: `WALKS`, `TRAILS`, `PATHS`, `CIRCUITS`, and `CYCLES`.
213
213
The main motivation is to aid readability when used in conjunction with different pattern match modes (see <<modes>>).
214
214
215
215
[[modes]]
@@ -254,38 +254,98 @@ The following examples demonstrates various ways in which the newly proposed con
254
254
255
255
The following graph is used:
256
256
257
-
image::DataGraph.jpg[Graph,600,600]
257
+
image::DataGraph.jpg[Graph,800,700]
258
258
259
259
=== Homomorphic matching using walks
260
260
261
261
We'll illustrate the benefits of the new homomorphic pattern matching by means of a series of queries.
262
262
263
263
Assume we wish to know which two people have grandchildren in common, as well as the names of the grandchildren.
264
-
Intuitively, we can see that the only two people in the graph have grandchildren in common, namely _Michael Redgrave_ and _Rachel Kempson_, and that there are two grandchildren, _Natasha Richardson_ and _Jemma Redgrave_.
264
+
Intuitively, we can see that the only two people in the graph who have grandchildren in common are _Michael Redgrave_ and _Rachel Kempson_, and that there are two grandchildren, _Natasha Richardson_ and _Jemma Redgrave_.
265
+
Although _Roy Redgrave_ is a grandfather, there is no one else in the graph who has grandchildren in common with him.
265
266
267
+
.Query 4.1.1: Current semantics: single patterns
268
+
[source,cypher]
269
+
----
270
+
MATCH (grandparent1:Person)-[:HAS_CHILD]->()-[:HAS_CHILD]->(grandchild),
Query 4.1.1 comprises two comma-separated matches which form a single pattern, `p~1~`.
277
+
As the query runs under the current semantics, relationship uniqueness (aka Cyphermorphism) is applied to `p~1~`.
278
+
This means that the `:HAS_CHILD` relationship given by `()-[:HAS_CHILD]->(grandchild)` is only traversed once, which results in no rows being returned.
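The effect of relationship uniqueness on this kind of query can be sketched in Python. This is a simulation under stated assumptions, not the CIP's matching algorithm: the three-relationship graph is a hypothetical fragment (the intermediate node `parent_x` and the relationship ids `r1` to `r3` are invented here, and the real data graph is larger), and the flag `unique_relationships` stands in for the choice between walk-style and trail-style matching.

```python
from itertools import product

# Hypothetical fragment of a family graph: (relationship id, source, target).
# "parent_x" is a placeholder node, not a name taken from the data graph.
HAS_CHILD = [
    ("r1", "Michael Redgrave", "parent_x"),
    ("r2", "Rachel Kempson", "parent_x"),
    ("r3", "parent_x", "Natasha Richardson"),
]


def two_hop_chains(rels):
    """Return all (first, second) relationship pairs forming a -> () -> c."""
    return [(a, b) for a, b in product(rels, repeat=2) if a[2] == b[1]]


def common_grandchildren(rels, unique_relationships):
    """Pair up two-hop chains that end in the same grandchild.

    With unique_relationships=True, a match is rejected when any
    relationship is bound more than once across the whole pattern.
    """
    rows = []
    chains = two_hop_chains(rels)
    for left, right in product(chains, repeat=2):
        if left[1][2] != right[1][2]:
            continue  # the two chains must share the grandchild
        used = [left[0][0], left[1][0], right[0][0], right[1][0]]
        if unique_relationships and len(set(used)) != len(used):
            continue  # a relationship would be traversed twice
        rows.append((left[0][1], right[0][1], left[1][2]))
    return rows


walks = common_grandchildren(HAS_CHILD, unique_relationships=False)
trails = common_grandchildren(HAS_CHILD, unique_relationships=True)
print(len(walks), len(trails))
```

In this fragment both grandparents reach the grandchild through the same `r3` relationship, so every candidate match reuses it and the uniqueness-constrained variant returns nothing, while the unconstrained variant returns all combinations, including those where both grandparent variables bind the same person.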
267
279
280
+
.Query 4.1.2: Current semantics: breaking the pattern to prevent the effects of Cyphermorphism
281
+
[source,cypher]
282
+
----
283
+
MATCH (grandparent1:Person)-[:HAS_CHILD]->()-[:HAS_CHILD]->(grandchild)
284
+
MATCH (grandparent2:Person)-[:HAS_CHILD]->()-[:HAS_CHILD]->(grandchild)
.Query 4.1.3: New semantics: achieving homomorphism by default
308
+
309
+
The method to achieve homomorphism as exemplified by Query 4.1.2 is undoubtedly effective, but is potentially unintuitive and contrived.
310
+
In contrast, Query 4.1.3 uses the new default semantics for simple relationship patterns, and achieves the desired result without the need to consciously manipulate the structure of the matching clause.
268
311
269
312
[source,cypher]
270
313
----
271
-
MATCH [ALL WALKS] (grandparent1:Person)-[:HAS_CHILD]->()-[:HAS_CHILD]->(grandchild),
314
+
MATCH ALL WALKS (grandparent1:Person)-[:HAS_CHILD]->()-[:HAS_CHILD]->(grandchild),
We could omit `ALL WALKS` from Query 4.1.3, as these are the default pattern match mode and pattern variable class, respectively, for simple relationship patterns.
334
+
335
+
.Query 4.1.4: New semantics: achieving Cyphermorphism by default
336
+
337
+
What happens in the scenarios where the current semantics -- i.e. Cyphermorphism -- are desirable?
338
+
All that is required is to alter the pattern variable class in the `MATCH` clause from `WALKS` to `TRAILS` (or to just add `TRAILS` if no pattern variable class was previously specified).
339
+
340
+
[source,cypher]
288
341
----
342
+
MATCH ALL TRAILS (grandparent1:Person)-[:HAS_CHILD]->()-[:HAS_CHILD]->(grandchild),