Commit 50e7ac3

author
Petra Selmer
committed
Added a detailed example; fix-ups; language changes
1 parent 40713bb commit 50e7ac3

2 files changed: 75 additions and 15 deletions
cip/1.accepted/CIP2017-01-18-configurable-pattern-matching-semantics.adoc

Lines changed: 75 additions & 15 deletions
@@ -11,7 +11,7 @@ This proposal is a response to link:https://github.com/opencypher/openCypher/iss
 == Motivation

 Currently Cypher uses pattern matching semantics that treats _all_ patterns that occur in a `MATCH` clause as a unit (called a *uniqueness scope*) and only considers pattern instances that bind different relationships to each fixed length relationship pattern variable and to each element of a variable length relationship pattern variable.
-This has come to be called *cypermorphism* informally and is a form of edge isomorphism that is based on Cypher's notion of uniqueness scope.
+This has come to be called *Cyphermorphism* informally and is a form of edge isomorphism that is based on Cypher's notion of uniqueness scope.

 Cyphermorphism lies at the intersection of returning as many results as possible while still ruling out returning an infinite number of paths when matching graphs that contain cycles.

@@ -117,25 +117,25 @@ While Cypher allows omitting path, node, and relationship variables in a pattern

 This CIP proposes to replace the notion of *uniqueness scope* and *Cyphermorphism* and all associated rules with new, configurable pattern matching semantics.

-Likewise to what is proposed by *CIP2017-02-06 Path Pattern Queries*, support for binding relationship list variables in variable length patterns will be deprecated.
+As proposed in *CIP2017-02-06 Path Pattern Queries*, support for binding relationship list variables in variable length patterns will be deprecated.

 This CIP proposes to deprecate the existing syntax for both `shortestPath` and `allShortestPaths` matching of Cypher.

 === Basic pattern matching semantics

 Each pattern consists of one or more top-level pattern parts that are given in a comma separated list.

-.Query 3.2.1
+.Query 3.3.1
 [source,cypher]
 ----
 MATCH (a)-->(b), (c)<--(d)
 RETURN *
 ----

 The solution (set of successful matches) of a pattern is the cross product over the solutions of all its top-level pattern parts.
-Thus, if we ignore uniqueness, Query 3.2.1 is semantically identical to Query 3.2.2.
+Thus, if we ignore uniqueness, Query 3.3.1 is semantically identical to Query 3.3.2.

-.Query 3.2.2
+.Query 3.3.2
 [source,cypher]
 ----
 MATCH (a)-->(b)
@@ -146,16 +146,16 @@ RETURN *
 ----

 Binding several nodes or relationships in a pattern to the same variable describes an implicit join.
-Thus, queries 3.2.3 and 3.2.4 are semantically identical.
+Thus, queries 3.3.3 and 3.3.4 are semantically equivalent.

-.Query 3.2.3
+.Query 3.3.3
 [source,cypher]
 ----
 MATCH (a)-->()<--(a)-->(b)
 RETURN a
 ----

-.Query 3.2.4
+.Query 3.3.4
 [source,cypher]
 ----
 MATCH (n1)-->(n2), (n3)<--(n4), (n5)-->(b)
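
The two ideas touched by the hunks above, a pattern as the cross product of its top-level parts and a repeated variable as an implicit join, can be sketched in Python. This is a minimal illustration that ignores uniqueness, as the text does. The edge list, the function names, and the explicit equality conditions are assumptions made for this example; they are not taken from the CIP, and the remainder of Query 3.3.4 is elided in this diff.

```python
from itertools import product

# Hypothetical directed graph as (source, target) pairs; not part of the CIP.
# The sketch assumes there are no parallel edges between the same two nodes.
EDGES = [("x", "y"), ("x", "z"), ("w", "y")]


def match_repeated_variable(edges):
    """Match (a)-->()<--(a)-->(b) directly, ignoring uniqueness."""
    outgoing = {}
    for source, target in edges:
        outgoing.setdefault(source, []).append(target)
    rows = []
    for a, targets in outgoing.items():
        for _middle in targets:  # (a)-->()<--(a): both hops meet in one node
            for b in targets:    # (a)-->(b)
                rows.append((a, b))
    return sorted(rows)


def match_cross_product_then_join(edges):
    """Match three independent parts, then apply assumed join conditions."""
    rows = []
    # Cross product over the solutions of (n1)-->(n2), (n3)<--(n4), (n5)-->(b).
    # For the middle part the edge runs from n4 to n3.
    for (n1, n2), (n4, n3), (n5, b) in product(edges, repeat=3):
        # Assumed equalities that make the repeated variable explicit.
        if n1 == n4 == n5 and n2 == n3:
            rows.append((n1, b))
    return sorted(rows)


direct = match_repeated_variable(EDGES)
joined = match_cross_product_then_join(EDGES)
print(direct == joined, len(direct))
```

Both functions produce the same multiset of `(a, b)` rows on this graph, which is the sense in which a repeated variable behaves like a join over independently matched pattern parts.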
@@ -209,7 +209,7 @@ The following additional pattern variable classes are proposed to accommodate ex
 * `CYCLE` is a synonym for `CLOSED PATH`

 Additionally, this CIP proposes to allow for plural forms of all pattern variable classes, which will be synonymous with their respective singular forms.
-The plural forms are as follows: `WALKS`, `TRAILS`, `PATHS`, `CIRCUITS`, `CYCLES`.
+The plural forms are as follows: `WALKS`, `TRAILS`, `PATHS`, `CIRCUITS`, and `CYCLES`.
 The main motivation is to aid readability when used in conjunction with different pattern match modes (see <<modes>>).

 [[modes]]
@@ -254,38 +254,98 @@ The following examples demonstrates various ways in which the newly proposed con

 The following graph is used:

-image::DataGraph.jpg[Graph,600,600]
+image::DataGraph.jpg[Graph,800,700]

 === Homomorphic matching using walks

 We'll illustrate the benefits of the new homomorphic pattern matching by means of a series of queries.

 Assume we wish to know which two people have grandchildren in common, as well as the names of the grandchildren.
-Intuitively, we can see that the only two people in the graph have grandchildren in common, namely _Michael Redgrave_ and _Rachel Kempson_, and that there are two grandchildren, _Natasha Richardson_ and _Jemma Redgrave_.
+Intuitively, we can see that the only two people in the graph who have grandchildren in common are _Michael Redgrave_ and _Rachel Kempson_, and that there are two grandchildren, _Natasha Richardson_ and _Jemma Redgrave_.
+Although _Roy Redgrave_ is a grandfather, there is no one else in the graph who has grandchildren in common with him.

+.Query 4.1.1: Current semantics: single patterns
+[source,cypher]
+----
+MATCH (grandparent1:Person)-[:HAS_CHILD]->()-[:HAS_CHILD]->(grandchild),
+(grandparent2:Person)-[:HAS_CHILD]->()-[:HAS_CHILD]->(grandchild)
+WHERE grandparent1 <> grandparent2
+RETURN grandparent1.name, grandparent2.name, grandchild.name
+----

+Query 4.1.1 comprises two comma-separated matches which form a single pattern, `p~1~`.
+As the query runs under the current semantics, relationship uniqueness (aka Cyphermorphism) is applied to `p~1~`.
+This means that the `:HAS_CHILD` relationship given by `()-[:HAS_CHILD]->(grandchild)` is only traversed once, which results in no rows being returned.

+.Query 4.1.2: Current semantics: breaking the pattern to prevent the effects of Cyphermorphism
+[source,cypher]
+----
+MATCH (grandparent1:Person)-[:HAS_CHILD]->()-[:HAS_CHILD]->(grandchild)
+MATCH (grandparent2:Person)-[:HAS_CHILD]->()-[:HAS_CHILD]->(grandchild)
+WHERE grandparent1 <> grandparent2
+RETURN grandparent1.name, grandparent2.name, grandchild.name
+----
+
+By splitting out the matches using a separate `MATCH` clause, Query 4.1.2 in effect considers two patterns, `p~2~` and `p~3~`.
+Cyphermorphism is applied to `p~2~` and `p~3~` separately, which resolves the limitation inherent in Query 4.1.1.
+
+Running Query 4.1.2 returns the following results:
+
+[queryresult]
+----
++------------------------------------------------------------+
+| grandparent1.name | grandparent2.name | grandchild.name    |
++------------------------------------------------------------+
+| Michael Redgrave  | Rachel Kempson    | Natasha Richardson |
+| Michael Redgrave  | Rachel Kempson    | Jemma Redgrave     |
+| Rachel Kempson    | Michael Redgrave  | Natasha Richardson |
+| Rachel Kempson    | Michael Redgrave  | Jemma Redgrave     |
++------------------------------------------------------------+
+4 rows
+----
+
+.Query 4.1.3: New semantics: achieving homomorphism by default
+
+The method to achieve homomorphism as exemplified by Query 4.1.2 is undoubtedly effective, but is potentially unintuitive and contrived.
+In contrast, Query 4.1.3 uses the new default semantics for simple relationship patterns, and achieves the desired result without the need to consciously manipulate the structure of the matching clause.

 [source,cypher]
 ----
-MATCH [ALL WALKS] (grandparent1:Person)-[:HAS_CHILD]->()-[:HAS_CHILD]->(grandchild),
+MATCH ALL WALKS (grandparent1:Person)-[:HAS_CHILD]->()-[:HAS_CHILD]->(grandchild),
 (grandparent2:Person)-[:HAS_CHILD]->()-[:HAS_CHILD]->(grandchild)
 WHERE grandparent1 <> grandparent2
 RETURN grandparent1.name, grandparent2.name, grandchild.name
 ----

-Equivalent to `MATCH ALL WALKS`
-
 [queryresult]
 ----
 +------------------------------------------------------------+
 | grandparent1.name | grandparent2.name | grandchild.name    |
 +------------------------------------------------------------+
 | Michael Redgrave  | Rachel Kempson    | Natasha Richardson |
 | Michael Redgrave  | Rachel Kempson    | Jemma Redgrave     |
+| Rachel Kempson    | Michael Redgrave  | Natasha Richardson |
+| Rachel Kempson    | Michael Redgrave  | Jemma Redgrave     |
 +------------------------------------------------------------+
-2 rows
+4 rows
+----
+
+We could omit `ALL WALKS` from Query 4.1.3, as these are the default pattern match mode and pattern variable class, respectively, for simple relationship patterns.
+
+.Query 4.1.4: New semantics: achieving Cyphermorphism by default
+
+What happens in the scenarios where the current semantics -- i.e. Cyphermorphism -- are desirable?
+All that is required is to alter the pattern variable class in the `MATCH` clause from `WALKS` to `TRAILS` (or to just add `TRAILS` if no pattern variable class was previously specified).
+
+[source,cypher]
 ----
+MATCH ALL TRAILS (grandparent1:Person)-[:HAS_CHILD]->()-[:HAS_CHILD]->(grandchild),
+(grandparent2:Person)-[:HAS_CHILD]->()-[:HAS_CHILD]->(grandchild)
+WHERE grandparent1 <> grandparent2
+RETURN grandparent1.name, grandparent2.name, grandchild.name
+----
+
+Running Query 4.1.4 will return no results.

 === Matching shortest paths
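
To make the difference between Query 4.1.3 (`ALL WALKS`) and Query 4.1.4 (`ALL TRAILS`) concrete, the following Python sketch models both modes over a small edge list. It is only an illustration under stated assumptions: `DataGraph.jpg` is not reproduced here, so the `HAS_CHILD` edges below (including the two intermediate parents) are a hypothetical graph chosen to be consistent with the results quoted in the diff, and the function names are invented for this example.

```python
from itertools import product

# Hypothetical HAS_CHILD edges as (parent, child) pairs. This graph is an
# assumption; it is not read from the CIP's DataGraph.jpg. Edges are compared
# by value, so the sketch assumes there are no parallel relationships.
HAS_CHILD = [
    ("Roy Redgrave", "Michael Redgrave"),
    ("Michael Redgrave", "Vanessa Redgrave"),
    ("Rachel Kempson", "Vanessa Redgrave"),
    ("Michael Redgrave", "Corin Redgrave"),
    ("Rachel Kempson", "Corin Redgrave"),
    ("Vanessa Redgrave", "Natasha Richardson"),
    ("Corin Redgrave", "Jemma Redgrave"),
]


def two_hop_paths(edges):
    """All (edge1, edge2) pairs forming grandparent -> parent -> grandchild."""
    return [(e1, e2) for e1, e2 in product(edges, repeat=2) if e1[1] == e2[0]]


def common_grandchildren(edges, distinct_relationships):
    """Rows of (grandparent1, grandparent2, grandchild).

    distinct_relationships=False models walk (homomorphic) matching.
    distinct_relationships=True models trail matching over the whole
    pattern, i.e. no relationship may be bound twice.
    """
    rows = []
    paths = two_hop_paths(edges)
    for p1, p2 in product(paths, repeat=2):
        grandparent1, grandparent2 = p1[0][0], p2[0][0]
        grandchild1, grandchild2 = p1[1][1], p2[1][1]
        if grandchild1 != grandchild2 or grandparent1 == grandparent2:
            continue  # shared grandchild and WHERE grandparent1 <> grandparent2
        used = [*p1, *p2]
        if distinct_relationships and len(set(used)) != len(used):
            continue  # a relationship would be traversed twice
        rows.append((grandparent1, grandparent2, grandchild1))
    return rows


walk_rows = common_grandchildren(HAS_CHILD, distinct_relationships=False)
trail_rows = common_grandchildren(HAS_CHILD, distinct_relationships=True)
print(len(walk_rows), len(trail_rows))
```

On this assumed graph each grandchild has a single incoming `HAS_CHILD` relationship, so both halves of the pattern must reuse it. Walk matching allows that reuse, while trail matching rejects every candidate row.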

cip/1.accepted/DataGraph.jpg

14.8 KB