
OPTIONAL MATCH issues + Continued Development and Support \ Community for Morpheus (Spark 3 support, bugfixes) #947

@MarcianoAvihay


Hi,
I have encountered what I believe is a bug in the OPTIONAL MATCH implementation, and I am hoping to get some guidance on how I could help fix it (and submit a pull request so that everyone gets the fix).

The issue is this:

In multiple tests I consistently get the same result: when an OPTIONAL MATCH expands a relationship type that does not exist, and a later OPTIONAL MATCH expands from the same variable to something that does exist, the later match never returns its result and yields only NULL.

Reproducible example:

Test graph:

val test = morpheus.cypher(
"""
CONSTRUCT
CREATE (p1:Person {name: "Alice"})
CREATE (p2:Person {name: "Bob"})
CREATE (p3:Person {name: "Eve"})
CREATE (p4:Person {name: "Paul"})
CREATE (p1)-[:KNOWS]->(p3)
CREATE (p1)-[:KNOWS2]->(p2)
CREATE (p1)-[:KNOWS3]->(p3)
CREATE (p1)-[:KNOWS4]->(p4)
CREATE (p1)-[:KNOWS5]->(p2)

return GRAPH
""".stripMargin
).graph

Query:

val testres = test.cypher(
"""
match (p1:Person )
optional match (p1)-[:KNOWS]->(p2)
optional match (p1)-[:KNOWS15]->(p3)
optional match (p1)-[:KNOWS4]->(p4)
return p1.name, p2.name,p3.name,p4.name
""".stripMargin)

This yields the following result:

+-------+-------+-------+-------+
|p1_name|p2_name|p3_name|p4_name|
+-------+-------+-------+-------+
|    Bob|   null|   null|   null|
|    Eve|   null|   null|   null|
|   Paul|   null|   null|   null|
|  Alice|    Eve|   null|   null|
+-------+-------+-------+-------+

However, if we reorder the query so that the non-existent expansion comes last:

val testres = test.cypher(
"""
match (p1:Person )
optional match (p1)-[:KNOWS]->(p2)
optional match (p1)-[:KNOWS4]->(p4)
optional match (p1)-[:KNOWS15]->(p3)
return p1.name, p2.name,p3.name,p4.name
""".stripMargin)

we then get the correct result:

+-------+-------+-------+-------+
|p1_name|p2_name|p3_name|p4_name|
+-------+-------+-------+-------+
|  Alice|    Eve|   null|   Paul|
|    Bob|   null|   null|   null|
|    Eve|   null|   null|   null|
|   Paul|   null|   null|   null|
+-------+-------+-------+-------+

The first result (p4.name is null for Alice) is not what I expect, since Alice is also connected to Paul via KNOWS4.
The same query and graph setup in Neo4j yields the following, regardless of the order of the OPTIONAL MATCH clauses:

p1.name p2.name p3.name p4.name
"Alice" "Eve" null "Paul"
"Bob" null null null
"Eve" null null null
"Paul" null null null
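
To make the expected semantics concrete, here is a minimal plain-Scala sketch that models each OPTIONAL MATCH step as a left outer join over an in-memory edge list. It does not use any Morpheus API and is not how Morpheus plans the query; the names `OptionalMatchModel`, `optionalExpand` and `run` are hypothetical and exist only for this illustration. The point it tries to show is that, under left-outer-join semantics, the order of the optional expansions should not change the result.

```scala
object OptionalMatchModel {
  // In-memory copy of the test graph: (source, relationship type, target).
  val people: Seq[String] = Seq("Alice", "Bob", "Eve", "Paul")
  val edges: Seq[(String, String, String)] = Seq(
    ("Alice", "KNOWS", "Eve"),
    ("Alice", "KNOWS2", "Bob"),
    ("Alice", "KNOWS3", "Eve"),
    ("Alice", "KNOWS4", "Paul"),
    ("Alice", "KNOWS5", "Bob")
  )

  // A result row maps a variable name to a bound name, or None for null.
  type Row = Map[String, Option[String]]

  // One `OPTIONAL MATCH (p1)-[:relType]->(target)` step as a left outer join:
  // every input row is kept, and rows without a match bind `target` to None.
  def optionalExpand(rows: Seq[Row], relType: String, target: String): Seq[Row] =
    rows.flatMap { row =>
      val matches = edges.collect {
        case (src, t, dst) if t == relType && row("p1").contains(src) => dst
      }
      if (matches.isEmpty) Seq(row + (target -> None))
      else matches.map(dst => row + (target -> Some(dst)))
    }

  // Start from `MATCH (p1:Person)` and apply the optional expansions in order.
  // Each step is a pair (relationship type, target variable).
  def run(steps: Seq[(String, String)]): Set[Row] = {
    val start: Seq[Row] = people.map(p => Map("p1" -> Option(p)))
    steps.foldLeft(start) { case (rows, (rel, v)) => optionalExpand(rows, rel, v) }.toSet
  }

  def main(args: Array[String]): Unit = {
    // Non-existent KNOWS15 in the middle vs. at the end, as in the two queries above.
    val middle = run(Seq("KNOWS" -> "p2", "KNOWS15" -> "p3", "KNOWS4" -> "p4"))
    val last   = run(Seq("KNOWS" -> "p2", "KNOWS4" -> "p4", "KNOWS15" -> "p3"))
    assert(middle == last, "clause order should not affect the result")
    middle.foreach(println)
  }
}
```

In this model both orderings bind p4 to Paul for Alice, which matches the Neo4j output above and differs from what Morpheus returns when the non-existent expansion is not last.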

In OptionalMatchTests.scala in Morpheus I found the following test, which does not cover the case above, and I found nothing else that does (no test in Morpheus or the TCK covers a non-existent pattern followed by an existing one):

val g = initGraph(
"""
|CREATE (:DoesExist {property: 42})
|CREATE (:DoesExist {property: 43})
|CREATE (:DoesExist {property: 44})
""".stripMargin)

val res = g.cypher(
"""
|OPTIONAL MATCH (f:DoesExist)
|OPTIONAL MATCH (n:DoesNotExist)
|RETURN collect(DISTINCT n.property) AS a, collect(DISTINCT f.property) AS b
""".stripMargin)

Could any of the devs please share their thoughts on how hard it would be to get the expected behaviour, and where in the codebase I should start looking in order to fix it?

Does it require a complex modification to the relational and logical planners? If some small tweak comes to mind, it would help me greatly (I am trying to use Morpheus to automate a combination of left outer joins and inner joins over very large datasets).

Thanks very much!
