88 changes: 88 additions & 0 deletions .metis/backlog/features/GQLITE-T-0138.md
@@ -0,0 +1,88 @@
---
id: pattern-predicates-in-where-clause
level: task
title: "Pattern predicates in WHERE clause (bare relationship patterns as boolean expressions)"
short_code: "GQLITE-T-0138"
created_at: 2026-03-20T15:40:21.527463+00:00
updated_at: 2026-03-20T23:17:50.143661+00:00
parent:
blocked_by: []
archived: false

tags:
- "#task"
- "#feature"
- "#phase/active"


exit_criteria_met: false
initiative_id: NULL
---

# Pattern predicates in WHERE clause (bare relationship patterns as boolean expressions)

## Objective

Support bare relationship patterns as boolean expressions in WHERE clauses, per the openCypher 9 specification (`<PatternPredicate> ::= <RelationshipsPattern>`). Currently GraphQLite requires the explicit `EXISTS(pattern)` form; the spec also allows the shorthand where a relationship pattern in boolean context is implicitly coerced to an existence check.

## Background

**Reported query:**
```cypher
MATCH (n {entity_type: 'Note'})
WHERE NOT (n)-[:BELONGS_TO]->() AND NOT (n)-[:RELATES_TO]->()
RETURN n.id, n.title
```
Fails with: `syntax error, unexpected ':'` at col 48 (the `:` in `-[:BELONGS_TO]->`).

**Workaround:** Wrap in `EXISTS()`:
```cypher
WHERE NOT EXISTS((n)-[:BELONGS_TO]->()) AND NOT EXISTS((n)-[:RELATES_TO]->())
```

**Spec basis:** openCypher 9 defines `PatternPredicate` — a `RelationshipsPattern` in a boolean site is semantically equivalent to `EXISTS { MATCH pattern RETURN 1 }`. Neo4j has supported this since at least v2.x. The pattern must contain at least one relationship to be valid.
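
The "at least one relationship" rule can be illustrated with a small standalone check. This is only a sketch: `elem_kind` and `is_relationships_pattern` are hypothetical names invented for illustration, and a grammar would normally enforce the same rule structurally through its productions rather than with a runtime check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical element kinds; the real parser uses its own AST node types. */
typedef enum { ELEM_NODE, ELEM_REL } elem_kind;

/*
 * Returns true when the element sequence is a relationship pattern:
 * node (rel node)+, i.e. it starts and ends with a node, alternates
 * strictly, and contains at least one relationship.
 */
static bool is_relationships_pattern(const elem_kind *elems, size_t count)
{
    /* Shortest valid form is node-rel-node; valid lengths are always odd. */
    if (elems == NULL || count < 3 || count % 2 == 0) {
        return false;
    }
    for (size_t i = 0; i < count; i++) {
        elem_kind expected = (i % 2 == 0) ? ELEM_NODE : ELEM_REL;
        if (elems[i] != expected) {
            return false;
        }
    }
    return true;
}
```

Under this check a bare `(n)` (one node element) is rejected, while `(n)-[:REL]->()` (node, relationship, node) is accepted.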

## Acceptance Criteria

- [x] `WHERE (n)-[:REL]->()` parses and evaluates as existence check
- [x] `WHERE NOT (n)-[:REL]->()` parses and evaluates as non-existence check
- [x] Works with all relationship directions: `->`, `<-`, `-`
- [x] Works with typed and untyped relationships
- [x] Works combined with AND/OR/XOR and other predicates
- [x] Bare node pattern `(n)` without a relationship is NOT accepted as a pattern predicate
- [x] Unit tests and functional SQL tests added
- [x] Existing `EXISTS(pattern)` behavior unaffected

## Implementation Notes

### Technical Approach

**Grammar (`cypher_gram.y`):** Add a `pattern_predicate` production to the `expr` rule that matches a relationship pattern (node-rel-node) in expression context. The parser currently only reaches relationship patterns via the `simple_path` rule inside `MATCH`. A new rule would recognize `node_pattern rel_pattern node_pattern` within `expr` and produce an AST node (e.g., `AST_NODE_PATTERN_PREDICATE`).

**Transform:** In `transform_expression()`, handle `AST_NODE_PATTERN_PREDICATE` by reusing the same SQL generation path as `EXISTS(pattern)`, emitting an `EXISTS (SELECT 1 FROM ...)` subquery.

**Key concern:** GLR conflicts. Adding relationship patterns to `expr` will likely introduce shift/reduce (S/R) or reduce/reduce (R/R) conflicts, since a parenthesized expression `(expr)` and a node pattern `(node_pattern)` begin with the same tokens. Careful precedence/disambiguation will be needed. Consider restricting the rule to require at least one `-[...]->` component to disambiguate.

### Dependencies
- None — the EXISTS(pattern) transform already exists and can be reused.

### Risk Considerations
- Grammar conflicts are the main risk. The GLR parser can handle ambiguity, but the declared conflict counts (`%expect`) may need updating, and the change needs careful testing.

## Status Updates

### 2026-03-20: Implementation complete

**Files modified:**
- `src/backend/parser/cypher_gram.y` — Added two `expr` productions for pattern predicates (3-element and 5-element paths). Updated `%expect` from 4 to 9 S/R conflicts (all GLR-safe ambiguities from `(IDENTIFIER)` being parseable as both `(expr)` and `node_pattern`). R/R conflicts unchanged at 3.
- `src/backend/transform/transform_expr_predicate.c` — Fixed pre-existing bug where `EXISTS(pattern)` always assumed outgoing direction. Now respects `left_arrow`/`right_arrow` flags for incoming (`<-`) and undirected (`-`) relationship patterns.
- `src/generated/cypher_gram.tab.{c,h}` — Regenerated.
- `tests/functional/14_pattern_predicates.sql` — 17 test cases covering all directions, typed/untyped, NOT/AND/OR/XOR combinations, and equivalence with EXISTS().
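
The direction fix listed above for `transform_expr_predicate.c` can be sketched as a standalone function. This is a minimal illustration under stated assumptions, not the project's code: `build_join_condition` and its buffer-based interface are hypothetical, and only the `eN.source_id`/`eN.target_id` column naming and the three direction cases are taken from the change itself.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes the join condition for relationship alias
 * e<rel_index> between the node written on the left of the pattern and the
 * node written on the right. Returns the snprintf result.
 */
static int build_join_condition(char *buf, size_t size, int rel_index,
                                const char *left, const char *right,
                                bool left_arrow, bool right_arrow)
{
    if (left_arrow && !right_arrow) {
        /* (left)<-[]-(right): the edge is stored as right -> left. */
        return snprintf(buf, size,
                        "e%d.source_id = %s.id AND e%d.target_id = %s.id",
                        rel_index, right, rel_index, left);
    }
    if (!left_arrow && !right_arrow) {
        /* (left)-[]-(right): accept the edge in either orientation. */
        return snprintf(buf, size,
                        "((e%d.source_id = %s.id AND e%d.target_id = %s.id) OR "
                        "(e%d.source_id = %s.id AND e%d.target_id = %s.id))",
                        rel_index, left, rel_index, right,
                        rel_index, right, rel_index, left);
    }
    /* (left)-[]->(right): the edge is stored as left -> right (default). */
    return snprintf(buf, size,
                    "e%d.source_id = %s.id AND e%d.target_id = %s.id",
                    rel_index, left, rel_index, right);
}
```

For example, with aliases `n0` and `n1` and `rel_index` 0, the incoming case `(n0)<-[]-(n1)` yields `e0.source_id = n1.id AND e0.target_id = n0.id`, which is the outgoing condition with the two node aliases swapped.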

**Approach:** Pattern predicates reuse the existing `make_exists_pattern_expr()` AST constructor and `transform_exists_expression()` SQL generation. No new AST node types needed — a bare pattern predicate is desugared to an EXISTS expression at parse time.

**Test results:** 5300 unit assertions pass (0 failures), all functional tests pass.
2 changes: 1 addition & 1 deletion bindings/python/src/graphqlite/__init__.py
@@ -8,7 +8,7 @@
from .utils import escape_string, sanitize_rel_type, CYPHER_RESERVED
from ._platform import get_loadable_path

-__version__ = "0.3.9"
+__version__ = "0.3.10"
__all__ = [
"BulkInsertResult",
"Connection", "connect", "wrap", "load", "loadable_path",
2 changes: 1 addition & 1 deletion bindings/rust/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "graphqlite"
-version = "0.3.9"
+version = "0.3.10"
edition = "2021"
description = "SQLite extension for graph queries using Cypher"
license = "MIT"
39 changes: 36 additions & 3 deletions src/backend/parser/cypher_gram.y
@@ -29,10 +29,11 @@ int cypher_yylex(CYPHER_YYSTYPE *yylval, CYPHER_YYLTYPE *yylloc, cypher_parser_c
/*
* Expected grammar conflicts - handled correctly by GLR parsing.
* These arise from pattern comprehension syntax [(...)-[r]->(...) | expr]
- * where the parser can't immediately distinguish a node pattern from
- * a parenthesized expression until it sees more context.
+ * and pattern predicates (n)-[:REL]->() in boolean context, where the
+ * parser can't immediately distinguish a node pattern from a parenthesized
+ * expression until it sees more context (e.g., a following rel_pattern).
*/
-%expect 4
+%expect 9
%expect-rr 3 /* One for IDENTIFIER, one for BQIDENT, one for END_P in variable_opt */

%union {
@@ -1056,6 +1057,38 @@ expr:
| expr IS NULL_P { $$ = (ast_node*)make_null_check($1, false, @2.first_line); }
| expr IS NOT NULL_P { $$ = (ast_node*)make_null_check($1, true, @2.first_line); }
| '(' expr ')' { $$ = $2; }
+/* Pattern predicate: bare relationship pattern as boolean expression.
+ * Per openCypher 9 spec, a RelationshipsPattern in boolean context
+ * is an implicit existence check: (n)-[:REL]->() ≡ EXISTS((n)-[:REL]->())
+ * Requires at least one relationship to distinguish from parenthesized expr.
+ */
+| node_pattern rel_pattern node_pattern
+{
+/* Build a path from the pattern elements */
+ast_list *elements = ast_list_create();
+ast_list_append(elements, (ast_node*)$1);
+ast_list_append(elements, (ast_node*)$2);
+ast_list_append(elements, (ast_node*)$3);
+cypher_path *path = make_path(elements);
+/* Wrap in pattern list and create EXISTS expression */
+ast_list *pattern_list = ast_list_create();
+ast_list_append(pattern_list, (ast_node*)path);
+$$ = (ast_node*)make_exists_pattern_expr(pattern_list, @1.first_line);
+}
+| node_pattern rel_pattern node_pattern rel_pattern node_pattern
+{
+/* Chained pattern: (a)-[r1]->(b)-[r2]->(c) */
+ast_list *elements = ast_list_create();
+ast_list_append(elements, (ast_node*)$1);
+ast_list_append(elements, (ast_node*)$2);
+ast_list_append(elements, (ast_node*)$3);
+ast_list_append(elements, (ast_node*)$4);
+ast_list_append(elements, (ast_node*)$5);
+cypher_path *path = make_path(elements);
+ast_list *pattern_list = ast_list_create();
+ast_list_append(pattern_list, (ast_node*)path);
+$$ = (ast_node*)make_exists_pattern_expr(pattern_list, @1.first_line);
+}
| expr '.' IDENTIFIER
{
$$ = (ast_node*)make_property($1, $3, @3.first_line);
29 changes: 22 additions & 7 deletions src/backend/transform/transform_expr_predicate.c
@@ -121,13 +121,28 @@ int transform_exists_expression(cypher_transform_context *ctx, cypher_exists_exp
append_sql(ctx, " AND ");
}

-/* Join source node with relationship */
-int source_node = i / 2;
-int target_node = source_node + 1;
-
-append_sql(ctx, "e%d.source_id = %s.id AND e%d.target_id = %s.id",
-rel_index, node_aliases[source_node],
-rel_index, node_aliases[target_node]);
+/* Join source node with relationship, respecting direction */
+int left_node = i / 2;
+int right_node = left_node + 1;
+
+if (rel->left_arrow && !rel->right_arrow) {
+/* Incoming: <-[]- means edge goes right_node -> left_node */
+append_sql(ctx, "e%d.source_id = %s.id AND e%d.target_id = %s.id",
+rel_index, node_aliases[right_node],
+rel_index, node_aliases[left_node]);
+} else if (!rel->left_arrow && !rel->right_arrow) {
+/* Undirected: -[]- match either direction */
+append_sql(ctx, "((e%d.source_id = %s.id AND e%d.target_id = %s.id) OR (e%d.source_id = %s.id AND e%d.target_id = %s.id))",
+rel_index, node_aliases[left_node],
+rel_index, node_aliases[right_node],
+rel_index, node_aliases[right_node],
+rel_index, node_aliases[left_node]);
+} else {
+/* Outgoing: -[]-> (default) */
+append_sql(ctx, "e%d.source_id = %s.id AND e%d.target_id = %s.id",
+rel_index, node_aliases[left_node],
+rel_index, node_aliases[right_node]);
+}

/* Add relationship type constraint if specified */
if (rel->type) {