Commit a63b631

Multi-graph support with GraphManager and FROM clause (#6)
Add multi-graph support with FROM clause and GraphManager

Adds the ability to query across multiple graph databases using SQLite's ATTACH mechanism and Cypher's FROM clause syntax.

Core Features:
- FROM clause in MATCH queries: `MATCH (n:Person) FROM other_graph RETURN n`
- graph() function for provenance tracking: returns source database name
- GraphManager API for Python and Rust to manage multiple graph databases
- Cross-graph queries via coordinator connection with automatic ATTACH

Implementation:
- Parser: Added FROM clause recognition in MATCH patterns
- AST: New graph_name field on match patterns
- SQL Generation: Table prefixing with database schema names
- Query Dispatch: Schema context propagation through query pipeline

API (Python):
- `graphs(path)` context manager for GraphManager
- `gm.create/open/drop/list()` for graph lifecycle
- `gm.query(cypher, graphs=[...])` for cross-graph Cypher
- `gm.query_sql(sql, graphs=[...])` for raw SQL access

API (Rust):
- `GraphManager::open(path)` with extension auto-discovery
- Same lifecycle methods as Python
- Comprehensive test coverage matching Python test suite

Testing:
- New functional test: 31_multigraph_queries.sql (19 test cases)
- Python tests: test_manager.py with cross-graph query tests
- Rust tests: 80+ new integration tests for feature parity
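
The commit message describes cross-graph queries as a coordinator connection that ATTACHes graph files, with generated SQL prefixing table names by schema. The sketch below illustrates that mechanism with Python's standard `sqlite3` module only. It is not GraphQLite's code: the `nodes` table layout and the helpers `make_graph`, `qualified`, and `query_cross` are assumptions made for illustration, and the literal `graph` column merely imitates what a provenance function such as `graph()` might report.

```python
import os
import sqlite3
import tempfile


def make_graph(path, people):
    """Create a toy graph file with one hypothetical `nodes` table."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, label TEXT, name TEXT)")
    conn.executemany(
        "INSERT INTO nodes (label, name) VALUES ('Person', ?)",
        [(p,) for p in people],
    )
    conn.commit()
    conn.close()


def qualified(table, graph=None):
    """Prefix a table name with an attached schema name (hypothetical helper)."""
    return f'"{graph}".{table}' if graph else table


def query_cross(paths, sql):
    """Attach each graph file to an in-memory coordinator, run sql, then detach."""
    coordinator = sqlite3.connect(":memory:")
    try:
        for name, path in paths.items():
            # The alias is an identifier, so it is quoted; the file path is bound.
            coordinator.execute(f'ATTACH DATABASE ? AS "{name}"', (path,))
        rows = coordinator.execute(sql).fetchall()
        for name in paths:
            coordinator.execute(f'DETACH DATABASE "{name}"')
        return rows
    finally:
        coordinator.close()


with tempfile.TemporaryDirectory() as tmp:
    paths = {
        "social": os.path.join(tmp, "social.db"),
        "work": os.path.join(tmp, "work.db"),
    }
    make_graph(paths["social"], ["Alice", "Bob"])
    make_graph(paths["work"], ["Bob", "Carol"])
    sql = (
        f"SELECT 'social' AS graph, name FROM {qualified('nodes', 'social')} "
        f"UNION ALL SELECT 'work', name FROM {qualified('nodes', 'work')} "
        "ORDER BY graph, name"
    )
    rows = query_cross(paths, sql)
    print(rows)
```

Attaching to an in-memory coordinator keeps each graph file independent: no graph has to be the "main" database, and the set of attached graphs can differ per query.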
1 parent fc84558 commit a63b631

41 files changed, with 4949 additions and 176 deletions.

.github/workflows/docs.yml

Lines changed: 20 additions & 32 deletions
@@ -2,11 +2,7 @@ name: Documentation

 on:
   push:
-    branches: [main]
     tags: ['v*']
-  pull_request:
-    branches: [main]
-    paths: ['docs/**']

 permissions:
   contents: read
@@ -34,21 +30,14 @@ jobs:
       - name: Get version info
         id: version
         run: |
-          if [[ "$GITHUB_REF" == refs/tags/* ]]; then
-            VERSION="${GITHUB_REF#refs/tags/}"
-            echo "version=$VERSION" >> $GITHUB_OUTPUT
-            echo "is_main=false" >> $GITHUB_OUTPUT
-          else
-            echo "version=latest" >> $GITHUB_OUTPUT
-            echo "is_main=true" >> $GITHUB_OUTPUT
-          fi
+          VERSION="${GITHUB_REF#refs/tags/}"
+          echo "version=$VERSION" >> $GITHUB_OUTPUT

       - name: Build current docs
         working-directory: docs
         run: mdbook build

       - name: Fetch existing docs (for versioning)
-        if: github.event_name == 'push'
         run: |
           mkdir -p _site
           # Try to fetch existing gh-pages content
@@ -58,30 +47,32 @@ jobs:
           fi

       - name: Organize versioned docs
-        if: github.event_name == 'push'
         run: |
           VERSION="${{ steps.version.outputs.version }}"

-          # For main branch, always update /latest/
-          # For tags, create versioned directory
+          # Create versioned directory
           rm -rf _site/$VERSION
           mkdir -p _site/$VERSION
           cp -r docs/book/* _site/$VERSION/

-          # Generate versions.json only from main (tags just add their directory)
-          if [[ "${{ steps.version.outputs.is_main }}" == "true" ]]; then
-            cd _site
-            {
-              echo "["
-              echo " \"latest\""
-              for v in $(ls -d v[0-9]* 2>/dev/null | sort -Vr | head -10); do
-                echo " ,\"$v\""
-              done
-              echo "]"
-            } > versions.json
+          # Update latest to point to this version
+          rm -rf _site/latest
+          mkdir -p _site/latest
+          cp -r docs/book/* _site/latest/
+
+          # Update versions.json
+          cd _site
+          {
+            echo "["
+            echo " \"latest\""
+            for v in $(ls -d v[0-9]* 2>/dev/null | sort -Vr | head -10); do
+              echo " ,\"$v\""
+            done
+            echo "]"
+          } > versions.json

-            # Create index redirect to latest
-            cat > index.html << 'EOF'
+          # Create index redirect to latest
+          cat > index.html << 'EOF'
           <!DOCTYPE html>
           <html>
           <head>
@@ -93,16 +84,13 @@ jobs:
           </body>
           </html>
           EOF
-          fi

       - name: Upload artifact
-        if: github.event_name == 'push'
         uses: actions/upload-pages-artifact@v3
         with:
           path: _site

   deploy:
-    if: github.event_name == 'push'
     needs: build
     runs-on: ubuntu-latest
     environment:
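
The `versions.json` step in the workflow diff above lists `latest` first, then up to ten `v*` directories in descending version order. As a rough illustration of that ordering, here is a Python sketch. It is not part of the workflow: `build_versions` is a hypothetical helper, and its numeric comparison only approximates `sort -Vr` for plain `vMAJOR.MINOR.PATCH` names (pre-release suffixes are not handled).

```python
import json
import re


def build_versions(dir_names, limit=10):
    """Return ["latest", newest, ..., oldest] for names like v1.2.3."""

    def key(name):
        # Compare numeric components as integers so v0.10.0 sorts after v0.9.0.
        return [int(n) for n in re.findall(r"\d+", name)]

    # Mirror the shell glob v[0-9]*: a "v" followed by at least one digit.
    versions = [d for d in dir_names if re.match(r"v\d", d)]
    versions.sort(key=key, reverse=True)
    return ["latest"] + versions[:limit]


dirs = ["latest", "v0.9.0", "v0.10.0", "v0.2.1", "assets"]
versions = build_versions(dirs)
print(json.dumps(versions))
```

A plain string sort would place `v0.9.0` ahead of `v0.10.0`, which is why both the shell pipeline and this sketch sort by version rather than lexically.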

.metis/strategies/NULL/initiatives/GQLITE-I-0022/initiative.md

Lines changed: 87 additions & 37 deletions
@@ -4,14 +4,14 @@ level: initiative
 title: "Multi-Graph Support via GraphManager and ATTACH"
 short_code: "GQLITE-I-0022"
 created_at: 2025-12-25T20:16:11.339561+00:00
-updated_at: 2025-12-25T20:16:11.339561+00:00
+updated_at: 2026-01-04T14:34:29.624102+00:00
 parent: GQLITE-V-0001
 blocked_by: []
 archived: false

 tags:
   - "#initiative"
-  - "#phase/discovery"
+  - "#phase/completed"


 exit_criteria_met: false
@@ -68,29 +68,23 @@ Key insight: If you frequently query across "separate" graphs, they're arguably

 ## Use Cases

-### Use Case 1: Multi-Tenant SaaS
-- **Actor**: SaaS platform developer
-- **Scenario**: Each customer gets isolated graph; platform manages lifecycle
-- **Example**: `gm.create("tenant_acme")`, `gm.drop("tenant_acme")`
-- **Outcome**: Complete data isolation, easy onboarding/offboarding
-
-### Use Case 2: Test/Production Separation
-- **Actor**: Developer
-- **Scenario**: Separate graphs for testing without affecting production
-- **Example**: `gm.open("prod")` vs `gm.open("test")`
-- **Outcome**: Safe testing environment
-
-### Use Case 3: Graph Versioning
+### Use Case 1: Graph Versioning
 - **Actor**: Data engineer
-- **Scenario**: Maintain versioned snapshots during migration
-- **Example**: `gm.create("graph_v2")`, migrate, then `gm.drop("graph_v1")`
-- **Outcome**: Safe rollback capability
-
-### Use Case 4: Cross-Graph Analytics
-- **Actor**: Analyst
-- **Scenario**: Correlate entities across domain graphs by shared identifier
-- **Example**: Find users in social graph who purchased in products graph
-- **Outcome**: Ad-hoc cross-domain queries without merging graphs
+- **Scenario**: Maintain versioned snapshots during schema migration or data transformation
+- **Example**: `gm.create("graph_v2")`, migrate data, validate, then `gm.drop("graph_v1")`
+- **Outcome**: Safe rollback capability, ability to compare before/after states
+
+### Use Case 2: Cross-Graph Analytics
+- **Actor**: Analyst or application developer
+- **Scenario**: Correlate entities across separate domain graphs by shared identifier
+- **Example**: Join a knowledge graph with an activity graph via shared entity IDs
+- **Outcome**: Ad-hoc cross-domain queries without merging conceptually separate graphs
+
+### Use Case 3: Domain Isolation in Monolith Applications
+- **Actor**: Application developer
+- **Scenario**: Single application manages multiple independent graph datasets (e.g., a note-taking app with separate graphs per workspace/project)
+- **Example**: `gm.open_or_create("project_alpha")`, `gm.open_or_create("project_beta")`
+- **Outcome**: Clean separation of unrelated data within one application

 ## Architecture **[CONDITIONAL: Technically Complex Initiative]**

@@ -246,22 +240,78 @@ Each graph is a separate `.db` file; use ATTACH for cross-graph.

 ## Implementation Plan

-### Phase 1: GraphManager in Bindings
-- Add `GraphManager` to Python bindings (`manager.py`)
-- Add `GraphManager` to Rust bindings (`manager.rs`)
-- Raw SQL cross-graph via ATTACH (no Cypher changes yet)
-- Tests for file management and ATTACH queries
+### Phase 1: C Extension - FROM Clause Support
+
+**Parser Changes (`cypher_gram.y`, `cypher_scanner.l`):**
+- Add `FROM` token handling in MATCH context (distinguish from LOAD CSV FROM)
+- Grammar rule: `match_clause : MATCH pattern_list FROM identifier ...`
+- Store graph name in `cypher_match` AST node
+
+**AST Changes (`cypher_ast.h`, `cypher_ast.c`):**
+- Add `from_graph` field to `cypher_match` struct
+- Update AST construction/destruction functions
+- Add accessor: `cypher_match_get_from_graph()`
+
+**Transform Layer (`transform_match.c`):**
+- When `from_graph` is set, prefix all table references with graph name
+- Example: `nodes` → `{graph_name}.nodes`, `edges` → `{graph_name}.edges`
+- No connection management - just string prefixing in generated SQL
+
+**Testing:**
+- Unit tests for parser accepting/rejecting FROM syntax
+- Transform tests verifying correct table prefixing
+- Ensure backward compatibility (queries without FROM unchanged)
+
+### Phase 2: Binding Layer - GraphManager

-### Phase 2: Cypher FROM Clause
-- Add `FROM` token handling in parser (avoid conflict with LOAD CSV FROM)
-- Extend `cypher_match` AST with `from_graph` field
-- Transform layer prefixes table names when `from_graph` set
-- Executor manages coordinator connection with ATTACH
+**Python (`graphqlite/manager.py`):**
+```python
+class GraphManager:
+    def __init__(self, base_path: str)
+    def list(self) -> list[str]
+    def create(self, name: str) -> Graph
+    def open(self, name: str) -> Graph
+    def open_or_create(self, name: str) -> Graph
+    def drop(self, name: str) -> None
+    def query_cross(self, cypher: str, graphs: list[str]) -> Result
+```
+- `query_cross` handles ATTACH to coordinator connection
+- Connection pooling for open graphs
+- Context manager support (`with graphs(...) as gm`)
+
+**Rust (`src/manager.rs`):**
+```rust
+pub struct GraphManager { ... }
+impl GraphManager {
+    pub fn new(base_path: impl AsRef<Path>) -> Result<Self>
+    pub fn list(&self) -> Result<Vec<String>>
+    pub fn create(&mut self, name: &str) -> Result<Graph>
+    pub fn open(&mut self, name: &str) -> Result<Graph>
+    pub fn open_or_create(&mut self, name: &str) -> Result<Graph>
+    pub fn drop(&mut self, name: &str) -> Result<()>
+    pub fn query_cross(&self, cypher: &str, graphs: &[&str]) -> Result<QueryResult>
+}
+```
+
+**ATTACH Coordination:**
+- Maintain a "coordinator" in-memory connection for cross-graph queries
+- Dynamically ATTACH requested graphs before executing
+- DETACH after query completes (or pool attached state)
+
+**Testing:**
+- File creation/deletion
+- Connection lifecycle
+- Cross-graph queries via ATTACH
+- Error handling (missing graphs, permission errors)

 ### Phase 3: Documentation & Examples
-- Update README with multi-graph examples
-- Add examples/multi_graph/ tutorials
-- Document cross-graph query patterns and when to use them
+
+- Update README with multi-graph section
+- Add `examples/multi_graph/` with:
+  - Graph versioning workflow
+  - Cross-graph analytics example
+- Document when to use multi-graph vs single graph with better modeling
+- API reference for GraphManager in both Python and Rust docs

 ### Design Decisions
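
The Phase 2 plan above lists lifecycle methods (`list`, `create`, `open`, `open_or_create`, `drop`) over graphs stored as separate `.db` files. The following is a minimal sketch of how such a lifecycle could look, assuming one SQLite file per graph under a base directory. `ToyGraphManager` is a hypothetical stand-in, not the `GraphManager` added by this commit: it returns raw `sqlite3` connections instead of `Graph` objects, and its `nodes` table is an assumed placeholder schema.

```python
import sqlite3
import tempfile
from pathlib import Path


class ToyGraphManager:
    """Hypothetical stand-in: one SQLite file per graph under base_path."""

    def __init__(self, base_path):
        self.base = Path(base_path)
        self.base.mkdir(parents=True, exist_ok=True)

    def _path(self, name):
        # Restrict names to identifiers so they cannot escape base_path.
        if not name.isidentifier():
            raise ValueError(f"invalid graph name: {name!r}")
        return self.base / f"{name}.db"

    def _connect(self, path):
        conn = sqlite3.connect(path)
        # Assumed placeholder schema; writing it ensures the file exists on disk.
        conn.execute("CREATE TABLE IF NOT EXISTS nodes (id INTEGER PRIMARY KEY)")
        conn.commit()
        return conn

    def list(self):
        return sorted(p.stem for p in self.base.glob("*.db"))

    def create(self, name):
        path = self._path(name)
        if path.exists():
            raise FileExistsError(name)
        return self._connect(path)

    def open(self, name):
        path = self._path(name)
        if not path.exists():
            raise FileNotFoundError(name)
        return self._connect(path)

    def open_or_create(self, name):
        return self._connect(self._path(name))

    def drop(self, name):
        # Raises FileNotFoundError if the graph does not exist.
        self._path(name).unlink()


with tempfile.TemporaryDirectory() as tmp:
    gm = ToyGraphManager(tmp)
    gm.create("graph_v1").close()
    gm.open_or_create("graph_v2").close()
    before = gm.list()
    gm.drop("graph_v1")
    after = gm.list()
    print(before, after)
```

The usage at the bottom follows the graph versioning use case from the same document: create the new version alongside the old one, then drop the old file once it is no longer needed.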

Lines changed: 141 additions & 0 deletions
@@ -0,0 +1,141 @@
+---
+id: add-from-token-and-grammar-rule-to
+level: task
+title: "Add FROM token and grammar rule to parser"
+short_code: "GQLITE-T-0078"
+created_at: 2026-01-03T15:38:26.838253+00:00
+updated_at: 2026-01-03T15:55:03.253784+00:00
+parent: GQLITE-I-0022
+blocked_by: []
+archived: false
+
+tags:
+  - "#task"
+  - "#phase/completed"
+
+
+exit_criteria_met: false
+strategy_id: NULL
+initiative_id: GQLITE-I-0022
+---
+
+# Add FROM token and grammar rule to parser
+
+*This template includes sections for various types of tasks. Delete sections that don't apply to your specific use case.*
+
+## Parent Initiative **[CONDITIONAL: Assigned Task]**
+
+[[GQLITE-I-0022]]
+
+## Objective **[REQUIRED]**
+
+{Clear statement of what this task accomplishes}
+
+## Backlog Item Details **[CONDITIONAL: Backlog Item]**
+
+{Delete this section when task is assigned to an initiative}
+
+### Type
+- [ ] Bug - Production issue that needs fixing
+- [ ] Feature - New functionality or enhancement
+- [ ] Tech Debt - Code improvement or refactoring
+- [ ] Chore - Maintenance or setup work
+
+### Priority
+- [ ] P0 - Critical (blocks users/revenue)
+- [ ] P1 - High (important for user experience)
+- [ ] P2 - Medium (nice to have)
+- [ ] P3 - Low (when time permits)
+
+### Impact Assessment **[CONDITIONAL: Bug]**
+- **Affected Users**: {Number/percentage of users affected}
+- **Reproduction Steps**:
+  1. {Step 1}
+  2. {Step 2}
+  3. {Step 3}
+- **Expected vs Actual**: {What should happen vs what happens}
+
+### Business Justification **[CONDITIONAL: Feature]**
+- **User Value**: {Why users need this}
+- **Business Value**: {Impact on metrics/revenue}
+- **Effort Estimate**: {Rough size - S/M/L/XL}
+
+### Technical Debt Impact **[CONDITIONAL: Tech Debt]**
+- **Current Problems**: {What's difficult/slow/buggy now}
+- **Benefits of Fixing**: {What improves after refactoring}
+- **Risk Assessment**: {Risks of not addressing this}
+
+## Acceptance Criteria
+
+## Acceptance Criteria
+
+## Acceptance Criteria **[REQUIRED]**
+
+- [ ] {Specific, testable requirement 1}
+- [ ] {Specific, testable requirement 2}
+- [ ] {Specific, testable requirement 3}
+
+## Test Cases **[CONDITIONAL: Testing Task]**
+
+{Delete unless this is a testing task}
+
+### Test Case 1: {Test Case Name}
+- **Test ID**: TC-001
+- **Preconditions**: {What must be true before testing}
+- **Steps**:
+  1. {Step 1}
+  2. {Step 2}
+  3. {Step 3}
+- **Expected Results**: {What should happen}
+- **Actual Results**: {To be filled during execution}
+- **Status**: {Pass/Fail/Blocked}
+
+### Test Case 2: {Test Case Name}
+- **Test ID**: TC-002
+- **Preconditions**: {What must be true before testing}
+- **Steps**:
+  1. {Step 1}
+  2. {Step 2}
+- **Expected Results**: {What should happen}
+- **Actual Results**: {To be filled during execution}
+- **Status**: {Pass/Fail/Blocked}
+
+## Documentation Sections **[CONDITIONAL: Documentation Task]**
+
+{Delete unless this is a documentation task}
+
+### User Guide Content
+- **Feature Description**: {What this feature does and why it's useful}
+- **Prerequisites**: {What users need before using this feature}
+- **Step-by-Step Instructions**:
+  1. {Step 1 with screenshots/examples}
+  2. {Step 2 with screenshots/examples}
+  3. {Step 3 with screenshots/examples}
+
+### Troubleshooting Guide
+- **Common Issue 1**: {Problem description and solution}
+- **Common Issue 2**: {Problem description and solution}
+- **Error Messages**: {List of error messages and what they mean}
+
+### API Documentation **[CONDITIONAL: API Documentation]**
+- **Endpoint**: {API endpoint description}
+- **Parameters**: {Required and optional parameters}
+- **Example Request**: {Code example}
+- **Example Response**: {Expected response format}
+
+## Implementation Notes **[CONDITIONAL: Technical Task]**
+
+{Keep for technical tasks, delete for non-technical. Technical details, approach, or important considerations}
+
+### Technical Approach
+{How this will be implemented}
+
+### Dependencies
+{Other tasks or systems this depends on}
+
+### Risk Considerations
+{Technical risks and mitigation strategies}
+
+## Status Updates **[REQUIRED]**
+
+*To be added during implementation*
