Verdict: GO. TerminusDB v12 is viable as a graph storage layer for the KOI knowledge graph.
Both Phase 0a (single-instance) and Phase 0b (two-instance federation) passed all tests.
| Metric | Result | Threshold |
|---|---|---|
| Import fidelity | 830/830 entities, 114/114 assertions | Exact match |
| Assertion hash idempotency | Identical on re-compute | Same hashes |
| P95 query latency | 21.9ms | < 100ms |
| Import time | 0.9s (fresh), 9.0s (with re-import) | < 60s |
| RAM | 60.7 MiB | < 512 MiB |
| Schema hash stability | Identical across fresh instances | No false drift |
| # | Test | Result |
|---|---|---|
| 1 | Non-conflicting assertions merge | PASS |
| 2 | Conflicting literals (both preserved) | PASS |
| 3 | New entity on branch | PASS |
| 4 | New assertion linking entities | PASS |
| 5 | SameAs mapping | PASS |
| 6 | Diff/history readability | PASS |
| 7 | Time-travel queries | PASS |
| 8 | Conflict detection query | PASS |
| 9 | Status transition validation | PASS |
| 10 | Schema canonicalization | PASS |
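
Test 9 checks that status changes follow the lifecycle rules. A minimal sketch of such a rule table is shown here. Only the fact that `superseded` and `retracted` are terminal comes from this report; the `proposed` state, the exact edges, and the names `ALLOWED_TRANSITIONS`, `InvalidTransition`, and `validate_transition` are assumptions for illustration.

```python
# Hypothetical transition table. Only "superseded" and "retracted" being
# terminal is taken from the report; the other states/edges are assumed.
ALLOWED_TRANSITIONS = {
    "proposed": {"active", "retracted"},
    "active": {"superseded", "retracted"},
    "superseded": set(),  # terminal: no outbound transitions
    "retracted": set(),   # terminal: no outbound transitions
}


class InvalidTransition(ValueError):
    """Raised when a status change is not permitted."""


def validate_transition(current: str, new: str) -> None:
    """Raise InvalidTransition unless current -> new is an allowed edge."""
    allowed = ALLOWED_TRANSITIONS.get(current)
    if allowed is None:
        raise InvalidTransition(f"unknown status: {current!r}")
    if new not in allowed:
        raise InvalidTransition(f"{current!r} -> {new!r} is not allowed")
```

Keeping the rules in one dictionary makes the terminal states visible at a glance: any state that maps to an empty set cannot be left.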
| Metric | Result |
|---|---|
| Schema parity after clone | Identical hashes |
| Data replication | All entities transferred |
| Divergent edits | Both instances can edit independently |
| Document transfer | Assertion hashes consistent across instances |
| Schema divergence detection | SchemaVersionMismatch raised correctly |
| Combined RAM | 102.8 MiB (two instances) |
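
Schema parity and divergence detection both rest on a hash that does not change with key order or document order. The sketch below shows one way this could work. The names `compute_schema_hash` and `SchemaVersionMismatch` appear in this report, but the bodies here are assumptions (canonical JSON plus SHA-256), and `check_schema` is a hypothetical helper; the real code in `schema.py` may differ.

```python
import hashlib
import json


class SchemaVersionMismatch(RuntimeError):
    """Raised when the local schema hash differs from the expected one."""


def compute_schema_hash(schema_docs):
    """Hash a list of schema documents independent of key and list order.

    Assumed approach: serialize each document as canonical JSON (sorted
    keys, no extra whitespace), sort the serialized strings, then SHA-256.
    """
    canonical = sorted(
        json.dumps(doc, sort_keys=True, separators=(",", ":"))
        for doc in schema_docs
    )
    return hashlib.sha256("\n".join(canonical).encode("utf-8")).hexdigest()


def check_schema(schema_docs, expected_hash):
    """Hypothetical guard: fail fast if the schema has drifted."""
    actual = compute_schema_hash(schema_docs)
    if actual != expected_hash:
        raise SchemaVersionMismatch(
            f"expected {expected_hash[:12]}, got {actual[:12]}"
        )
```

Because ordering is normalized before hashing, two fresh instances that return the same schema in a different order still produce the same hash, which is the "no false drift" property.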
- Branch/merge (rebase) — clean, deterministic. Non-conflicting changes merge seamlessly.
- Conflict preservation — conflicting assertions (e.g., founded_year: 2017 vs 2018) are BOTH preserved after merge. No silent overwrites.
- LexicalKey uniqueness — deterministic document IDs from field values work correctly for deduplication.
- Time-travel — querying at earlier commits returns correct historical state.
- Schema canonicalization — stable hash across fresh instances, no false drift.
- Performance — sub-30ms P95 queries, sub-1s import for 830 entities + 114 assertions, only 61 MiB RAM.
- Python SDK `clonedb()`/`push()`/`pull()`: has a variable scoping bug (`headers`) when connecting between local Docker instances. Workaround: manual document transfer (which validates hash consistency). May work correctly with actual remote URLs.
- Diff API: `client.diff()` requires documents with the same `@id`; cannot diff commit objects directly. Commit history provides change tracking.
- TerminusDB uses "rebase" not "merge": replays original commits rather than creating merge commits. Functionally equivalent but commit history looks different from git merge.
- `list` type not supported: TerminusDB requires `Set[str]` from `typing`, not bare `list` in schema definitions.
- Assertion model with deterministic hash: Works. Same fact → same hash, idempotent re-import.
- Canonical object serialization: `normalize_literal()` prevents false conflicts (e.g., `02017` vs `2017`).
- Status lifecycle: Transition rules enforced in code. Terminal states (superseded, retracted) have no outbound transitions.
- Schema version tracking: `SchemaVersion` + `compute_schema_hash()` provides reliable drift detection.
- Conflict = query, not persistence: Grouping active assertions by (subject, predicate) and comparing canonical object keys correctly identifies conflicts.
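
The deterministic hash and the conflict query described in this list can be sketched together. `normalize_literal` is named in this report, but its body here is an assumption (integer-like strings lose leading zeros); `assertion_hash`, `find_conflicts`, and the dictionary field names are hypothetical.

```python
import hashlib
import json
from collections import defaultdict


def normalize_literal(value):
    """Canonical string form of a literal (assumed rule).

    Integer-like values are rendered without leading zeros, so "02017"
    and 2017 compare equal. Anything else is returned stripped.
    """
    text = str(value).strip()
    try:
        return str(int(text))
    except ValueError:
        return text


def assertion_hash(subject, predicate, obj):
    """Same fact -> same hash, regardless of how the literal was written."""
    payload = json.dumps(
        [subject, predicate, normalize_literal(obj)], separators=(",", ":")
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def find_conflicts(assertions):
    """Group active assertions by (subject, predicate).

    A group with more than one distinct canonical object is a conflict.
    Nothing is stored: conflicts are computed at query time.
    """
    groups = defaultdict(set)
    for a in assertions:
        if a["status"] == "active":
            key = (a["subject"], a["predicate"])
            groups[key].add(normalize_literal(a["object"]))
    return {key: sorted(objs) for key, objs in groups.items() if len(objs) > 1}
```

With this shape, both `founded_year` values stay in the store after a merge, and the conflict only appears when the grouping query is run.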
| File | Purpose |
|---|---|
| `schema.py` | TerminusDB schema + canonical functions |
| `import_from_postgres.py` | PostgreSQL → TerminusDB import |
| `test_merge.py` | 10 single-instance tests |
| `test_federation.py` | 8 two-instance tests |
| `run_phase0.sh` | Test harness (0a/0b/all/clean) |
| `results.json` | Machine-readable metrics |
# Run Phase 0a (single instance)
bash scripts/terminusdb/run_phase0.sh 0a --fresh
# Run Phase 0b (two instances, requires 0a pass)
bash scripts/terminusdb/run_phase0.sh 0b
# Run both
bash scripts/terminusdb/run_phase0.sh all --fresh
# Cleanup Docker containers
bash scripts/terminusdb/run_phase0.sh clean

Status: PASS (validated 2026-02-25).
Phase 1 wires TerminusDB into the live personal_ingest_api.py via an outbox pattern:
PG writes are authoritative, outbox rows are enqueued in the same transaction,
and an async worker drains them to TerminusDB.
register-entity / sync-relationships
          │
          ▼
┌─────────────────────┐
│  PostgreSQL (PG)    │ ← authoritative
│  + terminusdb_outbox│ ← enqueued in same txn
└─────────┬───────────┘
          │ (async, 2s poll)
          ▼
┌─────────────────────┐
│  outbox_worker.py   │ claim → apply → mark applied
└─────────┬───────────┘
          │
          ▼
┌─────────────────────┐
│  TerminusDB         │ ← graph mirror
└─────────────────────┘
| Property | Implementation |
|---|---|
| Fail-open | PG writes always succeed; outbox accumulates when TDB is down |
| Recovery | Worker retries with exponential backoff (jitter); drains backlog on TDB restart |
| Idempotency | LexicalKey on Entity(rid) / Assertion(assertion_hash) — same doc updated in place |
| Conflict detection | /graph/conflicts groups assertions by (subject, predicate), surfaces multi-value divergence |
| Auth guard | /graph/* endpoints restricted to localhost + WG 10.100.0.0/24 |
| Schema guard | Adapter checks schema hash on startup; fails fast on mismatch |
| Reconciliation | reconcile.py compares PG↔TDB counts and detects drift |
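
The Recovery row mentions exponential backoff with jitter. One common variant ("full jitter") is sketched below. The function name `backoff_delay` and the `base` and `cap` defaults are assumptions, not values taken from `outbox_worker.py`.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    base: float = 2.0,
    cap: float = 60.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Return a retry delay in seconds for the given attempt number.

    The ceiling doubles with each attempt up to `cap`; the actual delay
    is drawn uniformly from [0, ceiling] so retries do not synchronize.
    """
    source = rng if rng is not None else random
    ceiling = min(cap, base * (2 ** attempt))
    return source.uniform(0.0, ceiling)
```

Passing a seeded `random.Random` as `rng` makes the delays reproducible in tests, while the default uses the module-level generator.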
# Quick validation (uses existing TDB data)
bash scripts/terminusdb/smoke_phase1.sh
# Fresh: drop TDB, reimport from PG, then test
bash scripts/terminusdb/smoke_phase1.sh --fresh

Tests: preflight, import, health, entity registration, outbox drain, auth guard, fail-open + recovery, reconciliation.
| File | Role |
|---|---|
| `api/personal_ingest_api.py` | API + outbox write points + /graph/* endpoints |
| `api/terminusdb_adapter.py` | Adapter with schema guard + idempotent upserts |
| `api/vault_parser.py` | Relationship sync (SAVEPOINT isolation for FK failures) |
| `scripts/terminusdb/outbox_worker.py` | Async worker draining outbox → TDB |
| `scripts/terminusdb/schema.py` | Entity(rid), Assertion schema |
| `scripts/terminusdb/import_from_postgres.py` | Bulk import with --fresh flag |
| `scripts/terminusdb/reconcile.py` | Drift detection + --repair |
| `scripts/terminusdb/smoke_phase1.sh` | Automated smoke test |
| `migrations/048_terminusdb_outbox.sql` | Outbox table DDL |
cd /Users/darrenzal/projects/RegenAI/koi-processor
# API (Terminal 1)
( set -a; source config/personal.env; set +a
venv/bin/uvicorn api.personal_ingest_api:app --host 0.0.0.0 --port 8351 )
# Worker (Terminal 2)
( set -a; source config/personal.env; set +a
venv/bin/python -m scripts.terminusdb.outbox_worker )

Important: Use `set -a; source config/personal.env; set +a` (not `export $(grep ...)`) to safely load env vars with spaces/quotes.
FK constraint violations on pending_relationships (e.g., unregistered predicate)
aborted the entire PG transaction, causing subsequent outbox enqueue to fail with
InFailedSQLTransactionError. Fixed by wrapping relationship INSERTs in
SAVEPOINT/ROLLBACK TO SAVEPOINT so individual failures are isolated without
aborting the enclosing transaction.
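
The SAVEPOINT fix can be illustrated with a small runnable sketch. `sqlite3` is used here only so the example runs without a server. Unlike PostgreSQL, SQLite does not put the whole transaction into a failed state after an error, so this shows the shape of the fix rather than reproducing the original `InFailedSQLTransactionError`. The table layouts and the name `insert_relationships` are assumptions.

```python
import sqlite3


def insert_relationships(conn, rows):
    """Insert each row inside its own SAVEPOINT; return the skipped rows."""
    skipped = []
    for subject, predicate, obj in rows:
        conn.execute("SAVEPOINT rel_insert")
        try:
            conn.execute(
                "INSERT INTO pending_relationships (subject, predicate, object) "
                "VALUES (?, ?, ?)",
                (subject, predicate, obj),
            )
        except sqlite3.IntegrityError:
            # Undo only this INSERT; the enclosing transaction stays usable.
            conn.execute("ROLLBACK TO SAVEPOINT rel_insert")
            skipped.append((subject, predicate, obj))
        finally:
            conn.execute("RELEASE SAVEPOINT rel_insert")
    return skipped


# isolation_level=None gives explicit control over BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE predicates (name TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE pending_relationships ("
    "subject TEXT, predicate TEXT REFERENCES predicates(name), object TEXT)"
)
conn.execute("CREATE TABLE terminusdb_outbox (payload TEXT)")
conn.execute("INSERT INTO predicates (name) VALUES ('founded_by')")

conn.execute("BEGIN")
skipped = insert_relationships(
    conn,
    [("org:a", "founded_by", "person:b"), ("org:a", "unregistered", "person:c")],
)
# The outbox enqueue still succeeds in the same transaction.
conn.execute("INSERT INTO terminusdb_outbox (payload) VALUES ('org:a')")
conn.execute("COMMIT")
```

The row with the unregistered predicate is skipped, while the valid relationship and the outbox row are committed together.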
| Metric | Value |
|---|---|
| Entities (PG=TDB) | 833 |
| Assertions (PG=TDB) | 114 |
| Drift | 0 |
| Schema hash | 206e7ff0a60f... |
| Worker recovery time | < 15s after TDB restart |
| Auth: localhost | 200 |
| Auth: LAN IP | 403 |
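
The two auth rows (localhost returns 200, a LAN IP returns 403) are consistent with a simple network allowlist. A minimal sketch using the standard `ipaddress` module follows. The loopback ranges and the helper names `is_allowed` and `status_for` are assumptions; only the `10.100.0.0/24` WG range is taken from this report.

```python
import ipaddress

# Loopback ranges are assumed; 10.100.0.0/24 is the WG range from the report.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("::1/128"),
    ipaddress.ip_network("10.100.0.0/24"),
]


def is_allowed(client_ip: str) -> bool:
    """True if the client address falls inside an allowed network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparsable addresses are rejected
    return any(addr in net for net in ALLOWED_NETWORKS)


def status_for(client_ip: str) -> int:
    """HTTP status a guarded /graph/* endpoint would return."""
    return 200 if is_allowed(client_ip) else 403
```

In a deployment behind a reverse proxy the address seen by the API may be the proxy's, so which header or socket address feeds this check matters; that choice is outside this sketch.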