Samyama is a high-performance, distributed, AI-native graph database written in Rust. It combines a property graph engine, vector search, graph analytics, and natural language querying in a single binary.
Graph Simulation — Cricket KG (36K nodes, 1.4M edges) with live activity particles
Click for full demo (1:56) — Dashboard, Cypher Queries, and Graph Simulation
| Benchmark | Queries | Pass Rate | Dataset |
|---|---|---|---|
| SNB Interactive | 21 reads | 21/21 (100%) | SF1: 3.18M nodes, 17.26M edges |
| SNB Business Intelligence | 20 analytical | 16/16 run (100%) (BI-17+ timeout) | SF1 (same dataset) |
| Graphalytics | 6 algorithms x 2 datasets | 12/12 (100%) | LDBC XS reference graphs |
| FinBench | 12 CR + 6 SR + 3 RW + 19 W | 40/40 (100%) | Synthetic: 7.7K nodes, 42.2K edges |
See docs/ldbc/ for detailed per-query results, latency tables, and analysis.
- HTTP Tenant Management API: Full CRUD for tenants via REST endpoints (
POST /api/tenants,GET /api/tenants,GET /api/tenants/{id},DELETE /api/tenants/{id}). - samyama-mcp-serve: Auto-generate MCP (Model Context Protocol) servers from any graph schema. Discovers labels, edge types, and properties, then generates typed tools for AI agents. Install via
pip install samyama[mcp]and runsamyama-mcp-serve --demofor instant agent tool access. - Snapshot format (.sgsnap): Portable gzip JSON-lines snapshot export/import for graph tenants, enabling backup and migration across instances.
- Cricket dataset loader: Load 21K Cricsheet T20/ODI/Test matches (36K nodes, 1.4M edges) via
cargo run --release --example cricket_loader. - AACT clinical trials loader: Full AACT dataset loader for clinical trial analysis (575K studies, 7.7M nodes, 27M edges).
- Index scan fix: Inline MATCH properties
{prop: val}now trigger IndexScan when a matching index exists, avoiding full label scans.
- OpenCypher Query Engine: ~90% OpenCypher coverage — MATCH, CREATE, DELETE, SET, MERGE, OPTIONAL MATCH, UNION, WITH, UNWIND, aggregations, and 30+ built-in functions.
- RESP Protocol: Drop-in compatibility with any Redis client (
redis-cli, Jedis, ioredis). - Vector Search: Built-in HNSW indexing for millisecond semantic search and Graph RAG.
- NLQ (Natural Language Queries): Ask questions in plain English — the LLM translates to Cypher automatically.
- Graph Algorithms: Native PageRank, BFS, Dijkstra, WCC, SCC, CDLP, LCC, MaxFlow, MST, SSSP, Triangle Counting.
- Optimization Solvers: 15+ metaheuristic algorithms (Jaya, Rao, GWO, PSO, Firefly, Cuckoo, ABC, NSGA-II) for in-database optimization.
- Multi-Tenancy: Tenant-level isolation with per-tenant quotas via RocksDB column families.
- High Availability: Raft consensus (via
openraft) for cluster replication and automatic failover. - Persistence: RocksDB storage with Write-Ahead Log and checkpointing.
- EXPLAIN Queries: Inspect query execution plans without running them.
- HTTP Tenant API: REST endpoints for tenant CRUD (create, list, get, delete) alongside the RESP protocol.
- MCP Server Generation: Auto-generate MCP servers from graph schema for AI agent integration (
samyama-mcp-serve). - Snapshot Export/Import: Portable
.sgsnapformat (gzip JSON-lines) for tenant backup and migration.
git clone https://github.com/samyama-ai/samyama-graph
cd samyama-graph
cargo build --release./target/release/samyamaThis starts the RESP server on port 6379 and the HTTP API on port 8080.
redis-cli -p 6379
# Create nodes
GRAPH.QUERY mygraph "CREATE (n:Person {name: 'Alice', age: 30})"
# Query
GRAPH.QUERY mygraph "MATCH (n:Person) RETURN n"
# Explain a query plan
GRAPH.QUERY mygraph "EXPLAIN MATCH (n:Person) WHERE n.age > 25 RETURN n"Samyama ships with domain-specific demos that showcase the full feature set.
| Example | Command | Description |
|---|---|---|
| Persistence | cargo run --example persistence_demo |
RocksDB persistence, WAL, multi-tenancy, recovery |
| Cluster | cargo run --example cluster_demo |
3-node Raft cluster with leader election and failover |
| Full Benchmark | cargo run --example full_benchmark |
Scale test up to 1M+ nodes |
Each demo builds a domain-specific knowledge graph, runs Cypher queries, executes graph algorithms, and demonstrates natural language querying via the NLQ pipeline.
| Example | Command | What it demonstrates |
|---|---|---|
| Banking / Fraud Detection | cargo run --example banking_demo |
Customer segmentation, fraud patterns, money laundering detection, OFAC screening |
| Clinical Trials | cargo run --example clinical_trials_demo |
Patient-trial matching (vector search), drug interactions (PageRank), site optimization (NSGA-II) |
| Supply Chain | cargo run --example supply_chain_demo |
Disruption analysis, cold-chain monitoring, port optimization (Jaya), alternative suppliers (vector search) |
| Smart Manufacturing | cargo run --example smart_manufacturing_demo |
Digital twin, failure cascade analysis, production scheduling (Cuckoo Search), energy optimization |
| Social Network | cargo run --example social_network_demo |
Follower graphs, mutual connections, influence analysis (PageRank), community detection (WCC) |
| Knowledge Graph | cargo run --example knowledge_graph_demo |
Document lineage, expert finding (vector search), topic clustering, knowledge hub identification |
| Enterprise SOC | cargo run --example enterprise_soc_demo |
Threat intel, MITRE ATT&CK mapping, attack path analysis (Dijkstra), lateral movement simulation |
| Agentic Enrichment | cargo run --example agentic_enrichment_demo |
Generation-Augmented Knowledge (GAK) — LLM generates Cypher to enrich the graph autonomously |
| Example | Command | Description |
|---|---|---|
| LDBC SNB | cargo run --example ldbc_loader |
Load LDBC SNB SF1 dataset (3.18M nodes, 17.26M edges) |
| FinBench | cargo run --example finbench_loader |
Load/generate LDBC FinBench dataset |
| Cricket | cargo run --release --example cricket_loader |
Load 21K Cricsheet matches (36K nodes, 1.4M edges) |
| AACT Clinical Trials | cargo run --release --example aact_loader |
Full AACT dataset (575K studies, 7.7M nodes, 27M edges) |
| Example | Command | Description |
|---|---|---|
| MCP Server | samyama-mcp-serve --demo |
Auto-generate MCP server from graph schema for AI agents (Python, pip install samyama[mcp]) |
~90% OpenCypher coverage. See docs/CYPHER_COMPATIBILITY.md for the full matrix.
MATCH, OPTIONAL MATCH, WHERE, RETURN, RETURN DISTINCT, ORDER BY, SKIP, LIMIT, CREATE, DELETE, DETACH DELETE, SET, REMOVE, MERGE (with ON CREATE SET / ON MATCH SET), WITH, UNWIND, UNION / UNION ALL, EXPLAIN, EXISTS subqueries
| Category | Functions |
|---|---|
| String | toUpper, toLower, trim, replace, substring, left, right, reverse, toString |
| Numeric | abs, ceil, floor, round, sqrt, sign, toInteger, toFloat |
| Aggregation | count, sum, avg, min, max, collect |
| List/Collection | size, length, head, last, tail, keys, range |
| Graph | id, labels, type, exists, coalesce, startsWith, endsWith, contains |
Arithmetic (+, -, *, /, %), comparison (=, <>, <, >, <=, >=), logical (AND, OR, NOT, XOR), string (STARTS WITH, ENDS WITH, CONTAINS, =~), null (IS NULL, IS NOT NULL), list (IN).
Cross-type coercion: Integer/Float promotion and String/Boolean coercion for LLM-generated queries. Null propagation follows Neo4j three-valued logic.
src/
├── graph/ # Property graph model (Node, Edge, PropertyValue, GraphStore)
├── query/ # OpenCypher engine
│ ├── cypher.pest # PEG grammar (Pest)
│ ├── parser.rs # Parser → AST
│ └── executor/ # Volcano iterator model (scan, filter, expand, project, aggregate, sort, limit)
├── protocol/ # RESP3 server (Tokio TCP)
├── persistence/ # RocksDB + WAL + multi-tenancy
├── raft/ # Raft consensus (openraft)
├── nlq/ # Natural Language Query pipeline (OpenAI, Gemini, Ollama, Claude Code)
├── vector/ # HNSW vector index
├── snapshot/ # Portable .sgsnap export/import
└── sharding/ # Tenant-level sharding
Key design decisions are documented as Architecture Decision Records.
- samyama-graph-algorithms: PageRank, BFS, Dijkstra, WCC, SCC, MaxFlow, MST, Triangle Counting
- samyama-optimization: 15+ metaheuristic solvers for single and multi-objective optimization
Run with cargo bench. See docs/performance/ for detailed results.
| Operation | Throughput | Notes |
|---|---|---|
| Node insertion | ~3.4M nodes/sec | At 1K batch, single-threaded |
| Label scan | <1 us | 100-node label groups |
| 1-hop traversal | ~22 us | MATCH-WHERE-RETURN pattern |
| Cypher parse | <8 us | Multi-hop patterns with aggregation |
- LDBC Benchmark Results — SNB Interactive, SNB BI, Graphalytics, FinBench
- Architecture
- Cypher Compatibility
- ACID Guarantees
- Benchmarks
- Architecture Decision Records
- Technology Comparisons
1814 unit tests, integration tests via Python scripts, and 8 domain-specific example demos.
cargo test # Run all tests
cargo bench # Run benchmarks
cargo clippy -- -D warnings # Lint
cargo fmt -- --check # Format checkApache License 2.0
