
Memex - Layered Knowledge Graphs


Memex stores knowledge in layers: raw sources plus interpreted ontologies. It works like git for knowledge graphs: content-addressed, verifiable, with interpretation history.

Live Demo | Website

The Problem

RAG returns similar text chunks. AI agents need:

  • Access to raw sources (not just interpretations)
  • Structured relationships between entities
  • Multiple views of the same data
  • Verifiable provenance

The Solution

Two knowledge layers, plus a dynamic query layer:

Source Layer:  Raw data (content-addressed, immutable)
               ↓ extracted_from
Ontology Layer: Entities + Relationships (LLM-interpreted)
               ↓ attention edges
Query Layer:   Dynamic, usage-weighted connections
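
For illustration, the first two layers map onto concrete nodes and links. A minimal sketch using the node and link shapes from the API examples below (the IDs, the Source node type, and the document text are hypothetical):

# A source-layer node: the raw text, content-addressed and immutable.
source_node = {
    "id": "source:doc-7f3a",   # hypothetical content-derived ID
    "type": "Source",
    "content": "John Doe joined Acme Corp in 2019 as a software engineer.",
}

# An ontology-layer node: an entity the LLM interpreted out of that text.
entity_node = {
    "id": "person:john-doe",
    "type": "Person",
    "content": "Software engineer",
    "meta": {"name": "John Doe"},
}

# The link that ties the interpretation back to its raw source.
extraction_link = {
    "source": "person:john-doe",
    "target": "source:doc-7f3a",
    "type": "EXTRACTED_FROM",
}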

Quick Start

# Start Neo4j
docker run -d \
  -p 7687:7687 -p 7474:7474 \
  -e NEO4J_AUTH=neo4j/password \
  neo4j:5.15-community

# Build and start server
go build ./cmd/memex-server
./memex-server

# Server runs on http://localhost:8080

API Reference

Node Operations

# Create a node
curl -X POST http://localhost:8080/api/nodes \
  -H "Content-Type: application/json" \
  -d '{"id": "person:john-doe", "type": "Person", "content": "Software engineer", "meta": {"name": "John Doe"}}'

# Get a node
curl http://localhost:8080/api/nodes/person:john-doe

# List nodes (with pagination)
curl "http://localhost:8080/api/nodes?limit=100&offset=0"

# Delete a node
curl -X DELETE http://localhost:8080/api/nodes/person:john-doe

Link Operations

# Create a link
curl -X POST http://localhost:8080/api/links \
  -H "Content-Type: application/json" \
  -d '{"source": "person:john-doe", "target": "company:acme", "type": "WORKS_AT"}'

# Get links for a node
curl http://localhost:8080/api/nodes/person:john-doe/links
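
The same node and link operations from Python, as a minimal sketch (assumes the requests package and a server on the default port; the Company node shape is an assumption):

import requests

BASE = "http://localhost:8080/api"

# Create two nodes, then connect them with a WORKS_AT link.
requests.post(f"{BASE}/nodes", json={
    "id": "person:john-doe",
    "type": "Person",
    "content": "Software engineer",
    "meta": {"name": "John Doe"},
}).raise_for_status()

requests.post(f"{BASE}/nodes", json={
    "id": "company:acme",
    "type": "Company",
    "content": "Acme Corporation",
}).raise_for_status()

requests.post(f"{BASE}/links", json={
    "source": "person:john-doe",
    "target": "company:acme",
    "type": "WORKS_AT",
}).raise_for_status()

# Fetch every link attached to the person node.
print(requests.get(f"{BASE}/nodes/person:john-doe/links").json())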

Query Operations

# Search by text
curl "http://localhost:8080/api/query/search?q=john&limit=10"

# Filter by type
curl "http://localhost:8080/api/query/filter?type=Person&limit=100"

# Graph traversal
curl "http://localhost:8080/api/query/traverse?start=person:john-doe&depth=2"

# Get subgraph
curl "http://localhost:8080/api/query/subgraph?node_id=person:john-doe&depth=2"

# Attention-weighted subgraph
curl "http://localhost:8080/api/query/attention_subgraph?node_id=person:john-doe&min_weight=0.5"
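
Chained together, these endpoints support a simple retrieval flow: search for an entity, then pull its neighborhood. A hedged sketch (the shape of the search response, in particular the id field, is an assumption):

import requests

BASE = "http://localhost:8080/api"

def retrieve(term: str, depth: int = 2):
    # 1. Find candidate entities by text search.
    hits = requests.get(f"{BASE}/query/search",
                        params={"q": term, "limit": 5}).json()
    if not hits:
        return None
    # 2. Expand the neighborhood of the top hit.
    start = hits[0]["id"]   # assumed response key
    return requests.get(f"{BASE}/query/subgraph",
                        params={"node_id": start, "depth": depth}).json()

print(retrieve("john"))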

Attention Edges

# Update attention edge (co-occurrence/relevance)
curl -X POST http://localhost:8080/api/edges/attention \
  -H "Content-Type: application/json" \
  -d '{"source": "entity1", "target": "entity2", "query_id": "q123", "weight": 0.8}'

# Prune low-weight edges
curl -X POST http://localhost:8080/api/edges/attention/prune \
  -H "Content-Type: application/json" \
  -d '{"threshold": 0.1}'
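
For example, a client that has just answered a query can reinforce the edges between the entities it used together, then prune whatever has decayed. A sketch against the endpoints above (the pairwise weighting is illustrative, not the pipeline's actual scheme):

import itertools
import requests

BASE = "http://localhost:8080/api"

def reinforce(entity_ids, query_id, weight=0.8):
    # Update an attention edge for every pair of entities that co-occurred
    # in the same answer.
    for source, target in itertools.combinations(entity_ids, 2):
        requests.post(f"{BASE}/edges/attention", json={
            "source": source,
            "target": target,
            "query_id": query_id,
            "weight": weight,
        }).raise_for_status()

reinforce(["person:john-doe", "company:acme"], query_id="q123")

# Periodically drop edges whose weight has fallen below the threshold.
requests.post(f"{BASE}/edges/attention/prune",
              json={"threshold": 0.1}).raise_for_status()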

Graph Overview

# Get graph statistics and type distribution
curl http://localhost:8080/api/graph/map

LLM Ingestion

The bench/ directory contains tools for LLM-powered knowledge extraction:

cd bench
pip install -r requirements.txt

# Set your API key
export OPENAI_API_KEY=your-key

# Ingest with parallel workers
python ingest_ai.py --limit 1000 --concurrency 5

The ingestion pipeline (sketched in code after the list):

  1. Takes raw text documents
  2. Uses LLM to extract entities and relationships
  3. Creates content-addressed source nodes
  4. Links entities to sources with EXTRACTED_FROM edges
  5. Updates attention edges for co-occurring entities
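
A stripped-down version of that loop, with the LLM call reduced to a placeholder (extract_entities is a hypothetical stand-in for the OpenAI-based extraction in bench/ingest_ai.py; the Source node type and the weights are assumptions):

import hashlib
import itertools
import requests

BASE = "http://localhost:8080/api"

def ingest(text: str):
    # Steps 1+3: store the raw text as a content-addressed source node.
    source_id = "source:" + hashlib.sha256(text.encode()).hexdigest()[:12]
    requests.post(f"{BASE}/nodes", json={
        "id": source_id, "type": "Source", "content": text,
    }).raise_for_status()

    # Step 2: LLM extraction (placeholder for the real prompt-based extractor).
    entities = extract_entities(text)  # -> [{"id", "type", "content", "meta"}, ...]

    for entity in entities:
        requests.post(f"{BASE}/nodes", json=entity).raise_for_status()
        # Step 4: tie each interpretation back to its raw source.
        requests.post(f"{BASE}/links", json={
            "source": entity["id"], "target": source_id, "type": "EXTRACTED_FROM",
        }).raise_for_status()

    # Step 5: strengthen attention edges between co-occurring entities.
    for a, b in itertools.combinations([e["id"] for e in entities], 2):
        requests.post(f"{BASE}/edges/attention", json={
            "source": a, "target": b, "query_id": source_id, "weight": 0.5,
        }).raise_for_status()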

MCP Server (AI Agent Integration)

Memex includes an MCP (Model Context Protocol) server for AI agents:

cd mcp-server
pip install -r requirements.txt
python server.py

It provides the following tools (an example call is sketched after the list):

  • search_graph - Search entities by name
  • get_node - Retrieve node details
  • get_relationships - Explore entity connections
  • traverse_graph - Multi-hop traversal
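
Under MCP, an agent framework invokes these tools with JSON-RPC tools/call messages over stdio. A sketch of what a search_graph call might look like, built as a Python dict (the argument name query is an assumption about this server's tool schema):

import json

# MCP tools/call request as an agent framework would send it.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_graph",
        "arguments": {"query": "john"},   # assumed argument name
    },
}
print(json.dumps(request))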

Benchmarking

A HotpotQA benchmark suite for evaluating retrieval quality:

cd bench

# Agent-based retrieval
python benchmark_kg_agent.py --limit 100

# Baseline RAG comparison
python baseline_rag.py --limit 100

Architecture

memex (CLI) ─┐
             │
HTTP API ────┼──→ memex-server (Go) ──→ Neo4j
             │
MCP Server ──┘

Components:

  • cmd/memex-server - Go HTTP API server
  • cmd/memex - CLI tool
  • mcp-server/ - Python MCP server for AI agents
  • bench/ - Ingestion pipeline and benchmarks
  • internal/server/ - Server implementation

Why Memex?

vs RAG/Vector DBs:

  • Access to raw sources, not just chunks
  • Structured relationships, not just similarity
  • Multiple interpretations of same data

vs Traditional Graph DBs:

  • LLM extracts entities automatically
  • Content-addressed sources
  • Attention edges for query-time relevance

Documentation

License

BSD 3-Clause License. See LICENSE.
