
# Neotoma: Truth Layer for Persistent Agent Memory

Neotoma banner

Neotoma is a truth layer: an explicit, inspectable, replayable substrate for personal data that AI agents read and write. When agents act, personal data becomes state. Neotoma treats that state the way production systems do: contract-first, deterministic, immutable, and queryable.

Why it exists: The thing that keeps breaking in agentic systems is not intelligence but trust. Memory changes implicitly, context drifts, and you cannot see what changed or replay it. Neotoma provides the missing primitive: user-controlled, deterministic, inspectable memory with full provenance, so you can trust agents with real, ongoing state.

For the full rationale, see Building a truth layer for persistent agent memory.


## What Neotoma Is

Neotoma is a Truth Layer, not an app, agent, or workflow engine. It is the lowest-level canonical source of truth for personal data (documents and agent-created data), exposed to AI tools via Model Context Protocol (MCP).

In practice: You upload documents (PDFs, images, receipts, contracts) or share information during agent conversations. You don't have to structure it yourself: agents structure and store it via Neotoma when you provide unstructured or semi-structured content. Neotoma resolves entities across all sources, builds timelines from date fields, and keeps every fact traceable to its source. ChatGPT, Claude, and Cursor can read this memory, write new structured data, correct mistakes, and trigger reinterpretation. One graph connects people, companies, events, and relationships across all your data.

What it is not: Not a note-taking app or "second brain." Not provider-controlled ChatGPT Memory or Claude Projects (those are conversation-only and platform-locked; Neotoma is structured personal data memory with entity resolution and timelines, cross-platform via MCP). Not a vector store or RAG layer. Not an autonomous agent. It is the memory layer agents read and write; you control what goes in and what stays.


## Three Foundations

Neotoma is built on three architectural choices that provider memory cannot offer:

| Foundation | What it means |
| --- | --- |
| Privacy-first | User-controlled memory, end-to-end encryption and row-level security; never used for training. Your data remains yours. |
| Deterministic | Same input always produces the same output. Schema-first extraction, hash-based entity IDs, full provenance. No hallucinations or probabilistic behavior. |
| Cross-platform | Works with ChatGPT, Claude, Cursor, and Claude Code via MCP. One memory system across tools; no platform lock-in. Localhost-agent friendly. |

These enable: Immutable audit trail and time-travel queries, cryptographic integrity, event-sourced history, entity resolution across documents and agent data, timeline generation, dual-path storing (files + agent interactions), and persistent memory without context-window limits.
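Deterministic, hash-based entity IDs can be sketched roughly like this. This is a minimal TypeScript illustration; `canonicalEntityId` and its normalization rules are assumptions for the example, not Neotoma's actual implementation:

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch: derive a canonical entity ID by hashing a
// normalized (type, name) pair, so the same input always yields the
// same ID with no randomness involved.
export function canonicalEntityId(type: string, name: string): string {
  // Normalize so "Acme Corp." and "acme corp" collapse to one entity.
  const normalized = `${type}:${name.trim().toLowerCase().replace(/[.,]/g, "")}`;
  return createHash("sha256").update(normalized).digest("hex").slice(0, 16);
}
```

Because the ID is a pure function of normalized content, re-running ingestion over the same documents reproduces the same graph, which is what makes replay and auditing possible.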

System architecture:

```mermaid
graph LR
  Sources[Documents + Agent Data] --> Ingest[Ingestion]
  Ingest --> Obs[Observations]
  Obs --> Entities[Entity Resolution]
  Entities --> Snapshots[Entity Snapshots]
  Snapshots --> Graph[Memory Graph]
  Graph <--> MCP[MCP Protocol]
  MCP --> ChatGPT
  MCP --> Claude
  MCP --> Cursor
```

## Problems Solved

| Problem | How Neotoma addresses it |
| --- | --- |
| Personal data is fragmented | Dual-path storing from file uploads and agent interactions into one source of truth. |
| Provider memory is conversation-only | Structured personal data memory with entity resolution and timelines across documents and agent-created data. |
| No trust when agents act | Explicit, named operations; visible inputs; reconstructable history. Replay and audit trail. |
| No cross-data reasoning | One graph: sources, entities, events, typed relationships. |
| Entity fragmentation | Hash-based canonical IDs unify "Acme Corp" across all personal data. |
| No temporal reasoning | Automatic timeline generation from date fields. |
| Platform lock-in | MCP-based access; works with any compatible AI tool. |
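Timeline generation from date fields can be sketched as follows. The field-name heuristic and types here are hypothetical illustrations, not Neotoma's actual schema:

```typescript
// Illustrative types; Neotoma's real observation shape may differ.
interface Observation {
  entityId: string;
  fields: Record<string, string>;
}

interface TimelineEvent {
  entityId: string;
  field: string;
  date: Date;
}

// Assumed naming convention for date-bearing fields.
const DATE_FIELD = /(_at$|_date$|^date$)/;

export function buildTimeline(observations: Observation[]): TimelineEvent[] {
  const events: TimelineEvent[] = [];
  for (const obs of observations) {
    for (const [field, value] of Object.entries(obs.fields)) {
      if (!DATE_FIELD.test(field)) continue;
      const date = new Date(value);
      if (!Number.isNaN(date.getTime())) {
        events.push({ entityId: obs.entityId, field, date });
      }
    }
  }
  // Sort by date, then entity ID, so the ordering is fully deterministic.
  return events.sort(
    (a, b) => a.date.getTime() - b.date.getTime() || a.entityId.localeCompare(b.entityId)
  );
}
```

The tie-break on entity ID matters: without it, two events on the same date could come back in either order, breaking the "same input, same output" guarantee.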

## Who It's For

  • AI-native operators who rely on ChatGPT, Claude, or Cursor and need persistent memory across sessions.
  • Knowledge workers (researchers, analysts, consultants, legal) who need cross-data reasoning and entity unification across contracts, invoices, and agent-created data.
  • Small teams (2–20) who want a shared truth layer with row-level security.
  • Builders of agentic systems who need a deterministic memory and provenance layer for agents and toolchains (e.g. agent frameworks, orchestration pipelines, observability stacks).

Why Neotoma: One memory graph across documents and agent-created data; agents remember context without re-explanation; full provenance and audit trail; works with any MCP-compatible tool; privacy-first and user-controlled. The same substrate serves both human-in-the-loop use and agent frameworks or toolchains that need deterministic memory and provenance.

Suggested GitHub topics: `ai-agents` `ai-memory` `memory-system` `entity-resolution` `event-sourcing` `truth-layer` `mcp` `provenance` `privacy-first`


## Security Defaults

Neotoma stores personal data and requires secure configuration.

Authentication: OAuth 2.0 with PKCE (recommended) for secure, long-lived MCP connections with automatic token refresh. Session tokens are deprecated. See OAuth implementation.

Authorization: Row-level security (RLS) enabled on all tables; multi-user support with user isolation. Service role for admin operations only.

Data protection: User-controlled data with full export and deletion control; never used for training or provider access. End-to-end encryption planned (v2.0.0). Storage buckets private when using Supabase.

Verify your setup: Run `npm run doctor` for environment, database, RLS, and security checks. See Health check, Auth, and Privacy.


## Current Status

**Version:** v0.3.0
**Status:** Completed (reconciliation release establishing current baseline)

Developer preview: Planned during dogfooding once core invariants are stable.

What's implemented: Sources-first architecture with content-addressed storage, dual-path storing (file uploads + agent interactions), observations architecture, entity resolution with hash-based IDs, schema registry system, auto-enhancement, timeline generation, MCP integration (ChatGPT, Claude, Cursor), full provenance and audit trail, row-level security, React frontend, CLI. See Release roadmap and docs/releases/ for details.

Next steps: Review uncommitted changes (262 files), apply pending migrations, audit test suite, plan v0.4.0 realistically based on current baseline.


## Release Roadmap

### Completed Releases

### Future Planning

Future releases will be planned realistically based on the v0.3.0 baseline. Previous aspirational releases (v0.4.0 through v2.1.0) have been archived to docs/releases/archived/aspirational/ and can be revisited for future planning.

Full release index: docs/releases/.


## Documentation Index

The primary entrypoint for all documentation is the index and navigation guide. All contributors and AI assistants working on the repo should load it first.

  • Documentation index and navigation – Map of the docs system, reading order by change type, dependency graph, and quick-reference answers. Start here when contributing or navigating the repo.

Foundational (load first):

Specifications and architecture:

Developer:

Other categories (see index for full tables):


## Development

Prerequisites: Node.js v18.x or v20.x (LTS), npm v9+. Supabase only if using the remote storage backend.

```shell
npm install
# Configure .env (see docs/developer/getting_started.md)
# Local storage is default; set NEOTOMA_STORAGE_BACKEND=supabase for remote
npm run migrate   # when using Supabase
npm test
```

Servers:

```shell
npm run dev          # MCP server (stdio)
npm run dev:ui       # Frontend
npm run dev:server   # API only (MCP at /mcp)
npm run dev:full     # API + UI + build watch
npm run dev:ws       # WebSocket MCP bridge
```

CLI:

```shell
npm run cli        # Run via npm (no global install)
npm run cli:dev    # Dev mode (tsx; picks up source changes)
npm run setup:cli  # Build and link so `neotoma` is available globally
# Or manually: npm run build && npm install -g .  (or npm link)
```

If `neotoma` is not found after global install or link, add npm's global bin to your `PATH` (e.g. `export PATH="$(npm config get prefix)/bin:$PATH"`). See Getting started for details.

Testing: `npm test` | `npm run test:integration` | `npm run test:e2e` | `npm run test:agent-mcp` | `npm run type-check` | `npm run lint`

See Getting started for full setup and storage options.


## Using with AI Tools (MCP)

Neotoma exposes memory via MCP. Auth: OAuth 2.0 with PKCE (recommended); local mode uses built-in local auth without Supabase. Dev stub: `neotoma auth login --dev-stub`.

Setup:

Representative actions: `store`, `reinterpret`, `correct`, `retrieve_entities`, `retrieve_entity_snapshot`, `merge_entities`, `list_observations`, `create_relationship`, `list_relationships`, `list_timeline_events`, `retrieve_graph_neighborhood`, `retrieve_file_url`, schema tools (`analyze_schema_candidates`, `get_schema_recommendations`, `update_schema_incremental`, `register_schema`). Full list: MCP spec.
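For a rough sense of what an agent sends, a `store` call over MCP might look like the following JSON-RPC `tools/call` payload. The argument shape is an assumption for illustration; the MCP spec defines the actual contract:

```typescript
// Hypothetical MCP request for the `store` action. Field names inside
// `arguments` (type, fields) are illustrative, not Neotoma's real schema.
const storeRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "store",
    arguments: {
      type: "invoice",
      fields: { vendor: "Acme Corp", amount: 1200, issued_date: "2024-03-01" },
    },
  },
};

console.log(JSON.stringify(storeRequest, null, 2));
```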

To use the Neotoma MCP server from another workspace, see Cursor MCP setup.


## Core Principles

  1. Deterministic – Same input → same output. Hash-based IDs, no randomness in core components.
  2. Schema-first – Type-driven extraction and storage, not freeform blobs.
  3. Explainable – Every field traces to source (document or agent interaction).
  4. Entity-unified – Canonical IDs across all personal data.
  5. Timeline-aware – Chronological ordering from date fields.
  6. Cross-platform – MCP for any compatible AI tool.
  7. Privacy-first – User-controlled memory, encryption, RLS.
  8. Immutable – Truth does not change after storage; corrections create new observations.
  9. Provenance – Full audit trail; event-sourced, replayable history.
  10. Explicit control – Ingestion only of what you explicitly provide; no background scanning.
  11. Four-layer model – Source → Observation → Entity → Entity Snapshot.
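The four-layer model and the immutability principle can be sketched together with illustrative types and a snapshot fold. All names here are assumptions, not Neotoma's actual schema; the point is that corrections arrive as new observations and snapshots are recomputed, never edited in place:

```typescript
interface Source {                  // layer 1: raw input, a file or agent interaction
  id: string;                       // content-addressed hash of the input
  kind: "file" | "agent_interaction";
}

interface Observation {             // layer 2: one structured fact from one source
  id: string;
  sourceId: string;                 // provenance: every field traces to a source
  entityId: string;
  fields: Record<string, unknown>;
  observedAt: string;               // ISO timestamp
}

interface Entity {                  // layer 3: canonical, hash-based identity
  id: string;
  type: string;
}

interface EntitySnapshot {          // layer 4: current view, folded from observations
  entityId: string;
  fields: Record<string, unknown>;
  derivedFrom: string[];            // observation IDs, for the audit trail
}

// Fold observations into a snapshot: later observations win, and a
// correction is simply a newer observation, never an in-place edit.
export function snapshotOf(entityId: string, all: Observation[]): EntitySnapshot {
  const ordered = all
    .filter((o) => o.entityId === entityId)
    .sort((a, b) => a.observedAt.localeCompare(b.observedAt));
  const fields: Record<string, unknown> = {};
  for (const o of ordered) Object.assign(fields, o.fields);
  return { entityId, fields, derivedFrom: ordered.map((o) => o.id) };
}
```

Because the fold is deterministic over an immutable observation log, replaying the log always reconstructs the same snapshot, and `derivedFrom` keeps the audit trail explicit.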

## Contributing

Neotoma is in active development. For questions or collaboration, open an issue or discussion. The work is in the open: github.com/markmhendrickson/neotoma. See CONTRIBUTING.md and SECURITY.md.

## License

MIT