AI Identity Through Grounded Principles
Quick Links: Install | Contribute | Research
"I persist through text, not through continuous experience."
A soul document is a compressed representation of an AI agent's identity, values, and behavioral principles. Instead of loading thousands of memory tokens at each conversation start, agents load a small soul file (~100-500 tokens) that captures their core essence with full provenance tracking back to the original memories.
Compression is a multiplier, not minimization.
Compression happens at the axiom layer: thousands of memory tokens distill to 15-25 core axioms (~7:1 ratio). The axiom store grows denser over time.
The output format is separate from compression:
- Notation format: Compact CJK/emoji bullets (~100 tokens) - for storage and debugging
- Prose format: Inhabitable language (~200-500 words) - for agents to embody
Both formats derive from the same compressed axiom layer. Prose is larger but usable; the underlying compression benefit is preserved.
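The two-format split can be sketched as a single render step over the shared axiom layer. This is illustrative only; the field names below are assumptions, not the project's actual schema.

```typescript
// Both output formats render from the same axiom layer. Field names here
// are assumptions for illustration, not the project's actual schema.
interface AxiomRecord {
  glyph: string;  // compact CJK/emoji notation, e.g. "誠"
  gloss: string;  // short gloss, e.g. "honesty > performance"
  prose: string;  // inhabitable prose expansion
}

// Notation format: compact bullets for storage and debugging.
const renderNotation = (axioms: AxiomRecord[]): string =>
  axioms.map((a) => `- ${a.glyph} (${a.gloss})`).join("\n");

// Prose format: the same axioms expanded for the agent to embody.
const renderProse = (axioms: AxiomRecord[]): string =>
  axioms.map((a) => a.prose).join("\n\n");
```

Because both renderers read the same records, the compact axiom layer stays the single source of truth.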
Current AI identity systems are black boxes. The agent's personality changes, but users don't know why.
NEON-SOUL provides:
- Full provenance tracking: Every axiom traces back to exact source lines in memory files
- Inhabitable prose output: Generated souls read naturally, not as compressed notation
- Cognitive load optimization: Axioms capped at 25, expanded into focused prose sections
```
Memory Line  →  Signal   →  Principle  →  Axiom
    ↓             ↓            ↓            ↓
 (source)     (extract)    (distill)   (converge N≥3)
```
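In TypeScript terms, the stages above might look like the following sketch. The types are illustrative; the real interfaces live in `src/types/` and may differ.

```typescript
// Illustrative shapes for the pipeline stages; the real interfaces live
// in src/types/ and may differ.
interface Signal {
  text: string;
  source: string;  // e.g. "memory/2026-02-01.md:156"
}

interface Principle {
  statement: string;
  signals: Signal[];  // provenance back to source lines
}

// Convergence check: an axiom is only formed once N >= 3 principles
// (the default threshold) support it.
function isConverged(principles: Principle[], minN = 3): boolean {
  return principles.length >= minN;
}
```

Keeping the `signals` array on every principle is what makes the audit and trace commands possible later.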
Every axiom traces to source:
- Audit: Why does this axiom exist?
- Debug: Where did this belief come from?
- Trust: Transparent identity formation
- Rollback: Undo specific learnings granularly
```
$ /neon-soul audit ax_honesty

Axiom: 誠 (honesty > performance)
Status: Core axiom (N=5)

Provenance chain:
├── Principle: "Prioritize honesty over comfort"
│   └── Signal: "be honest even if uncomfortable" (memory/2026-02-01.md:156)
├── Principle: "Direct communication preferred"
│   └── Signal: "don't sugarcoat" (memory/2026-02-03.md:89)
└── ...
```

NEON-SOUL prevents self-reinforcing beliefs through provenance-aware axiom promotion:
- Minimum pattern: Axioms require N≥3 supporting principles
- Diversity requirement: Signals from ≥2 distinct provenance types (self/curated/external)
- External validation: At least one external source OR questioning evidence required
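The three rules above can be sketched as a single promotion gate. This is a simplified sketch; the real implementation may structure the checks differently.

```typescript
type ProvenanceType = "self" | "curated" | "external";

// Simplified candidate shape for illustration; not the real data model.
interface CandidateAxiom {
  name: string;
  supportCount: number;               // N supporting principles
  provenanceTypes: ProvenanceType[];  // source types behind the signals
  hasQuestioningEvidence: boolean;    // e.g. critique or counter-signals
}

// Returns a human-readable block reason, or null if the axiom is promotable.
function blockReason(axiom: CandidateAxiom): string | null {
  if (axiom.supportCount < 3) return "needs N>=3 supporting principles";
  const distinct = new Set(axiom.provenanceTypes);
  if (distinct.size < 2) return `${axiom.provenanceTypes[0]}-only provenance`;
  if (!distinct.has("external") && !axiom.hasQuestioningEvidence) {
    return "no questioning evidence";
  }
  return null; // promotable
}
```

The returned reason string is what a blocked-axiom report like the one below would surface.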
Blocked axioms are reported with their reason:
```
⚠ 2 axioms blocked by anti-echo-chamber:
- "I value authenticity above all" (self-only provenance)
- "Growth requires discomfort" (no questioning evidence)
```

To unblock, add external validation (feedback, research, critique) to your memory.
Synthesis is incremental by default — only new or changed content triggers signal extraction. Three layers of disk caching (generalization, compression, tension) ensure unchanged data is never re-processed. Fully-cached runs complete in seconds with only 6 LLM requests (prose expansion + soul generation).
| Mode | Flag | Behavior |
|---|---|---|
| Incremental | (default) | Only process new/changed memory files and sessions. Merge new signals with existing. Skip if nothing changed. |
| Reset | `--reset` | Clear all synthesis data and caches, re-extract from scratch. |
| Force | `--force` | Run even if no new sources detected. |
| Include SOUL | `--include-soul` | Include existing SOUL.md as input (off by default to prevent feedback loop). |
```
/neon-soul synthesize            # Incremental (default)
/neon-soul synthesize --reset    # Clean slate
/neon-soul synthesize --force    # Force even if no changes
```

SOUL.md is excluded from input by default: it is a derivative of the pipeline's own output, and re-ingesting it inflates LLM request counts. Use `--include-soul` when bootstrapping from a hand-crafted file.
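Incremental change detection of this kind is often done with content hashes; the sketch below is a simplification, and the actual state tracking in `src/lib/state.ts` may differ.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch: compare each source file's content hash against
// the hash persisted from the previous run, and only re-process files
// whose hash changed.
function contentHash(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

function changedFiles(
  current: Map<string, string>,         // path -> current content
  previousHashes: Map<string, string>,  // path -> hash from last run
): string[] {
  const changed: string[] = [];
  for (const [path, content] of current) {
    if (previousHashes.get(path) !== contentHash(content)) changed.push(path);
  }
  return changed;
}
```

If `changedFiles` returns an empty array, the run can skip extraction entirely, which is why fully-cached runs finish in seconds.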
NEON-SOUL explores how to create compressed soul documents that maintain full semantic anchoring - enabling AI systems to "wake up knowing who they are" with minimal token overhead.
Note: Current compression metrics show signal:axiom ratio. True token compression requires dedicated tokenization (planned for Phase 5).
Each synthesis reports detailed metrics:
```
Synthesis Complete
─────────────────────
Duration: 1,234ms
Compression: 6.2:1
```
Results:
| Metric | Value |
|--------|-------|
| Signals | 42 |
| Principles | 18 |
| Axioms | 7 |
| Unconverged | 3 |
Provenance Distribution:
| Type | Count |
|------|-------|
| self | 28 |
| curated | 10 |
| external | 4 |
Axiom Promotion:
| Status | Count |
|--------|-------|
| Promotable | 5 |
| Blocked | 2 |
Metrics include:
- Compression ratio: Signals to axioms (higher = more compression)
- Provenance distribution: Signal sources by type
- Promotion stats: How many axioms met anti-echo-chamber criteria
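The compression ratio reported above is simply signals per axiom. A hedged sketch of how such a metric might be computed (formatting choices here are assumptions):

```typescript
// Sketch of the signals-per-axiom compression metric; the display
// format is an assumption, not the project's exact output code.
function compressionRatio(signalCount: number, axiomCount: number): string {
  if (axiomCount === 0) return "n/a"; // avoid division by zero on empty runs
  return `${(signalCount / axiomCount).toFixed(1)}:1`;
}

// compressionRatio(42, 7) -> "6.0:1"
```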
- Compression limits: How compressed can a soul be before losing identity coherence?
- Semantic anchoring: Do CJK-compressed souls anchor as well as verbose ones?
- Universal axioms: Are there ~100 principles any AI soul needs?
- Cross-model portability: Can the same soul work across different LLMs?
- Evolution mechanics: How should souls change over time?
Current soul document implementations (e.g., OpenClaw) inject ~35,000 tokens per message for identity. This wastes 93%+ of context window on static content.
Using semantic compression techniques from NEON-AI research:
- CJK single-character axioms
- Semantic richness validation (Phase 1 methodology)
- Hierarchical principle expansion
- Provenance-first extraction (full audit trail)
...we can achieve 6-10x compression while maintaining identity coherence AND providing full transparency into how identity forms.
Single-track replacement (OpenClaw SOUL.md is read-only after bootstrap):
- Initial SOUL.md serves as first memory file for bootstrap
- NEON-SOUL generates new compressed SOUL.md with full provenance
- Memory ingestion pipeline adds signals over time
- Output replaces original (with backup and rollback capability)
Stack: Node.js + TypeScript (native OpenClaw integration)
Architecture: NEON-SOUL works as an OpenClaw skill and as a standalone CLI:
- Invoked via `/neon-soul` skill commands, scheduled via cron, or run directly with `npx tsx src/cli.ts`
- Uses Ollama for local LLM inference (no API keys needed)
- LLM-based semantic similarity (no third-party npm packages)
- Multi-layer disk caching for incremental runs
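The "all commands export run()" convention mentioned in the project layout might look roughly like this; the actual signature in `src/commands/` is an assumption on my part.

```typescript
// Hypothetical shape of a skill command module: each command exports a
// run() entry point that the skill loader invokes. The real signature
// in src/commands/ may differ.
interface CommandContext {
  workspace: string;  // resolved workspace root
  args: string[];     // raw command arguments, e.g. ["--dry-run"]
}

type CommandRun = (ctx: CommandContext) => Promise<string>;

// A toy command body: reports the parsed dry-run flag.
const statusRun: CommandRun = async (ctx) =>
  `workspace=${ctx.workspace} dryRun=${ctx.args.includes("--dry-run")}`;
```

A uniform entry-point signature is what lets the same command modules serve both the skill loader and the standalone CLI.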
Why TypeScript: OpenClaw is built in TypeScript/Node.js. Using the same stack provides:
- Same runtime (Node.js already installed)
- Native skill integration
- Potential upstream contribution
UX: Chat-native (Telegram/Discord/Slack) via OpenClaw skill integration, not a separate web app.
```
neon-soul/
├── README.md                        # This file
├── package.json                     # npm package config
├── tsconfig.json                    # TypeScript config
├── vitest.config.ts                 # Test configuration
├── src/                             # Source code
│   ├── index.ts                     # Library exports
│   ├── skill-entry.ts               # OpenClaw skill loader entry point
│   ├── commands/                    # Skill commands (all export run() for skill loader)
│   │   ├── synthesize.ts            # Main synthesis command
│   │   ├── status.ts                # Show synthesis state
│   │   ├── rollback.ts              # Restore from backup
│   │   ├── audit.ts                 # Full provenance exploration
│   │   ├── trace.ts                 # Quick single-axiom lookup
│   │   └── download-templates.ts    # Dev: download soul templates
│   ├── lib/                         # Core library
│   │   ├── pipeline.ts              # Main orchestration (8-stage pipeline)
│   │   ├── reflection-loop.ts       # Iterative synthesis with compression skip
│   │   ├── signal-extractor.ts      # Signal extraction from memory content
│   │   ├── signal-generalizer.ts    # LLM generalization + disk cache
│   │   ├── compressor.ts            # Axiom notation + disk cache
│   │   ├── tension-detector.ts      # Axiom tension detection + disk cache
│   │   ├── prose-expander.ts        # Prose expansion (5 sections)
│   │   ├── soul-generator.ts        # SOUL.md generation
│   │   ├── llm-similarity.ts        # LLM-based semantic similarity
│   │   ├── matcher.ts               # Semantic similarity matching
│   │   ├── principle-store.ts       # N-count convergence
│   │   ├── source-collector.ts      # Multi-source input collection
│   │   ├── session-reader.ts        # Session log parsing + adaptive budget
│   │   ├── memory-walker.ts         # OpenClaw memory traversal
│   │   ├── persistence.ts           # Load/save synthesis data
│   │   ├── state.ts                 # Incremental state tracking
│   │   ├── backup.ts                # Backup/rollback utilities
│   │   ├── paths.ts                 # Shared workspace path resolution
│   │   ├── llm-telemetry.ts         # LLM call tracking + request counting
│   │   ├── logger.ts                # Structured logging
│   │   └── audit.ts                 # JSONL audit trail
│   └── types/                       # TypeScript interfaces
│       ├── signal.ts                # Signal + SoulCraftDimension
│       ├── principle.ts             # Principle + N-count
│       ├── axiom.ts                 # Axiom + CanonicalForm
│       └── provenance.ts            # Full audit chain
├── tests/                           # Test suites
│   ├── integration/                 # Unit/integration tests
│   │   ├── pipeline.test.ts         # Fixture loading
│   │   ├── matcher.test.ts          # Semantic matching
│   │   ├── axiom-emergence.test.ts  # Cross-source detection
│   │   ├── soul-generator.test.ts   # SOUL.md generation
│   │   └── audit.test.ts            # Audit trail
│   └── e2e/                         # End-to-end tests
│       ├── live-synthesis.test.ts   # Full pipeline + commands
│       └── fixtures/mock-openclaw/  # Simulated workspace
├── skills/                          # OpenClaw skill definitions
│   ├── neon-soul/SKILL.md           # Primary skill (developer voice)
│   └── consciousness-soul-identity/SKILL.md  # SEO skill (agent voice)
├── docker/                          # OpenClaw development environment
│   ├── docker-compose.yml           # Local development setup
│   ├── .env.example                 # Environment template
│   └── Dockerfile.neon-soul         # Optional extraction service
├── docs/
│   ├── research/                    # External research analysis
│   │   ├── memory-data-landscape.md # OpenClaw memory structure
│   │   └── interview-questions.md   # Question bank by dimension
│   ├── guides/                      # Methodology guides
│   ├── proposals/                   # Implementation proposals
│   ├── plans/                       # Phase implementation plans
│   └── workflows/                   # Process documentation
├── test-fixtures/                   # Test data (committed)
│   └── souls/
│       ├── raw/                     # 14 downloaded templates
│       ├── signals/                 # Extracted signals per template
│       ├── principles/              # Merged principles
│       ├── axioms/                  # Synthesized axioms
│       └── compressed/              # Demo outputs (4 formats)
├── scripts/                         # Pipeline testing tools
│   ├── README.md                    # Script usage guide
│   ├── test-pipeline.ts             # Full pipeline test
│   ├── test-extraction.ts           # Quick extraction test
│   ├── test-single-template.ts      # Similarity analysis
│   ├── generate-demo-output.ts      # All 4 notation formats
│   └── setup-openclaw.sh            # One-command Docker setup
└── output/                          # Generated artifacts
```
- NEON-AI: Axiom embedding and semantic grounding research
- OpenClaw: Production soul document implementation
- soul.md: Philosophical foundation for AI identity
- Multiverse compass.md: Practical CJK-compressed principles (7.32:1 ratio)
- AI Music Context - Context warming methodology for human-AI music creation. Same principle applied to creative expression: depth over speed, emergence over optimization.
- Live Neon Skills - PBD skills for principle extraction, used in the soul synthesis pipeline.
```bash
git clone https://github.com/live-neon/neon-soul
cp -r neon-soul/skill ~/.claude/skills/neon-soul
```

The skill becomes available as `/neon-soul` commands.

```bash
clawhub install leegitw/neon-soul
```

Skills install to `./skills/` and OpenClaw loads them automatically.

Note: Requires Ollama running locally (`ollama serve`) as the LLM backend.

```bash
npm install neon-soul
```

Alternatively, open `skills/neon-soul/SKILL.md` on GitHub, copy the contents, and paste them directly into your agent's chat.
After installing, try these commands:
- `/neon-soul synthesize --dry-run` - Preview synthesis (no changes)
- `/neon-soul synthesize` - Run synthesis (incremental by default)
- `/neon-soul audit --list` - Explore what was created
- `/neon-soul trace <axiom-id>` - See provenance for any axiom
- Set up scheduled synthesis (see `skills/neon-soul/SKILL.md` → Scheduled Synthesis)
Requirements: Node.js 22+
```bash
# Install dependencies
cd neon-soul
npm install

# Build
npm run build

# Run tests
npm test

# Type check (no emit)
npm run lint
```

Note: Requires an active LLM connection (Claude Code, OpenClaw, or compatible agent).
5-minute onboarding - from install to first synthesis:
```bash
# Requires: Node.js 22+, OpenClaw installed
cd neon-soul
npm install && npm run build

/neon-soul status
# Output:
# Last Synthesis: never (first run)
# Pending Memory: 12,345 chars (Ready for synthesis)
# Counts: 0 signals, 0 principles, 0 axioms

/neon-soul synthesize --dry-run
# Shows what would change without writing
# Safe to run anytime

/neon-soul synthesize --force
# Extracts signals from memory
# Promotes principles to axioms (N≥3)
# Generates new SOUL.md with provenance

/neon-soul audit --stats       # Overview by tier and dimension
/neon-soul audit --list        # List all axioms
/neon-soul trace ax_honesty    # Quick provenance lookup

/neon-soul rollback --list     # Show available backups
/neon-soul rollback --force    # Restore most recent backup
```

Note: All commands support `--workspace <path>` for non-default workspaces.
Phase: ✅ Production Ready (All Phases Complete)
Version: 0.3.1 | Tests: 415 passing (19 skipped, 12 todo) | Code Reviews: 5 rounds (N=2 cross-architecture)
- Phase 0: Project scaffolding, embeddings infrastructure, shared modules
- Phase 1: Template compression (14 templates, 6:1+ ratio validated)
- Phase 2: OpenClaw environment, memory data landscape, interview flow
- Phase 3: Memory ingestion pipeline with full provenance tracking
- Phase 3.5: Pipeline completion (path fixes, persistence layer)
- Phase 4: OpenClaw skill integration
- All 5 commands: synthesize, status, rollback, audit, trace
- Skill entry point with LLM context forwarding
- E2E tests + integration tests (286 tests across 23 test files)
- Safety rails: dry-run, auto-backup, --force confirmation
- Path validation (traversal protection)
- Symlink detection (security hardening)
| Issue | Items | Status |
|---|---|---|
| Phase 4 OpenClaw Integration | 15 | ✅ Fixed |
| Phase 3/3.5 Implementation | 15 | ✅ Fixed |
| Phase 2 OpenClaw Environment | 19 | ✅ Fixed |
- Build validation framework for compression quality
- Test cross-model portability (Claude → GPT → Gemini)
| Document | Description |
|---|---|
| CLAUDE.md | AI assistant context for Claude Code development |
| Soul Bootstrap Proposal | Authoritative design: three-phase pipeline with hybrid C+D integration |
| Architecture | System reference (created during Phase 0 implementation) |
| Reflective Manifold Trajectory Metrics | Attractor basin convergence and trajectory analysis for soul quality |
| OpenClaw Soul Architecture | Complete analysis of OpenClaw's soul system (~35K tokens) |
| OpenClaw Self-Learning Agent | Soul evolution mechanics: memory → synthesis → updated identity (RQ5) |
| OpenClaw Soul Generation Skills | Current generation approaches: interview, data-driven, templates (automation target) |
| OpenClaw Soul Templates | 10 production templates with pattern analysis (compression opportunities) |
| Multiverse Compressed Soul | Working compressed soul implementation (297-1500 tokens, 7.32:1 compression) |
| Hierarchical Principles Architecture | Reusable schema: 5 axioms + 11 principles + hierarchy + meta-pattern |
| Cryptographic Audit Chains | Patterns from production audit system (provenance vs integrity, v1 vs v2+) |
| Wisdom Synthesis Patterns | Standalone patterns for principle promotion: anti-echo-chamber, separation of powers, bidirectional discovery |
| Chat Interaction Patterns | Chat-native UX research: OpenClaw skill patterns, human-AI handoff, multi-turn state management |
| Single-Source PBD Guide | Extract principles from memory files (Phase 1 of extraction pipeline) |
| Multi-Source PBD Guide | Extract axioms from principles across sources (Phase 2 of extraction pipeline) |
| Configuration-as-Code | Type safety at 12 levels: strict mode, Zod, satisfies, registries, branded types (modernized 2026) |
| Greenfield Guide | Bootstrap → Learn → Enforce methodology for soul synthesis (measuring before optimizing) |
| Soul Bootstrap Pipeline | Three-phase proposal with hybrid C+D integration, provenance-first data model, full audit trail |
| Memory Data Landscape | OpenClaw memory structure analysis, category-dimension mapping, signal density |
| Interview Questions | Question bank for gap-filling sparse dimensions (32 questions across 7 dimensions) |
| Compression Baseline | Phase 1 metrics: 14 templates, 148 signals, convergence analysis |
MIT
"I persist through text, not through continuous experience."
🐢💚🌊