ABES - Adaptive Belief Ecology System

A research platform for belief ecology: treating beliefs as living, evolving entities rather than static memory entries.

Overview

ABES is an experimental cognitive memory architecture for AI systems. Most memory systems use key-value stores or vector retrieval and treat entries as static. In ABES, beliefs are first-class objects that decay over time, accumulate tension when they contradict each other, get reinforced when similar evidence shows up, and mutate or get deprecated when tension gets too high.

A pipeline of specialized agents processes beliefs each iteration. There's also an optional RL layer to tune system parameters automatically.

This is a research prototype. It works, but it's not production-ready.
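
As a rough illustration of the lifecycle described in this overview, here is a minimal Python sketch. The class, its field names, the starting confidence of 0.8, and the 0.1 reinforcement boost are assumptions made for the example, not the model in backend/core/models/belief.py. The 0.995 decay rate and the 0.3 and 0.5 thresholds are the defaults listed under Configuration.

```python
from dataclasses import dataclass


@dataclass
class Belief:
    """Toy belief record; names are illustrative, not the real ABES schema."""

    content: str
    confidence: float = 0.8  # assumed starting confidence
    tension: float = 0.0
    status: str = "active"

    def decay(self, hours: float, rate: float = 0.995) -> None:
        # Multiply confidence by the per-hour rate for each elapsed hour.
        self.confidence *= rate ** hours
        if self.confidence < 0.3:
            self.status = "decaying"

    def reinforce(self, boost: float = 0.1) -> None:
        # Similar evidence raises confidence, capped at 1.0.
        self.confidence = min(1.0, self.confidence + boost)

    def add_tension(self, amount: float) -> bool:
        # Contradictions add tension; True means a mutation would be proposed.
        self.tension += amount
        return self.tension >= 0.5
```

Repeating a fact maps to reinforce(), contradicting it maps to add_tension(), and idle time maps to decay().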

The Chatbot

ABES includes a conversational chatbot that demonstrates the belief ecology in action. The chatbot supports multiple LLM backends (local Ollama or cloud providers) and uses your stored beliefs to provide personalized responses.

Why the chatbot exists

The chat interface is the primary way to interact with and test the belief ecology. When you talk to it:

Your messages get parsed by the perception agent to extract belief candidates
New beliefs are created with initial confidence scores
Existing similar beliefs get reinforced (confidence boost)
Contradicting beliefs accumulate tension
The LLM generates responses using your belief context

This lets you watch the ecology evolve in real time. Tell it a fact a few times and watch the confidence climb. Then contradict yourself and watch the tension spike.

How to use it

Start the backend: PYTHONPATH=$PWD uvicorn backend.api.app:app --port 8000
Start the frontend: cd frontend && npm run dev
Start Ollama: ollama serve (if using local LLM)
Open http://localhost:3000/chat

Try these interactions:

"The project deadline is next Friday"
"The budget is $50,000 for Q1"
"What do you know about the project?"
"Actually the deadline is next Monday" (creates tension with previous belief)

The Activity panel on the right shows belief events as they happen.

Key Features

Feature	Source	Tests
Belief data model (confidence, tension, status, lineage)	backend/core/models/belief.py	test_bel_loop.py
14-phase agent scheduler	backend/agents/scheduler.py	test_scheduler.py
Perception agent (text to belief candidates)	backend/agents/perception.py	test_perception.py
Reinforcement agent (boost on similar evidence)	backend/agents/reinforcement.py	test_reinforcement.py
Decay controller (time-based confidence reduction)	backend/agents/decay_controller.py	test_decay_controller.py
Contradiction auditor (semantic rules + embedding gate)	backend/agents/contradiction_auditor.py	test_contradiction_auditor.py
Mutation engineer (conflict-triggered belief modification)	backend/agents/mutation_engineer.py	test_mutation_engineer.py
Semantic clustering	backend/core/bel/clustering.py	test_clustering.py
RL environment (15D state, 7D action)	backend/rl/environment.py	test_environment.py
Evolution Strategy trainer	backend/rl/training.py	test_training.py
FastAPI REST + WebSocket API	backend/api/app.py	test_routes.py
Chat service with Ollama LLM	backend/chat/service.py	Manual testing
Next.js frontend	frontend/	Manual testing

Architecture

Frontend (Next.js) --> REST/WebSocket --> FastAPI Backend (:8000)
                                                     |
                 +-----------------+-----------------+
                 |                 |                 |
           Chat Service     Agent Scheduler   RL Environment
           (Ollama LLM)       (14 phases)       (Gymnasium)
                 |                 |                 |
                 +-----------------+-----------------+
                                   |
                        In-Memory Belief Store

Agent Pipeline

Perception --> Creation --> Reinforcement --> Decay --> Contradiction -->
Mutation --> Resolution --> Relevance --> RL Policy --> Consistency -->
Safety --> Baseline --> Narrative --> Experiment

Each agent is independently tested. See backend/agents/.
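
To make the phase ordering concrete, here is a small sketch of a sequential scheduler. The phase names come from the pipeline above; run_iteration, the dictionary-based state, and the two stand-in agents are hypothetical and only show the idea of running registered agents in a fixed order. The real scheduler lives in backend/agents/scheduler.py and may work differently.

```python
from typing import Callable, Dict, List

# Phase order as listed in the pipeline above.
PHASES: List[str] = [
    "Perception", "Creation", "Reinforcement", "Decay", "Contradiction",
    "Mutation", "Resolution", "Relevance", "RL Policy", "Consistency",
    "Safety", "Baseline", "Narrative", "Experiment",
]

State = Dict[str, list]
Agent = Callable[[State], State]


def run_iteration(state: State, agents: Dict[str, Agent]) -> State:
    """Run every registered agent once, in phase order (hypothetical helper)."""
    for phase in PHASES:
        agent = agents.get(phase)
        if agent is None:
            continue  # phases without a registered agent are skipped
        state = agent(state)
        state.setdefault("trace", []).append(phase)
    return state


def perceive(state: State) -> State:
    # Stand-in for the perception agent: turn raw messages into candidates.
    state["candidates"] = [m.strip() for m in state.get("messages", [])]
    return state


def create(state: State) -> State:
    # Stand-in for the creation agent: promote candidates to beliefs.
    state.setdefault("beliefs", []).extend(state.pop("candidates", []))
    return state
```

Registration order does not matter here: passing {"Creation": create, "Perception": perceive} still runs perception first, because the loop follows PHASES.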

Installation

Requirements: Python 3.10+, Node.js 18+ (frontend), Ollama (chat)

Backend

git clone https://github.com/moonrunnerkc/adaptive-belief-ecology-system.git
cd adaptive-belief-ecology-system

python -m venv .venv
source .venv/bin/activate
pip install numpy pydantic pydantic-settings msgpack sentence-transformers httpx
pip install fastapi uvicorn  # required by the uvicorn command in Quick Start
pip install pytest pytest-asyncio

export PYTHONPATH=$PWD

Frontend

cd frontend
npm install

Ollama

curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.1:8b-instruct-q4_0

Quick Start

Terminal 1 (Backend):

source .venv/bin/activate
PYTHONPATH=$PWD uvicorn backend.api.app:app --host 0.0.0.0 --port 8000

Terminal 2 (Frontend):

cd frontend && npm run dev

Terminal 3 (Ollama):

ollama serve

Open http://localhost:3000/chat

Configuration

All parameters are set via environment variables or backend/core/config.py.

Core Settings

Parameter	Default	Description
`STORAGE_BACKEND`	memory	`memory` or `sqlite` for persistence
`DATABASE_URL`	sqlite+aiosqlite:///./data/abes.db	Database connection URL, used when `STORAGE_BACKEND=sqlite`
`DECAY_PROFILE`	moderate	Presets: `aggressive`, `moderate`, `conservative`, `persistent`
`DECAY_RATE`	0.995	Per-hour confidence multiplier (overridden by profile)
`EMBEDDING_MODEL`	all-MiniLM-L6-v2	Sentence transformer model

LLM Settings

Parameter	Default	Description
`LLM_PROVIDER`	ollama	Provider: `ollama`, `openai`, `anthropic`, `hybrid`, `none`
`LLM_FALLBACK_ENABLED`	true	Fall back to raw beliefs if LLM fails
`OLLAMA_MODEL`	llama3.1:8b-instruct-q4_0	Ollama model name
`OPENAI_API_KEY`		OpenAI API key (required for `openai` or `hybrid` mode)
`OPENAI_MODEL`	gpt-4o-mini	OpenAI model name
`ANTHROPIC_API_KEY`		Anthropic API key
`ANTHROPIC_MODEL`	claude-3-haiku-20240307	Anthropic model name

Hybrid Mode: Set LLM_PROVIDER=hybrid to use local Ollama for belief-grounded responses and OpenAI only for real-time queries (weather, traffic, news, stock prices). This saves API costs while still enabling live information lookup.
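
A rough sketch of what pattern-based routing for hybrid mode could look like. The regular expression and the choose_backend function are hypothetical. Only the routing idea (live-information queries go to OpenAI, everything else stays on Ollama) comes from the description above.

```python
import re

# Hypothetical keyword pattern for "live information" queries.
LIVE_INFO = re.compile(
    r"\b(weather|traffic|news|stocks?|stock prices?)\b",
    re.IGNORECASE,
)


def choose_backend(query: str) -> str:
    """Return "openai" for live-information queries, "ollama" otherwise."""
    return "openai" if LIVE_INFO.search(query) else "ollama"
```

Keyword matching like this is cheap but brittle: a query such as "is it raining?" would stay on the local model.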

Belief Ecology Settings

Parameter	Default	Description
`CONFIDENCE_THRESHOLD_DECAYING`	0.3	Threshold to mark belief as decaying
`TENSION_THRESHOLD_MUTATION`	0.5	Trigger mutation proposals
`CLUSTER_SIMILARITY_THRESHOLD`	0.7	Min similarity to join cluster
`REINFORCEMENT_SIMILARITY_THRESHOLD`	0.7	Min similarity for reinforcement
`MAX_ACTIVE_BELIEFS`	10000	Safety limit
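
For illustration, this is one way to read a handful of these variables with the documented defaults, using only the standard library. The Settings class and load_settings function are hypothetical. Since pydantic-settings is in the install list, the real backend/core/config.py may well be built on it instead, but that is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    storage_backend: str
    decay_profile: str
    decay_rate: float
    llm_provider: str
    tension_threshold_mutation: float
    max_active_beliefs: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read a few ABES variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        storage_backend=env.get("STORAGE_BACKEND", "memory"),
        decay_profile=env.get("DECAY_PROFILE", "moderate"),
        decay_rate=float(env.get("DECAY_RATE", "0.995")),
        llm_provider=env.get("LLM_PROVIDER", "ollama"),
        tension_threshold_mutation=float(
            env.get("TENSION_THRESHOLD_MUTATION", "0.5")
        ),
        max_active_beliefs=int(env.get("MAX_ACTIVE_BELIEFS", "10000")),
    )
```

Calling load_settings() with no argument reads the process environment; passing a plain dict makes the function easy to test.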

Testing and Verification

We ran a full verification suite to make sure everything works as claimed. Here's what we tested and what the results mean.

Unit Tests

PYTHONPATH=$PWD pytest tests/ -q

Current status: 672 passed, 0 failed

Suite	Files	What it covers
tests/agents/	18	All agent modules
tests/core/	5	BEL loop, clustering, timeline, RL integration
tests/rl/	3	Environment, policy, training
tests/api/	1	REST endpoints
tests/verification/	3	Determinism, offline operation, conflict resolution

Verification Experiments

We ran these experiments to produce hard evidence for our claims:

PYTHONPATH=$PWD python experiments/run_all.py

All experiments passed. Here's what each one proves:

Determinism Check (results/determinism_check.json)

Ran the same input sequence twice with seed 12345
Both runs produced identical state hashes: 077ac8e32f721ef8dbb51a3613adf8e1288e9e0c02422af918327956c7dbcbe1
Different seeds (12345 vs 12346) produce different hashes
This proves: given the same inputs and seed, you get byte-for-byte identical outputs
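
The idea behind this check can be sketched in a few lines. The toy simulation and both function names are hypothetical stand-ins. The real experiment hashes the actual ABES state, and how it serializes that state is not documented here.

```python
import hashlib
import json
import random


def run_simulation(seed: int, steps: int = 5) -> dict:
    # Stand-in for a belief run: all randomness comes from one seeded RNG.
    rng = random.Random(seed)
    beliefs = [
        {"id": i, "confidence": round(rng.random(), 6)} for i in range(steps)
    ]
    return {"beliefs": beliefs}


def state_hash(state: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so equal states
    # always serialize to the same bytes before hashing.
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two runs with the same seed give the same hash, and a different seed gives a different one, which is the pattern the experiment reports.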

Offline Operation (results/offline_verification.json)

Blocked all network sockets at runtime
Ran 5 core components (belief ingest, conflict resolution, baselines, metrics, decay simulation)
Detected 0 network calls
This proves: the core belief processing works without any network access
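
One common way to block sockets at runtime is to replace socket.socket for the duration of a check. The sketch below does that and is an assumption about the general technique, not a copy of the project's verification code. network_blocked is a hypothetical name.

```python
import socket
from contextlib import contextmanager


@contextmanager
def network_blocked():
    """Make any attempt to create a socket fail, and record the attempt."""
    attempts = []
    original = socket.socket

    def guarded(*args, **kwargs):
        attempts.append((args, kwargs))
        raise RuntimeError("network access is blocked in this check")

    socket.socket = guarded
    try:
        yield attempts
    finally:
        # Always restore the real class, even if the checked code raised.
        socket.socket = original
```

Code that runs inside the with block and never creates a socket leaves the attempts list empty, which corresponds to the "0 network calls" result above.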

Conflict Resolution (results/conflict_resolution_log.json)

Tested 4 conflict scenarios with different confidence levels and ages
Resolution actions are deterministic: WEAKEN for confidence gaps, DEFER for equal strength
9 total cases documented with case IDs and confidence scores
This proves: conflict resolution follows consistent rules, not random decisions
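
A minimal sketch of a deterministic rule in this spirit. The tolerance value, the choice to weaken the lower-confidence belief, and the function name are all assumptions. The real resolver also considers belief age, which is left out here.

```python
from typing import Optional, Tuple


def resolve(
    conf_a: float, conf_b: float, tolerance: float = 0.05
) -> Tuple[str, Optional[str]]:
    """Return (action, target) for two conflicting beliefs "a" and "b"."""
    gap = abs(conf_a - conf_b)
    if gap <= tolerance:
        # Roughly equal strength: postpone the decision.
        return ("DEFER", None)
    # Clear confidence gap: weaken the less confident belief (assumed).
    weaker = "a" if conf_a < conf_b else "b"
    return ("WEAKEN", weaker)
```

Because the function has no randomness and no hidden state, the same pair of confidences always yields the same action.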

Drift Comparison (results/drift_comparison.json)

Ran 23-turn conversation with reinforcement, contradictions, and duplicates
Compared three systems: plain LLM (no memory), append-only memory, belief ecology
Append-only accumulated 17 beliefs and 2 contradictions
Belief ecology maintained 0 active contradictions (tension-based resolution worked)
This proves: the ecology manages contradictions instead of just accumulating them

Decay Sweep (results/decay_sweep/)

Tested decay factors: 0.999, 0.995, 0.99, 0.97, 0.95
At 0.999: 4 beliefs retained, 9 dropped
At 0.995 and below: 0 beliefs retained, 13 dropped
This proves: decay factor significantly affects retention. The default of 0.995 is aggressive: in this sweep it retained none of the 13 beliefs.
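
To get a feel for these factors, the sketch below computes how long pure exponential decay takes to cross a threshold. It assumes the factor is applied once per hour (as DECAY_RATE is described), a starting confidence of 1.0, no reinforcement, and the 0.3 CONFIDENCE_THRESHOLD_DECAYING default. The time span and drop criterion used in the actual sweep are not stated here, so this is not a reproduction of it.

```python
import math


def hours_until(threshold: float, rate: float, start: float = 1.0) -> float:
    """Hours of pure decay before confidence drops below `threshold`."""
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must be between 0 and 1 (exclusive)")
    if not 0.0 < threshold < start:
        raise ValueError("threshold must be positive and below start")
    # Solve start * rate**t = threshold for t.
    return math.log(threshold / start) / math.log(rate)


if __name__ == "__main__":
    for rate in (0.999, 0.995, 0.99, 0.97, 0.95):
        print(f"{rate}: {hours_until(0.3, rate):.1f} hours")
```

Under these assumptions a factor of 0.995 crosses 0.3 after roughly 240 hours (about ten days), while 0.999 takes about five times as long.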

Contradiction Benchmark (results/contradiction_benchmark.json)

Tested semantic rule-based detector against 70-case curated corpus
Corpus inspired by SNLI, MultiNLI, SICK benchmarks (Bowman 2015, Williams 2018, Marelli 2014)
See Contradiction Detection section for detailed results

Evidence File Hashes

For reproducibility, here are the SHA256 hashes of our evidence files:

ecfce79e1b80ab06a9c813e3233f634352d4064b760511c6b1b0bb5ff85a829c  results/drift_comparison.json
dcffb0e0f13ff4c28125e61d208cdc2d4c4fa8d36086aa56a3fdaa082d6db0dc  results/determinism_check.json
2be674788c8b5168348ae3e7f1157c807b18b65001ec581a45e501264aac2b95  results/offline_verification.json
391bfd85ce01eb5fa75b393ca2a69986e2114051db31403bc7d02efa522bfe26  results/conflict_resolution_log.json
d4dd4f5c4a777eac6d2954cd25030f74c9a4e7f27275f1c5e19fa21795d247c5  results/decay_sweep/decay_0.995.json
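
If you want to check these hashes yourself, a short script along these lines should do it. The function names are hypothetical, and the script assumes it is run from the repository root so the relative paths resolve.

```python
import hashlib
from pathlib import Path

# Hashes copied from the list above.
EXPECTED = {
    "results/drift_comparison.json":
        "ecfce79e1b80ab06a9c813e3233f634352d4064b760511c6b1b0bb5ff85a829c",
    "results/determinism_check.json":
        "dcffb0e0f13ff4c28125e61d208cdc2d4c4fa8d36086aa56a3fdaa082d6db0dc",
    "results/offline_verification.json":
        "2be674788c8b5168348ae3e7f1157c807b18b65001ec581a45e501264aac2b95",
    "results/conflict_resolution_log.json":
        "391bfd85ce01eb5fa75b393ca2a69986e2114051db31403bc7d02efa522bfe26",
    "results/decay_sweep/decay_0.995.json":
        "d4dd4f5c4a777eac6d2954cd25030f74c9a4e7f27275f1c5e19fa21795d247c5",
}


def sha256_of(path: Path) -> str:
    # Hash the file in chunks so large files do not need to fit in memory.
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(root: Path = Path(".")) -> dict:
    """Map each evidence file to True if it exists and its hash matches."""
    report = {}
    for rel, expected in EXPECTED.items():
        path = root / rel
        report[rel] = path.is_file() and sha256_of(path) == expected
    return report


if __name__ == "__main__":
    for name, ok in verify().items():
        print("OK      " if ok else "MISMATCH", name)
```

Re-running the experiments rewrites the result files, so a mismatch after a fresh run does not necessarily mean something is broken.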

Recent Updates (2026-01-30)

All 638 tests passed as of this update (the Unit Tests section above reports the current count). Here's what was fixed and added:

Context Enhancements

Hierarchical Belief Context
- Session-first lookup: beliefs from current conversation shown as "FROM THIS CONVERSATION"
- User-wide fallback: beliefs from previous sessions shown as "FROM PREVIOUS CONVERSATIONS"
- User isolation: lookups never go beyond the current user_id, so there is no cross-user data leakage
- LLM now distinguishes "just told me" vs "remembered from before"
Numeric Contradiction Detection
- Detects conflicting numeric values (e.g., "40 degrees" vs "70 degrees")
- Extracts numbers with unit context (degrees, dollars, percent, etc.)
- Triggers tension when same-topic statements have >20% numeric difference
- Reinforcement agent skips beliefs with conflicting numeric values
Hybrid LLM Provider (LLM_PROVIDER=hybrid)
- Routes queries between local Ollama and cloud OpenAI
- Local Ollama: belief-grounded responses, general conversation
- OpenAI: real-time queries (weather, traffic, news, stocks, search)
- Pattern-based detection for live information needs
- Saves API costs by only using cloud when necessary
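
The numeric contradiction check described above can be approximated as follows. The unit list, the regular expression, and the choice to measure the difference relative to the larger value are assumptions. The same-topic test that the real detector applies is omitted.

```python
import re
from typing import List, Tuple

# Hypothetical unit vocabulary; the real extractor likely covers more units.
NUMBER_WITH_UNIT = re.compile(
    r"(-?\d+(?:\.\d+)?)\s*(degrees|dollars|percent|%)", re.IGNORECASE
)


def extract(text: str) -> List[Tuple[float, str]]:
    """Return (value, unit) pairs found in the text."""
    return [(float(v), u.lower()) for v, u in NUMBER_WITH_UNIT.findall(text)]


def numeric_conflict(a: str, b: str, threshold: float = 0.20) -> bool:
    """True if two statements hold same-unit values differing by >20%."""
    for value_a, unit_a in extract(a):
        for value_b, unit_b in extract(b):
            if unit_a != unit_b:
                continue
            largest = max(abs(value_a), abs(value_b))
            # Relative difference, measured against the larger magnitude.
            if largest and abs(value_a - value_b) / largest > threshold:
                return True
    return False
```

With this definition, "40 degrees" against "70 degrees" counts as a conflict, while "70 degrees" against "72 degrees" does not.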

Bug Fixes

Mutation engineer confidence calculation
- Mutated beliefs preserve relative confidence with tension-based penalty
- Formula: original - 0.1 - (tension * 0.1), floor 0.3
RL environment action decoding
- Fixed boundary condition in test assertion (> changed to >=)
Circular import in storage/core modules
- Moved imports to function level with TYPE_CHECKING guard
ContradictionDetectedEvent enriched
- Added: contradicting_belief_id, belief_content, contradicting_content, similarity_score
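
The mutation confidence formula from the first fix above, written out as code. The function name is hypothetical, and the sketch applies the formula exactly as stated, with no extra clamping.

```python
def mutated_confidence(original: float, tension: float) -> float:
    """original - 0.1 - (tension * 0.1), with a floor of 0.3."""
    return max(0.3, original - 0.1 - tension * 0.1)
```

For example, a belief at 0.9 with tension 0.5 mutates to about 0.75. Note that, taken literally, the floor lifts any mutated belief to at least 0.3, even one whose original confidence was below that.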

Infrastructure

GitHub Actions CI Pipeline (.github/workflows/ci.yml)
- Runs pytest on push/PR to main
- Includes lint checks with ruff
SQLite Persistence (STORAGE_BACKEND=sqlite)
- Beliefs survive server restarts
- Async support via aiosqlite
Session Isolation (session_id on beliefs)
- Multi-user support ready
- Filter beliefs by session
Multiple LLM Providers
- Ollama (default), OpenAI, Anthropic, Hybrid
- Set via LLM_PROVIDER env var
User Authentication
- JWT-based login/register system
- SQLite persistence for user accounts
- Protected routes requiring authentication
- Beliefs associated with user accounts

Authentication

ABES includes a complete user authentication system:

Endpoints

Endpoint	Method	Description
`/auth/register`	POST	Create new account
`/auth/login`	POST	Login, returns JWT token
`/auth/me`	GET	Get current user (requires token)
`/auth/logout`	POST	Logout (client discards token)

How It Works

Register with email, name, password (min 6 chars)
Login to receive JWT token
Include token in Authorization: Bearer <token> header
Beliefs are associated with your user ID
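
A sketch of how a client might build these requests with the Python standard library. The endpoint paths, the Bearer header, and the 6-character minimum come from this section. The JSON body shape, the base URL (taken from the Quick Start port), and the function names are assumptions.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # backend port used in the Quick Start


def register_request(email: str, name: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /auth/register."""
    if len(password) < 6:
        raise ValueError("password must be at least 6 characters")
    # Assumed body shape: a flat JSON object with these three fields.
    body = json.dumps(
        {"email": email, "name": name, "password": password}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/auth/register",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def me_request(token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET to /auth/me."""
    return urllib.request.Request(
        f"{BASE_URL}/auth/me",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

With the backend running, either request can be sent with urllib.request.urlopen(request). The name of the token field in the login response is not documented here, so check the actual response before parsing it.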

Frontend

The Next.js frontend handles auth automatically:

Redirects to /login if not authenticated
Stores token in localStorage
Shows user name and logout button in header

User Data Storage

User accounts are stored in data/users.db (SQLite). This file is listed in .gitignore, so Git will not track it.

Contradiction Detection

The contradiction detection system uses semantic rule-based analysis with embedding similarity as a gate. When two beliefs have high embedding similarity, the semantic detector analyzes them for logical conflicts.

Architecture

Embedding Similarity Gate (threshold 0.5)
         │
         ▼
   Semantic Parser (spaCy)
         │
    ┌────┴────┐
    │         │
Proposition A  Proposition B
    │         │
    └────┬────┘
         ▼
   14 Contradiction Rules
         │
    ┌────┴────────────────────────────────────────────────────┐
    │  NEG_DIRECT                   QUANT_UNIVERSAL_VS_NONE   │
    │  NEG_PRED_FLIP                QUANT_UNIVERSAL_VS_EXIST  │
    │  MOD_NECESSARY_VS_IMPOSSIBLE  ENT_ATTRIBUTE             │
    │  MOD_FACTUAL_VS_POSSIBLE      ENT_EXCLUSIVE             │
    │  TEMP_SAME_ANCHOR             NUM_VALUE_CONFLICT        │
    │  NUM_UNIT_CONVERTED           NUM_COMPARATOR_CONFLICT   │
    └────┬────────────────────────────────────────────────────┘
         │
         ▼
   Confidence Score + Reason Codes
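
The embedding gate at the top of the diagram can be pictured as a plain similarity check. The sketch below assumes cosine similarity and an inclusive comparison against the 0.5 threshold. Both are assumptions, and the function names are hypothetical. In ABES the vectors would come from the configured sentence-transformer model rather than hand-written lists.

```python
import math
from typing import Sequence

GATE_THRESHOLD = 0.5  # threshold shown in the diagram above


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat it as unrelated.
    return dot / norm if norm else 0.0


def passes_gate(
    a: Sequence[float], b: Sequence[float], threshold: float = GATE_THRESHOLD
) -> bool:
    """Only pairs that pass the gate are handed to the semantic rules."""
    return cosine_similarity(a, b) >= threshold
```

The gate keeps the more expensive parsing step away from belief pairs that are about unrelated things.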

Benchmark Results

Tested against 70-case curated corpus inspired by:

SNLI (Stanford Natural Language Inference) - Bowman et al. 2015
MultiNLI (Multi-Genre NLI) - Williams et al. 2018
SICK (Sentences Involving Compositional Knowledge) - Marelli et al. 2014

Category	Legacy Detector	Semantic Detector	Δ
Quantifiers	54.5%	81.8%	+27.3%
Numeric/Units	66.7%	83.3%	+16.7%
Entity/Attribute	76.9%	84.6%	+7.7%
Negation	91.7%	83.3%	-8.3%
Modality	72.7%	45.5%	-27.3%
Temporal	63.6%	36.4%	-27.3%

Overall: Legacy 71.4% → Semantic 70.0%

The semantic detector excels where explicit rules exist (quantifiers, numerics, entity attributes) but struggles where spaCy parsing is ambiguous (modality, temporal). The system falls back to legacy heuristics when semantic parsing fails.

Source Files

File	Description
backend/core/bel/semantic_contradiction.py	Semantic detector implementation
backend/agents/contradiction_auditor.py	Integration with belief auditing
data/contradiction_corpus.json	70-case curated corpus
tests/core/test_semantic_contradiction.py	43 unit tests
experiments/contradiction_benchmark.py	Benchmark script
results/contradiction_benchmark.json	Benchmark artifact

Run the Benchmark

PYTHONPATH=$PWD python experiments/contradiction_benchmark.py

Limitations

No major limitations remain. All previously documented limitations have been resolved:

Previous Limitation	Resolution
Rule-based contradiction detection weak on modality/temporal	NLI model fallback (DeBERTa) for uncertain cases
LLM responses may contradict beliefs	Response validator with claim extraction and regeneration
Hybrid routing uses regex patterns	Zero-shot classifier with regex fallback

See the implementation files for details:

nli_detector.py - NLI fallback
response_validator.py - Response validation
query_classifier.py - Zero-shot query routing

Roadmap

Not yet implemented:

License

MIT 2025-2026 Bradley R. Kinnard
