WBChain3/research_loop

agentloop — ICM-aligned MCP server for multi-agent research orchestration

A sequential multi-agent research loop exposed as a Model Context Protocol (MCP) server over stdio. Each research run creates a versioned, inspectable filesystem tree under runs/<uuid>/ following the Interpretable Context Methodology (ICM, Van Clief & McDermott 2026, arXiv:2603.16021).

What it does: takes a question and runs researcher_a (first pass) → researcher_b (second pass, with the first-pass findings as input) → consolidator (synthesis). sem_debug checks for drift at every stage transition; when it finds drift, the run pauses until the user resumes it through the resume protocol.

API-agnostic: works with any OpenAI-compatible endpoint. The default provider is ollama-cloud; override the endpoint with the OLLAMA_BASE_URL environment variable.


Architecture

The filesystem is the pipeline. Every run lives in runs/<uuid>/ with the following 5-layer ICM structure:

runs/<uuid>/
├── CONTEXT.md            ← run-level question and routing decisions
├── state.json            ← current run state
├── decisions.md          ← append-only audit trail with YAML frontmatter
├── 01_research_a/
│   ├── CONTEXT.md        ← stage contract (from prompts/researcher_a.md)
│   ├── findings.md       ← model output
│   └── decisions.md      ← stage-level decision trail
├── 02_research_b/
│   ├── CONTEXT.md
│   ├── findings.md
│   └── decisions.md
└── 03_consolidate/
    ├── CONTEXT.md
    ├── findings.md       ← final synthesized report
    └── decisions.md
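
As an illustration of the layout above, here is a minimal sketch that scaffolds an empty run tree. The function name `scaffold_run`, the `state.json` schema, and the content written to `CONTEXT.md` are assumptions made for this example; the project's own run manager may create these files differently.

```python
import json
import uuid
from pathlib import Path

# Stage directory names taken from the tree above.
STAGES = ("01_research_a", "02_research_b", "03_consolidate")


def scaffold_run(root: Path, question: str) -> Path:
    """Create runs/<uuid>/ with run-level files and one folder per stage."""
    run_dir = root / "runs" / str(uuid.uuid4())
    run_dir.mkdir(parents=True)
    (run_dir / "CONTEXT.md").write_text(f"# Question\n\n{question}\n", encoding="utf-8")
    # Assumed schema: the README does not specify what state.json contains.
    (run_dir / "state.json").write_text(json.dumps({"state": "INITIALIZED"}), encoding="utf-8")
    (run_dir / "decisions.md").touch()
    for stage in STAGES:
        stage_dir = run_dir / stage
        stage_dir.mkdir()
        # findings.md is left out on purpose: it is model output written later.
        (stage_dir / "CONTEXT.md").touch()
        (stage_dir / "decisions.md").touch()
    return run_dir
```

Because every artifact is a plain file, a run can be inspected with ordinary tools (`ls`, `cat`, `git diff`) at any point in the pipeline.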

Agents are roles, not separate processes. The MCP server selects the stage contract (prompts/<role>.md) for each invocation.

| Role | Purpose | Stage |
| --- | --- | --- |
| orchestrator | Routes commands, manages run lifecycle, makes resume decisions | meta |
| researcher_a | First-pass research; produces initial findings | 01_research_a |
| researcher_b | Second-pass research; reads researcher_a's findings as input | 02_research_b |
| consolidator | Synthesizes both passes into a final report | 03_consolidate |

State machine:

INITIALIZED → RESEARCHING_A → RESEARCHING_B → CONSOLIDATING → DONE

If sem_debug detects drift at any transition, the run pauses at AWAIT_USER. The user resumes via resume_run with one of four actions: accept, rerun, rerun_strict, or abort.
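
The transition rule described above can be sketched as a small function. This is a simplified model under stated assumptions: `RunState` and `next_state` are hypothetical names, and the effect of each resume action (accept, rerun, rerun_strict, abort) is not modelled because the README does not define it.

```python
from enum import Enum


class RunState(str, Enum):
    INITIALIZED = "INITIALIZED"
    RESEARCHING_A = "RESEARCHING_A"
    RESEARCHING_B = "RESEARCHING_B"
    CONSOLIDATING = "CONSOLIDATING"
    DONE = "DONE"
    AWAIT_USER = "AWAIT_USER"


# Happy-path order from the state machine diagram.
PIPELINE = [
    RunState.INITIALIZED,
    RunState.RESEARCHING_A,
    RunState.RESEARCHING_B,
    RunState.CONSOLIDATING,
    RunState.DONE,
]


def next_state(current: RunState, drift_detected: bool) -> RunState:
    """Return the state after one transition; drift diverts to AWAIT_USER."""
    if current in (RunState.DONE, RunState.AWAIT_USER):
        raise ValueError(f"cannot advance from {current.value}")
    if drift_detected:
        return RunState.AWAIT_USER
    return PIPELINE[PIPELINE.index(current) + 1]
```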

Stage contracts are defined in prompts/<role>.md using YAML frontmatter:

---
stage: 01_research_a
stage_role: first-pass research
model: kimi-k2.6:cloud
agent_id: researcher_a
inputs:
  - field: question
    source: ../CONTEXT.md
outputs:
  - field: findings
    path: findings.md
    contract: "Markdown analysis. Last line must be exactly: PASS_1_COMPLETE"
sem_debug_check:
  trigger: after_outputs
  check: "Does findings.md address the question directly without unsupported claims?"
  on_drift: AWAIT_USER
---
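
To make the contract concrete, the following sketch separates the frontmatter from the body of a stage contract and checks the sentinel rule from the `contract` field. Both helper names are hypothetical, and tolerating trailing newlines after the sentinel is an assumption; the actual workspace parser may be stricter or looser.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (frontmatter, body) for a document that opens with a '---' line."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            # The frontmatter string could then be handed to yaml.safe_load (pyyaml).
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    raise ValueError("unterminated frontmatter")


def meets_sentinel_contract(findings: str, sentinel: str = "PASS_1_COMPLETE") -> bool:
    """True if the last line of findings is exactly the sentinel."""
    # Assumption: trailing newlines after the sentinel are tolerated,
    # but trailing spaces or extra text on the line are not.
    lines = findings.rstrip("\n").splitlines()
    return bool(lines) and lines[-1] == sentinel
```

A check like this is cheap to run before any semantic drift check, since a missing sentinel usually means the model output was truncated.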

Prerequisites

  • Python 3.11+
  • An account with ollama-cloud (or any OpenAI-compatible API endpoint)
  • sem_debug (optional, for drift detection; install separately from WBChain3/sem_debug)

Install

git clone https://github.com/WBChain3/research_loop.git
cd research_loop
pip install -e .

Dependencies are managed in pyproject.toml:

  • pyyaml, httpx, python-dotenv, pydantic, mcp>=1.0

Config

cp .env.example .env
# Edit .env and set OLLAMA_API_KEY=your_key_here

| Variable | Required | Description |
| --- | --- | --- |
| OLLAMA_API_KEY | Yes | API key for ollama-cloud |
| OLLAMA_BASE_URL | No | Defaults to https://ollama.com/v1; override for local endpoints |
| OPENAI_API_KEY | No | Alternative provider |
| ANTHROPIC_API_KEY | No | Alternative provider |
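
A minimal sketch of how these variables might be resolved at startup, assuming the default URL from the table. `ProviderConfig` and `load_config` are hypothetical names for this example, and handling of the alternative provider keys is left out because their behaviour is not documented here.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ProviderConfig:
    api_key: str
    base_url: str


def load_config(env: Mapping[str, str] = os.environ) -> ProviderConfig:
    """Read provider settings, failing fast if the required key is missing."""
    api_key = env.get("OLLAMA_API_KEY", "")
    if not api_key:
        raise RuntimeError("OLLAMA_API_KEY is required")
    # An unset or empty OLLAMA_BASE_URL falls back to the documented default.
    base_url = env.get("OLLAMA_BASE_URL") or "https://ollama.com/v1"
    return ProviderConfig(api_key=api_key, base_url=base_url)
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.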

Usage

Start the MCP server:

python -m agent_loop.cli

The server runs over stdio and listens for JSON-RPC messages as defined by MCP. Connect with any MCP client (Claude Desktop, mcp-client, etc.).

Full lifecycle example:

  1. Create a run via start_research with a question.
  2. Advance via advance_run (call 3× for the full pipeline).
  3. Check status via get_run_status at any time.
  4. If drift is detected, the run enters AWAIT_USER. Resume via resume_run with accept, rerun, rerun_strict, or abort.
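
The lifecycle steps above can be sketched as a transport-agnostic driver. It takes a `call_tool(name, arguments)` callable, which you would back with your MCP client of choice. The argument names (`question`, `run_id`, `action`) and the `state` key in responses are assumptions based on the tool descriptions, not a confirmed schema.

```python
from typing import Any, Callable

ToolCall = Callable[[str, dict[str, Any]], dict[str, Any]]


def run_pipeline(call_tool: ToolCall, question: str, drift_action: str = "accept") -> dict[str, Any]:
    """Drive one run until it reports DONE, returns an error, or hits the call limit."""
    status = call_tool("start_research", {"question": question})
    run_id = status["run_id"]
    # Bound the loop so a misbehaving server cannot keep the client spinning.
    for _ in range(10):
        if "error" in status or status.get("state") == "DONE":
            break
        if status.get("state") == "AWAIT_USER":
            # Drift was detected: apply the caller's chosen resume action.
            status = call_tool("resume_run", {"run_id": run_id, "action": drift_action})
        else:
            status = call_tool("advance_run", {"run_id": run_id})
    return status
```

Automatically accepting drift defeats the purpose of the pause, so in practice the `AWAIT_USER` branch would usually prompt a person instead of using a fixed `drift_action`.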

MCP Tools (6)

| Tool | Purpose |
| --- | --- |
| start_research | Create a new research run. Returns run_id, state, context_path. |
| advance_run | Execute the next stage. Call once per stage. Returns the new state, completed stage, and findings path. |
| get_run_status | Query the current state of a run. |
| list_runs | List all runs, newest first. |
| resume_run | Resume a paused run (AWAIT_USER). Actions: accept, rerun, rerun_strict, abort. |
| abort_run | Terminate a run idempotently. |

Each tool returns a dict-shaped response. On error, the response contains {"error": "...", "state": "..."}; raw exceptions never leak to the client.
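
One way to get this error shape is a decorator around each tool handler. The sketch below is an assumption about the approach, not the server's actual code; `safe_tool` and `get_state` are hypothetical names.

```python
import functools
from typing import Any, Callable


def safe_tool(get_state: Callable[[], str]):
    """Wrap a tool handler so any exception becomes an error dict."""
    def decorator(handler: Callable[..., dict[str, Any]]):
        @functools.wraps(handler)
        def wrapper(*args: Any, **kwargs: Any) -> dict[str, Any]:
            try:
                return handler(*args, **kwargs)
            except Exception as exc:
                # Report the message plus the run's current state, never a traceback.
                return {"error": str(exc), "state": get_state()}
        return wrapper
    return decorator


@safe_tool(lambda: "RESEARCHING_A")
def failing_tool(run_id: str) -> dict[str, Any]:
    raise ValueError(f"unknown run: {run_id}")
```

Calling `failing_tool("abc")` returns `{"error": "unknown run: abc", "state": "RESEARCHING_A"}` instead of raising.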


Test

pytest tests/

116 tests, no live model calls, ~26s run time. The suite covers model client, workspace parser, state manager, run manager, trace, prompts, and MCP server.


Decision trail

Every decisions.md has YAML frontmatter (timestamp, actor, stage, event, related_artifacts, summary) and a free-form body. The frontmatter is the structured audit record; the body is for human readers. Every state transition and every resume action is recorded.
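
A minimal sketch of appending one entry with the fields listed above. The on-disk layout (one frontmatter block per entry, appended in sequence) and the name `append_decision` are assumptions; the project's trace module may format entries differently.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_decision(path: Path, actor: str, stage: str, event: str,
                    summary: str, related_artifacts: list[str], body: str = "") -> None:
    """Append one decision entry; existing entries are never rewritten."""
    fields = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "stage": stage,
        "event": event,
        "related_artifacts": related_artifacts,
        "summary": summary,
    }
    # JSON strings and lists are generally valid YAML flow values,
    # so json.dumps handles quoting without a YAML dependency.
    front = "\n".join(f"{key}: {json.dumps(value)}" for key, value in fields.items())
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"---\n{front}\n---\n{body}\n\n")
```

Opening the file in append mode keeps earlier entries untouched, which matches the append-only intent of the trail.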


License

MIT
