Antonio-Tresol/daemon

daemon — algorithmic flow field in void, bone, and ember

daemon

watches the watchers

A monitoring system for long-running Claude Code agents.
built by agents, watched by daemon, improved by humans.

The Problem · What It Does · Harness Engineering · Setup · Architecture



When a codebase becomes an optimised, harness-engineered system, agents run for hours across multiple context compaction boundaries. They make thousands of tool calls, hit failures they silently recover from, and leave behind sessions that no human has time to read end-to-end.

Someone needs to watch the watchers.


daemon timeline — matryoshka exploration of agent sessions
the timeline — plans within plans, tasks within tasks, events within events


The problem

Agents work autonomously for hours. Sometimes while their operators sleep. A single Claude Code session can span hundreds of tool calls, multiple context compactions, and several hours of wall-clock time.

When something goes wrong, you get a failed PR.

When something goes subtly wrong, you get nothing at all: just a session that looks fine on the surface but made poor architectural decisions, repeated work across compaction boundaries, or silently swallowed errors that will compound later. Either way, the same questions go unanswered:

  What did the agent actually do during that six-hour session?
  Where did it fail, and what evidence exists?
  What patterns keep recurring?
  How can the harness be improved to prevent these failures?

What daemon does

Daemon ingests events from Claude Code sessions via HTTP hooks and OpenTelemetry, then lets you explore what happened at whatever depth you need.


◎ Multi-resolution exploration

Three levels of depth. Pick the one that matches your question.

  ┌─────────────────────────────────────────────────┐
  │  Narrative    what happened, in thirty seconds  │
  ├─────────────────────────────────────────────────┤
  │  Phases       research → implementation →       │
  │               testing → debugging               │
  ├─────────────────────────────────────────────────┤
  │  Events       every tool call, every decision   │
  └─────────────────────────────────────────────────┘

daemon drilldown — expanding a plan to see raw events
drill down — expand any plan to see the raw events underneath
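
One way to picture the three levels is as a nested structure that can be rendered at whichever resolution you ask for. The sketch below is illustrative only: the phase names come from the diagram above, while `SessionView`, `SessionEvent`, and `render` are hypothetical names, not daemon's actual types.

```typescript
// Phase names are taken from the diagram above; all other names are assumptions.
type PhaseKind = "research" | "implementation" | "testing" | "debugging";
type Resolution = "narrative" | "phases" | "events";

// Hypothetical shape of a single raw event (one tool call).
interface SessionEvent {
  id: string;
  tool: string;
  timestamp: number;
}

// A phase groups the raw events that belong to it.
interface Phase {
  kind: PhaseKind;
  events: SessionEvent[];
}

// The whole session: a short narrative on top, phases underneath.
interface SessionView {
  narrative: string;
  phases: Phase[];
}

// Returns display lines for the requested level of depth.
function render(view: SessionView, resolution: Resolution): string[] {
  switch (resolution) {
    case "narrative":
      return [view.narrative];
    case "phases":
      return view.phases.map((p) => `${p.kind} (${p.events.length} events)`);
    case "events": {
      const lines: string[] = [];
      for (const phase of view.phases) {
        for (const event of phase.events) {
          lines.push(`${phase.kind}: ${event.tool}`);
        }
      }
      return lines;
    }
  }
}
```

Each level is derived from the same underlying data, so drilling down never loses information: the narrative summarises the phases, and the phases are just groupings of events.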


✗ Failure analysis with evidence

Not just what failed — why, with the receipts. Every failure links back to its source events: the tool calls that preceded it, the error messages, the recovery attempts. The full causal chain.

Classified by impact (critical · warning · info) and type (tool_failure · api_error · logic_error · timeout · permission_denied).
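
The two classification axes map naturally onto union types. This is a minimal sketch, assuming a failure record that carries the ids of its source events; the impact and type labels are the ones listed above, but the `Failure` interface and both helper functions are hypothetical, not daemon's real schema.

```typescript
// Labels below are taken from the README; the record shape is an assumption.
type Impact = "critical" | "warning" | "info";
type FailureType =
  | "tool_failure"
  | "api_error"
  | "logic_error"
  | "timeout"
  | "permission_denied";

// Hypothetical failure record: the classification plus links to evidence.
interface Failure {
  id: string;
  impact: Impact;
  type: FailureType;
  summary: string;
  evidenceEventIds: string[]; // ids of the source events behind this failure
}

// Lower rank sorts first, so critical failures surface at the top.
const impactRank: Record<Impact, number> = { critical: 0, warning: 1, info: 2 };

// Returns a new array ordered by impact; the input is not mutated.
function sortByImpact(failures: Failure[]): Failure[] {
  return failures.slice().sort((a, b) => impactRank[a.impact] - impactRank[b.impact]);
}

// Counts failures per type, which is one simple way to spot recurring patterns.
function countByType(failures: Failure[]): Partial<Record<FailureType, number>> {
  const counts: Partial<Record<FailureType, number>> = {};
  for (const failure of failures) {
    counts[failure.type] = (counts[failure.type] ?? 0) + 1;
  }
  return counts;
}
```

Keeping the evidence as a list of event ids, rather than copying event data into the failure, means every failure can always be traced back to the raw events it was derived from.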


◆ Actionable improvement recommendations

Every failure pattern comes paired with a recommendation targeting the harness itself:

  hooks           auto-enforcement, pre/post-tool hooks
  skills          reusable /commands for common workflows
  subagents       agent teams for parallel work
  tools           MCP servers, integrations
  context         CLAUDE.md, architecture docs
  architecture    layer boundaries, structural lints
  legibility      agent-friendly code organisation

The agent works → daemon watches → the human improves the harness → the agent works better.
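
As a rough illustration of how recommendations could be organised by the part of the harness they target, the sketch below groups them by category. The seven category names are the ones listed above; the `Recommendation` shape and the helper functions are assumptions made for the example.

```typescript
// Category names come from the list above; everything else is hypothetical.
const HARNESS_TARGETS = [
  "hooks",
  "skills",
  "subagents",
  "tools",
  "context",
  "architecture",
  "legibility",
] as const;
type HarnessTarget = (typeof HARNESS_TARGETS)[number];

// Hypothetical recommendation record pairing a failure pattern with an action.
interface Recommendation {
  target: HarnessTarget;
  failurePattern: string;
  action: string;
}

// Type guard for validating untrusted strings (e.g. model output) as categories.
function isHarnessTarget(value: string): value is HarnessTarget {
  return (HARNESS_TARGETS as readonly string[]).indexOf(value) !== -1;
}

// Groups recommendations by the harness category they target.
function groupByTarget(
  recs: Recommendation[],
): Partial<Record<HarnessTarget, Recommendation[]>> {
  const groups: Partial<Record<HarnessTarget, Recommendation[]>> = {};
  for (const rec of recs) {
    const bucket = groups[rec.target] ?? [];
    bucket.push(rec);
    groups[rec.target] = bucket;
  }
  return groups;
}
```

Grouping by target gives the human reviewing the session one list per lever they can actually pull, which fits the loop described above.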


● Live session monitoring

daemon session — live event stream with tool calls and failures
a live session — 1597 events, tool calls streaming in, failures highlighted in ember




Harness engineering

Daemon is built on the principles of harness engineering — the discipline of designing environments where agents can do reliable, autonomous work.

The harness is the infrastructure surrounding a long-running agent: the CLAUDE.md that gives it direction, the hooks that enforce invariants, the tools that extend its capabilities, the feedback loops that catch failures before they compound. Engineering this well is what separates a productive agent from one that drifts.

The foundational ideas draw from:

"Harness engineering: leveraging Codex in an agent-first world" — the engineer's job is no longer writing code but designing environments, specifying intent, and building feedback loops.

"Effective Harnesses for Long-Running Agents" — agents working across context compaction boundaries need structured artifacts to bridge the gap between sessions.

"Long-running Claude for scientific computing" — progress files as portable long-term memory, reference implementations as test oracles, git as the coordination mechanism.

Where harness engineering tells you how to build the environment, daemon tells you how well the environment is working.

See docs/harness-engineering.md for the deep dive.




How it works

Daemon monitors Claude Code sessions through two channels:

  Claude Code ──── HTTP hooks ───→ POST /api/events ───→  ┐
                                                          ├──→ SQLite ──→ Analysis ──→ UI
  Claude Code ──── OpenTelemetry ─→ POST /api/otel ────→  ┘
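
To make the two channels concrete, here is a small sketch that prepares a POST request for either endpoint without sending it. The two paths come from the diagram above; the payload shape, the JSON content type, and the `http://localhost:3000` base URL used in the usage note are assumptions, and the real event schema may differ (see docs/setup.md for the actual configuration).

```typescript
// Endpoint paths are taken from the diagram above.
type Channel = "hooks" | "otel";
const CHANNEL_PATHS: Record<Channel, string> = {
  hooks: "/api/events",
  otel: "/api/otel",
};

// Hypothetical hook event payload; daemon's real schema may differ.
interface HookEvent {
  sessionId: string;
  kind: string;
  toolName?: string;
  timestamp: string;
}

// A plain description of the request, so it can be inspected or sent later.
interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request for a channel. JSON encoding is an assumption here.
function prepareRequest(baseUrl: string, channel: Channel, payload: unknown): PreparedRequest {
  const trimmed = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: trimmed + CHANNEL_PATHS[channel],
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}
```

With a running instance, the prepared request could be sent with `fetch(req.url, req)`, for example after calling `prepareRequest("http://localhost:3000", "hooks", event)`.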

When you trigger analysis, daemon uses the Claude Agent SDK with structured outputs to process the session's event stream and produces typed timeline, failure, and improvement results.

Get started →

See docs/setup.md for installation and configuration.




Architecture

Domain-Driven Design backend. Feature-Sliced Design frontend. Three colours.

  src/
    server/
      domain/           pure entities, repository interfaces, ports
      application/      use cases orchestrating domain + infrastructure
      infrastructure/   SQLite, Claude Agent SDK runner, GraphQL
    shared/             UI primitives, utilities, hooks
    entities/           entity models and display components
    features/           timeline · failures · improvements · session · harness
    app/                Next.js pages and API routes
    prompts/            LLM prompt templates for analysis
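
The split between domain, application, and infrastructure can be sketched as a repository interface (a port) with a swappable adapter. This is a minimal illustration under assumed names: `AgentSession`, `SessionRepository`, and `recordEvent` are hypothetical, and the in-memory class stands in for the SQLite adapter the project actually uses.

```typescript
// domain/: a pure entity and a repository interface. Names are hypothetical.
interface AgentSession {
  id: string;
  startedAt: string;
  eventCount: number;
}

interface SessionRepository {
  save(session: AgentSession): void;
  findById(id: string): AgentSession | undefined;
}

// infrastructure/: an in-memory adapter standing in for the SQLite one.
class InMemorySessionRepository implements SessionRepository {
  private readonly rows: Record<string, AgentSession> = {};

  save(session: AgentSession): void {
    this.rows[session.id] = { ...session }; // store a copy, not the caller's object
  }

  findById(id: string): AgentSession | undefined {
    const row = this.rows[id];
    return row ? { ...row } : undefined;
  }
}

// application/: a use case that depends only on the port, never on the adapter.
function recordEvent(repo: SessionRepository, sessionId: string, now: string): AgentSession {
  const existing = repo.findById(sessionId);
  const updated: AgentSession = existing
    ? { ...existing, eventCount: existing.eventCount + 1 }
    : { id: sessionId, startedAt: now, eventCount: 1 };
  repo.save(updated);
  return updated;
}
```

Because the use case only sees `SessionRepository`, it can be tested against the in-memory adapter while production wiring supplies a database-backed one.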

See docs/architecture.md for the full technical breakdown.


Design

  VOID     #0a0a0a    neon black, the deepest background
  BONE     #f0ece5    neon white, warm Anthropic parchment
  EMBER    #d4a574    Claude's warm amber, the only accent

Three colours. Symbols for status. Typography for hierarchy. Nothing else.

See DESIGN.md for the full design system.




Status

Proof of concept · v0.1

Daemon currently monitors Claude Code only. The core timeline, failure analysis, and improvement recommendation features are functional. The system was used to monitor its own development — Claude Code agents built daemon while daemon watched them work.


Documentation

  docs/setup.md                 installation, hook connection, agent API
  docs/architecture.md          DDD backend, FSD frontend, data flows
  docs/harness-engineering.md   founding principles, daemon's position
  DESIGN.md                     three-colour visual language

daemon watches the watchers.
