Shadowbook

`bd` — keep the story straight, even when the work isn't

$ bd pacman

╭──────────────────────────────────────────────────────────╮
│  ᗧ····○ bd-abc····○ bd-xyz····○ bd-123 ····◐            │
╰──────────────────────────────────────────────────────────╯

YOU: claude | SCORE: 3 dots | #1 codex (5 pts)

$ bd recent --all

test-f2y [P1] Implement OAuth login  ● volatile  ○ open  just now
└─ ● specs/auth.md  ✓ active  ● volatile  just now
test-sgo [P3] Update README  ○ stable  ○ open  just now
└─ ● specs/docs.md  ✓ active  ○ stable  1m ago

Summary: 2 beads, 2 specs | Active: 2 pending | Momentum: 4 items today

One command. Beads, specs, skills—nested by relationship. Drift called out. No guesswork.

Built on beads.

The Formula‑1 Story

Shadowbook is race control for agentic engineering.

Specs are the track. Beads are the cars. Skills are the pit crew. Wobble is tire degradation. Volatility is track instability. Drift is when the car runs a different line than the one you designed.

Shadowbook keeps the race safe:

It flags when the track is changing while cars are already at speed.
It shows which cars are on worn tires (unstable skills) and which are safe to push.
It pauses risky runs when the track is breaking apart.
It gives you a clean lap chart of what's actually happening, not what you hoped happened.

Agent teams are the pit wall, coordinating multiple cars from a single screen:

bd team plan is race strategy: which car runs which stint, in what order, on which tires.
bd team watch is live telemetry: speed, gaps, tire wear, updated every few seconds.
bd team score is championship points: pacman dots awarded per completed stint.
bd team wobble is the post-race debrief: did drivers follow the strategy or freelance?
bd team gate is track inspection: is the circuit safe to race, or is the surface breaking up?
File disjointness is the rule that two cars can't occupy the same piece of track at the same time.

In Formula‑1 terms: Shadowbook is the difference between "full send" and a DNF you didn't see coming.

Six Drifts, One Tool

Drift	Problem	Solution
Spec Drift	Spec changes, code builds old version	`bd spec scan`
Skill Drift	Skills diverge or collide across environments	`bd preflight --check`, `bd skills collisions`
Visibility Drift	Can't see what's active	`bd recent --all`
Stability Drift	Specs churning while work in flight	`bd spec volatility`
Behavioral Drift	Claude "helpfully" deviates from instructions	`bd wobble scan`
Comment Drift	Comments rot while code evolves	`bd cc scan`, `bd cc drift`

Quick Start

curl -fsSL https://raw.githubusercontent.com/anupamchugh/shadowbook/main/scripts/install.sh | bash
cd your-project && bd init && mkdir -p specs
bd recent --all

Dogfooding: Real Numbers

Ran on a 683-spec production codebase (trading platform, 14 months of specs):

Metric	Before	After
Total specs	683	365
Exact duplicates	75	0
Ghost registry entries	393	0
Lines removed	—	110,326
Specs linked to beads	13	13 (preserved)
Time to clean	—	~5 minutes

What bd found that manual review missed:

75 files duplicated between specs/active/ and specs/reference/ (1.00 similarity)
243 specs older than 7 days with no linked beads (pure noise)
393 stale registry entries pointing to already-deleted files

The totals line up: 683 specs minus the 75 duplicates and the 243 unlinked stale specs leaves 365.

Snap Streaks

Track spec stability over time. Like Snapchat streaks, but for specs.

$ bd spec volatility --trend specs/auth.md

  Week 1: ████████░░  8 changes
  Week 2: █████░░░░░  5 changes
  Week 3: ██░░░░░░░░  2 changes
  Week 4: ░░░░░░░░░░  0 changes

Status: DECREASING
Prediction: Safe to resume work in ~5 days

Declining = stabilizing. Flat at zero = locked down. Increasing = chaos growing.
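
The reading rule above can be sketched in a few lines of Python. This is an illustration only: the README does not say how bd derives the status, so the comparison of first and last week and every label other than DECREASING are assumptions.

```python
def classify_trend(weekly_changes):
    """Label a series of weekly change counts, oldest week first.

    Hypothetical rule: compare the first and last week. bd may use a
    different method (for example a fitted slope).
    """
    if len(weekly_changes) < 2:
        return "UNKNOWN"
    if all(count == 0 for count in weekly_changes):
        return "LOCKED"  # flat at zero = locked down
    first, last = weekly_changes[0], weekly_changes[-1]
    if last < first:
        return "DECREASING"  # declining = stabilizing
    if last > first:
        return "INCREASING"  # increasing = chaos growing
    return "FLAT"


print(classify_trend([8, 5, 2, 0]))  # DECREASING
```

The input `[8, 5, 2, 0]` is the four-week series from the sample output above.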

Badges everywhere:

$ bd list --show-volatility
  bd-42  [● volatile] Implement login    in_progress
  bd-44  [○ stable]    Update README     pending

$ bd ready
○ Ready (stable): 1. Update README
● Caution (volatile): 1. Implement login (5 changes/30d, 3 open)

Cascade impact:

$ bd spec volatility --with-dependents specs/auth.md

specs/auth.md (● HIGH: 5 changes, 3 open)
├── bd-42: Implement login ← DRIFTED
│   └── bd-43: Add 2FA (blocked)
└── bd-44: RBAC redesign

Impact: 3 issues at risk
Recommendation: STABILIZE
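
One way to picture the impact count is a breadth-first walk from the spec to its linked beads and on to anything they block. The sketch below assumes that model; the `SPEC_LINKS` and `BLOCKS` dictionaries are hypothetical stand-ins for whatever bd stores internally.

```python
from collections import deque

# Hypothetical data mirroring the tree above:
# spec -> beads linked to it, bead -> beads it blocks.
SPEC_LINKS = {"specs/auth.md": ["bd-42", "bd-44"]}
BLOCKS = {"bd-42": ["bd-43"]}


def issues_at_risk(spec, spec_links, blocks):
    """Return every bead reachable from the spec, in breadth-first order."""
    seen = []
    queue = deque(spec_links.get(spec, []))
    while queue:
        bead = queue.popleft()
        if bead in seen:
            continue  # a bead can be reachable through several paths
        seen.append(bead)
        queue.extend(blocks.get(bead, []))
    return seen


at_risk = issues_at_risk("specs/auth.md", SPEC_LINKS, BLOCKS)
print(f"Impact: {len(at_risk)} issues at risk")  # Impact: 3 issues at risk
```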

CI gate:

bd spec volatility --fail-on-high  # Exit 1 if HIGH volatility

Auto-pause:

bd config set volatility.auto_pause true
bd resume --spec specs/auth.md  # Unblock after stabilization

Spec Drift Detection

bd create "Implement login" --spec-id specs/login.md
# ... spec changes ...
bd spec scan
● SPEC CHANGED: specs/login.md → bd-a1b2 unaware

bd list --spec-changed    # Find drifted issues
bd update bd-a1b2 --ack-spec  # Acknowledge
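
The README does not describe how `bd spec scan` notices a change. A common approach, assumed here purely for illustration, is to store a content hash of the spec when a bead last acknowledged it and compare it on each scan. The field names (`spec_id`, `acked_hash`) are hypothetical.

```python
import hashlib


def spec_hash(text):
    """Content hash of a spec's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_drifted(beads, current_specs):
    """Return ids of beads whose spec text no longer matches the acked hash.

    beads: list of dicts with hypothetical keys id, spec_id, acked_hash.
    current_specs: mapping of spec_id -> current spec text.
    """
    drifted = []
    for bead in beads:
        text = current_specs.get(bead["spec_id"])
        if text is None:
            continue  # a missing spec is a different problem than drift
        if spec_hash(text) != bead["acked_hash"]:
            drifted.append(bead["id"])
    return drifted


original = "Login via password\n"
bead = {"id": "bd-a1b2", "spec_id": "specs/login.md", "acked_hash": spec_hash(original)}

# ... spec changes ...
specs = {"specs/login.md": "Login via OAuth\n"}
print(find_drifted([bead], specs))  # ['bd-a1b2']

# Acknowledging the change stores the new hash, which clears the flag.
bead["acked_hash"] = spec_hash(specs["specs/login.md"])
print(find_drifted([bead], specs))  # []
```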

Spec Radar Flow

Treat it like a daily weather report for specs.

# Morning: see what moved
bd spec delta

# Midday: clean up ideas
bd spec triage --sort status

# Weekly: generate a briefing
bd spec report --out .beads/reports

# Cleanup day: align lifecycle with reality (confirm before apply)
bd spec sync --apply

Quick reads:

bd spec stale shows age buckets.
bd spec duplicates surfaces overlap.
bd spec report combines summary, triage, staleness, duplicates, delta, and volatility.

Skill Sync

bd preflight --check
✓ Skills: 47/47 synced
✓ Specs: 12 tracked
● Volatility: 2 specs have high churn

bd preflight --check --auto-sync  # Fix drift

Wobble: Measure the Drift

     You write the recipe. Claude edits it.

     Expected:  bd list --created-after=$(date -v-1d) --sort=created
     Actual:    bd list --status=in_progress  ← "I thought this would help"

                    ᗧ····~····~····~····
                         wobble →

Based on Anthropic's "Hot Mess of AI" paper: extended reasoning amplifies incoherence. Wobble catches it.

$ bd wobble scan --from-sessions --days 7

┌─ WOBBLE SCAN: REAL SESSION DATA ───────────────────────┐
│ Analyzed 18 skills with REAL session data             │
└────────────────────────────────────────────────────────┘

┌─ WOBBLE REPORT: my-skill (REAL DATA) ──────────────────┐
│ Invocations: 6                                         │
│ Exact Match Rate: 33%                                  │
│ Variants Found: 5                                      │
│ Wobble Score: 0.85                                     │
│                                                        │
│ VERDICT: ● UNSTABLE                                    │
└────────────────────────────────────────────────────────┘

The formula (from the paper):

Wobble = Variance / (Bias² + Variance)

High wobble = Claude does something different every time
High bias   = Claude consistently does the wrong thing
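
As a worked example, the formula can be evaluated directly. The inputs below are made-up numbers, not measurements, and the README does not say how bd estimates bias and variance from session data.

```python
def wobble(bias, variance):
    """Wobble = Variance / (Bias^2 + Variance).

    Returns 0.0 when both terms are zero (no error at all), which is an
    assumption made here to avoid dividing by zero.
    """
    denominator = bias ** 2 + variance
    if denominator == 0:
        return 0.0
    return variance / denominator


# Mostly inconsistent: error is dominated by variance.
print(wobble(bias=0.5, variance=0.75))  # 0.75

# Mostly consistently wrong: error is dominated by bias.
print(wobble(bias=1.5, variance=0.25))  # 0.1
```

A score near 1 means the error comes almost entirely from run-to-run variation; a score near 0 means the skill is reliably off in the same direction.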

Structural risk factors that predict high wobble:

No EXECUTE NOW section with explicit command
Multiple options without (default) marker
Content > 4000 chars (Claude overthinks)
Missing "DO NOT IMPROVISE" constraint
Numbered steps without clear default
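
The risk factors above lend themselves to simple text checks. The sketch below is a rough heuristic, not bd's actual analyzer: it folds the two "default" factors into one check and does not verify that the EXECUTE NOW section contains an explicit command.

```python
import re


def wobble_risk_factors(skill_text):
    """Return a list of structural risk factors found in a skill file."""
    factors = []
    if "EXECUTE NOW" not in skill_text:
        factors.append("no EXECUTE NOW section")
    if "do not improvise" not in skill_text.lower():
        factors.append('missing "DO NOT IMPROVISE" constraint')
    if len(skill_text) > 4000:
        factors.append("content over 4000 chars")
    # A line starting with "1. ", "2. ", ... counts as a numbered step.
    has_numbered_steps = re.search(r"^\s*\d+\.\s", skill_text, re.MULTILINE) is not None
    if has_numbered_steps and "(default)" not in skill_text:
        factors.append("numbered steps without a (default) marker")
    return factors


risky = "1. Try option A\n2. Or option B\n"
for factor in wobble_risk_factors(risky):
    print(factor)
```

For the two-line `risky` example this reports three factors: the missing EXECUTE NOW section, the missing constraint, and the numbered steps without a default.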

Two modes:

# Simulated analysis (fast, no history needed)
bd wobble scan my-skill

# Real session analysis (parses actual Claude behavior)
bd wobble scan --from-sessions --days 14

# Rank all skills by risk
bd wobble scan --all --top 10

# Project health audit
bd wobble inspect . --fix

Drift dashboard:

bd drift

Shows last wobble scan, stable/wobbly/unstable counts, skills fixed since last scan, and spec/bead drift summary.

Cascade impact:

bd cascade beads

Lists known dependents from the wobble store (.beads/wobble/skills.json).

Fixing wobbly skills:

## EXECUTE NOW

**Run this immediately:**
```bash
your-exact-command --with-flags
```

Do NOT improvise. Run the command above first.

Auto-Compaction

bd spec candidates        # Score specs for archival
bd spec compact specs/old.md --summary "Done. 3 endpoints."
bd close bd-xyz --compact-spec --compact-skills
Comment Drift Detection

Comments break silently. bd codecomment (alias: bd cc) treats them as tracked entities.

$ bd cc scan

Scanning comments...
  ├─ 15,816 comments found (3,389 doc, 35 todo, 9 invariant, 46 reference, 12,337 inline)
  ├─ 50 cross-references detected
  ├─ 22 broken references found
  ├─ 538 files scanned
  └─ Completed in 226ms

$ bd cc drift

┌─ COMMENT DRIFT REPORT ─────────────────────────────────────┐
│ BROKEN REFERENCES (8):                                      │
│   🔴 sync_branch.go:178 → autoflush.go:findJSONLPath       │
│ STALE COMMENTS (189):                                       │
│   ⚠️  types.go:873 → code changed 76 days after comment     │
│ EXPIRED TODOs (5):                                          │
│   ⏰ beads.go:310 → TODO is 104 days old                   │
└─────────────────────────────────────────────────────────────┘

$ bd cc links --broken    # Show only broken cross-references
$ bd cc links --file auth.go  # Reference graph for one file
$ bd cc scan --json       # JSON output for CI

Uses go/ast for parsing, git blame --porcelain (per-file, batched) for staleness, and stores the comment graph in .beads/comments.db.
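
The staleness figures in the report are date gaps. As a minimal sketch, assuming the two dates come from blame data for the comment line and the code it describes, the gap can be computed like this (the dates are invented so that the result matches the 76-day example):

```python
from datetime import date


def days_stale(comment_date, code_date):
    """Days the code changed after the comment was last touched.

    Returns 0 when the comment is as new as, or newer than, the code.
    """
    return max(0, (code_date - comment_date).days)


# Hypothetical blame dates for a comment and the code beneath it.
print(days_stale(date(2025, 1, 1), date(2025, 3, 18)))  # 76
```

bd itself is written in Go and parses comments with go/ast; Python is used here only to show the arithmetic.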

Commands

Command	Action
`bd recent --all`	Activity dashboard with volatility
`bd ready`	Work queue, partitioned by volatility
`bd ready --mine`	Work queue filtered to your assignments
`bd list --show-volatility`	Badges: ● volatile / ○ stable
`bd spec scan`	Detect spec changes
`bd spec stale`	Show specs by staleness bucket
`bd spec triage`	Triage specs/ideas by age and git status
`bd spec duplicates`	Find duplicate or overlapping specs
`bd spec delta`	Show spec changes since last scan
`bd spec report`	Generate full spec radar report
`bd spec align`	Spec ↔ bead ↔ code alignment report
`bd spec sync`	Sync spec lifecycle from linked beads
`bd spec volatility`	List specs by stability
`bd spec volatility --trend <spec>`	4-week visual trend
`bd spec volatility --with-dependents <spec>`	Cascade impact
`bd spec volatility --recommendations`	Action items
`bd spec volatility --fail-on-high`	CI gate
`bd preflight --check`	Skills + specs + volatility
`bd resume --spec <path>`	Unblock paused issues
`bd assign <id> --to <agent>`	Assign a bead to someone
`bd wobble scan <skill>`	Analyze skill for drift risk
`bd wobble scan --all`	Rank all skills by wobble risk
`bd wobble scan --from-sessions`	Use REAL session data
`bd wobble inspect .`	Project skill health audit
`bd drift`	Wobble + spec/bead drift summary
`bd cascade <skill>`	Wobble cascade impact from stored dependents
`bd agent state <id> <state>`	Set agent state (idle/running/stuck/done)
`bd agent heartbeat <id>`	Update agent alive timestamp
`bd agent show <id>`	Show agent bead details
`bd slot set <id> hook <bead>`	Attach work to agent's hook
`bd slot show <id>`	Show agent's current work
`bd slot clear <id> hook`	Detach work from agent
`bd reflect`	Session-end retrospective (close beads, capture lessons, flag debt)
`bd reflect --non-interactive`	Summary only, no prompts
`bd pacman`	Pacman mode: dots (ready work), blockers, leaderboard
`bd pacman --pause "reason"`	Pause signal for other agents (file-based)
`bd pacman --resume`	Clear pause signal
`bd pacman --join`	Register agent in .beads/agents.json
`bd pacman --eat <id>`	Close task + increment score (hidden flag)
`bd pacman --global`	Workspace-wide view across all projects
`bd pacman --badge`	Generate GitHub profile badge
`bd team plan <epic>`	Epic DAG → team execution plan (JSON or human-readable)
`bd team watch`	Live dashboard of agent team progress
`bd team score`	Pacman leaderboard for team session
`bd team wobble`	Post-session drift check: did agents follow briefs?
`bd team gate <spec>`	Spec volatility check before team assignment
`bd team report`	Full post-mortem with per-agent metrics

Pacman Mode (Multi-Agent)

Gamified task management for coordinating multiple agents. No server required.

$ bd pacman

╭──────────────────────────────────────────────────────────╮
│  ᗧ····○ bd-abc····○ bd-xyz····○ bd-123 ····◐            │
╰──────────────────────────────────────────────────────────╯

YOU: claude
SCORE: 3 dots

DOTS NEARBY:
  ○ bd-abc ● P1 "Implement login flow"
  ○ bd-xyz ● P2 "Add retry logic"

ACHIEVEMENTS:
  ✓ First Blood
  ✓ Streak 5
  ✓ Ghost Buster

BLOCKERS:
  ● bd-456 blocked by bd-789

LEADERBOARD:
  #1 codex   5 pts
  #2 claude  3 pts

Tip: `bd pacman --global` aggregates dots and scores across your workspace.

All tasks done? Pacman clears the maze:

╭──────────────────────────────────────────────────────────╮
│  ᗧ····················✓ CLEAR!                            │
╰──────────────────────────────────────────────────────────╯

Multi-Agent Scenarios

Two agents, same project:

# Codex joins and works
AGENT_NAME=codex bd pacman --join
bd pacman --eat bd-123              # Close + score

# You check progress
bd pacman                           # See leaderboard

Session handoff (day → night):

# End of day
git push

# Codex overnight
git pull && AGENT_NAME=codex bd pacman --join
bd pacman --eat bd-456
git push

# Next morning
git pull && bd pacman               # See overnight work

Emergency stop all agents:

bd pacman --pause "PRODUCTION DOWN"
# Every agent's next bd command shows warning

bd pacman --resume                  # After incident

Workspace-Wide View

$ bd pacman --global

╭──────────────────────────────────────────────────────────╮
│  GLOBAL PACMAN · 5 projects · 42 dots · 8 ghosts        │
╰──────────────────────────────────────────────────────────╯

YOU: claude
TOTAL SCORE: 15 dots across all projects

PROJECTS:
  18○ project-alpha              (5 pts) ◐3
  12○ project-beta               (3 pts) ◐5
  8○  api-backend                (2 pts)
  4○  mobile-app                 (5 pts)
  ✓   my-tool                    (10 pts)

Files (All Git-Tracked)

.beads/
├── agents.json       # Who's playing
├── scoreboard.json   # Points per agent
└── pause.json        # Pause signal (when active)
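
A file-based pause signal needs very little machinery. The sketch below shows the idea with a `pause.json` file inside a `.beads/` directory; the JSON schema (a single `reason` field) and the function names are assumptions, not bd's real format.

```python
import json
import tempfile
from pathlib import Path


def pause(beads_dir, reason):
    """Write the pause signal. Other agents see it after a git pull."""
    (beads_dir / "pause.json").write_text(json.dumps({"reason": reason}))


def resume(beads_dir):
    """Clear the pause signal if it exists."""
    (beads_dir / "pause.json").unlink(missing_ok=True)


def pause_reason(beads_dir):
    """Return the pause reason, or None when no pause is active."""
    path = beads_dir / "pause.json"
    if not path.exists():
        return None
    return json.loads(path.read_text()).get("reason")


with tempfile.TemporaryDirectory() as tmp:
    beads_dir = Path(tmp) / ".beads"
    beads_dir.mkdir()
    pause(beads_dir, "PRODUCTION DOWN")
    print(pause_reason(beads_dir))  # PRODUCTION DOWN
    resume(beads_dir)
    print(pause_reason(beads_dir))  # None
```

Because the signal is just a tracked file, it travels with the repository and needs no running service.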

Why Files, Not Server?

Aspect	Server	Files
Agent dies	Inbox stuck	Files persist
10 projects	10 registrations	0 registrations
Sync	MCP calls	Git pull/push

Agent Teams Bridge

Status: Designed, not yet shipped. The primitives exist (bd agent, bd slot, bd gate, bd assign), but the bd team orchestration layer is planned for a future release. The design below shows the intended UX.

bd team bridges beads (where work is tracked) to agent teams (where work is executed). Orchestrator-agnostic — outputs JSON that Claude Code, Codex, or any orchestrator can consume.

Plan: Epic DAG → Team Execution Plan

$ bd team plan beads-abc

╭─ Team Plan: IST Normalization + Security Hardening ─────────╮
│                                                              │
│  Wave 1 (parallel):                                          │
│    ○ beads-123  Create time_utils.py          [2 files]      │
│    ○ beads-456  Security audit                [2 files]      │
│    ○ beads-789  Infra health check            [0 files]      │
│                                                              │
│  Wave 2 (parallel, after wave 1):                            │
│    ○ beads-012  Apply IST to resim            [1 file]       │
│      └─ blocked by: beads-123                                │
│                                                              │
│  Validation:                                                 │
│    ✓ File-disjoint (no conflicts)                            │
│    ✓ Max parallelism: 3 agents                               │
│    ✓ Spec volatility: LOW (all specs stable)                 │
│                                                              │
╰──────────────────────────────────────────────────────────────╯

Add --format json for machine-readable output that any orchestrator can pipe directly into team creation.
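
The plan above combines two ideas: layering a dependency graph into waves, and checking that beads in the same wave touch different files. Since `bd team` is not yet shipped, the following is only a sketch of how those two steps could work; the function names and file lists are hypothetical.

```python
def plan_waves(deps):
    """Group beads into waves. deps maps bead -> set of beads blocking it.

    Every blocker must itself be a key in deps; otherwise the bead can
    never become ready and the function reports a cycle.
    """
    remaining = {bead: set(blockers) for bead, blockers in deps.items()}
    done = set()
    waves = []
    while remaining:
        ready = sorted(b for b, blockers in remaining.items() if blockers <= done)
        if not ready:
            raise ValueError("dependency cycle detected")
        waves.append(ready)
        done.update(ready)
        for bead in ready:
            del remaining[bead]
    return waves


def file_conflicts(wave, files):
    """Return files claimed by more than one bead in the same wave."""
    owners = {}
    for bead in wave:
        for path in files.get(bead, ()):
            owners.setdefault(path, []).append(bead)
    return {path: beads for path, beads in owners.items() if len(beads) > 1}


deps = {
    "beads-123": set(),
    "beads-456": set(),
    "beads-789": set(),
    "beads-012": {"beads-123"},
}
# Hypothetical file assignments; only the counts appear in the plan above.
files = {
    "beads-123": ["time_utils.py", "tests/test_time_utils.py"],
    "beads-456": ["auth.py", "audit.md"],
    "beads-012": ["resim.py"],
}

for number, wave in enumerate(plan_waves(deps), start=1):
    status = "file-disjoint" if not file_conflicts(wave, files) else "CONFLICT"
    print(f"Wave {number}: {wave} ({status})")
```

With this input the first wave holds the three unblocked beads and the second holds `beads-012`, matching the sample plan.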

Watch: Live Agent Dashboard

$ bd team watch

╭─ Team: plan-execution-feb06 ────────────── 03:05:12 IST ───╮
│                                                              │
│  Agents:                                                     │
│    ist-engineer      ● working   Task #1 (IST utility)       │
│    hardening-eng     ● working   Task #3 (Security)          │
│    watchlist-eng     ● working   Task #4 (Snapshot)          │
│    infra-eng         ○ idle      (completed #5, #6)          │
│                                                              │
│  Tasks:                                                      │
│    #1 [████████░░] in_progress  IST utility + resim          │
│    #2 [░░░░░░░░░░] blocked     IST paper daemon (→ #1)      │
│    #3 [██████░░░░] in_progress  Security + async             │
│    #4 [████░░░░░░] in_progress  Watchlist snapshot           │
│    #5 [██████████] completed   Resim runner + board          │
│    #6 [██████████] completed   Health check                  │
│                                                              │
│  Progress: 2/6 done │ 3 active │ 1 blocked                  │
│  Pacman:  infra-eng 2 🟡  others 0 🟡                       │
╰──────────────────────────────────────────────────────────────╯

Reads from ~/.claude/teams/ and ~/.claude/tasks/. Refreshes automatically.

Why It Matters

Before	After
~5 min manual `TaskCreate × N`	`bd team plan` in 2 seconds
No visibility from bd	Real-time dashboard with `bd team watch`
Manual bead closure	Auto-close when team tasks complete
No quality check	`bd team wobble` scores agent fidelity
No post-mortem	`bd team report` — one command

Documentation

Snap Streaks — Volatility tracking guide
User Manual — Full usage
Architecture — How it works
AGENTS.md — Agent workflow

Why "Shadowbook"?

Every spec casts a shadow over code. When the spec moves, the shadow should move too.

MIT License · Built on beads
