agent-doc Functional Specification

Language-independent specification for the agent-doc interactive document session tool. This document captures the exact behavior a port must reproduce.

1. Overview

agent-doc manages interactive document sessions between a human and an AI agent. The human edits a markdown document, sends diffs to the agent, and the agent's response is appended. Session state is tracked via YAML frontmatter, snapshots, and git commits.

2. Document Format

2.1 Session Document

Frontmatter fields:

agent_doc_session: Document/routing UUID — permanent identifier for tmux pane routing. Legacy alias: session (read but not written).
agent: Agent backend name (overrides config default)
model: Model override (passed to agent backend)
branch: Reserved for branch tracking

All fields are optional and default to null. The body alternates ## User and ## Assistant blocks.

2.2 Frontmatter Parsing

Delimited by ---\n at file start and closing \n---\n. If absent, all fields default to null and entire content is the body.

2.3 Components

Documents can contain named, re-renderable regions called components:

<!-- agent:status -->
content here
<!-- /agent:status -->

Marker format:  (open) and  (close). Names must match [a-zA-Z0-9][a-zA-Z0-9-]*. Components are patched via agent-doc patch.

Per-component behavior is configured in .agent-doc/components.toml (see §7.20).

3. Snapshot System

3.1 Storage

Snapshots live in .agent-doc/snapshots/ relative to CWD. Path: sha256(canonical_path) + ".md".

3.2 Lifecycle

Save: After successful run, full content saved as snapshot
Load: On next run, loaded as "previous" state for diff
Delete: On reset, snapshot removed
Missing: Diff treats previous as empty (entire doc is the diff)

4. Diff Computation

Line-level unified diff via similar crate. Returns +/-/ prefixed lines, or None if unchanged.

Skill-level behavior: The /agent-doc Claude Code skill strips HTML comments () and link reference comments ([//]: # (...)) from both the snapshot and current content before diff comparison. This ensures that comments serve as a user scratchpad without triggering agent responses. This stripping is performed by the skill workflow (SKILL.md §2), not by the CLI itself.

5. Agent Backend

5.1 Trait

fn send(prompt, session_id, fork, model) -> (text, session_id)

5.2 Resolution Order

CLI --agent flag
Frontmatter agent field
Config default_agent
Fallback: "claude"

5.3 Claude Backend

Default: claude -p --output-format json --permission-mode acceptEdits. Session handling: --resume {id} or --continue --fork-session. Appends --append-system-prompt with document-mode instructions. Removes CLAUDECODE env var. Parses JSON: result, session_id, is_error.

5.4 Custom Backends

Config overrides command and args for any agent name.

6. Config

Location: {XDG_CONFIG_HOME}/agent-doc/config.toml (default ~/.config/agent-doc/config.toml).

Fields: default_agent, [agents.{name}] with command, args, result_path (reserved), session_path (reserved).

7. Commands

7.1 run

agent-doc run <FILE> [-b] [--agent NAME] [--model MODEL] [--dry-run] [--no-git]

Compute diff → 2. Build prompt (diff + full doc) → 3. Branch if -b → 4. Send to agent → 5. Update session ID → 6. Append response → 7. Save snapshot → 8. git add -f + commit

First run prompt wraps full doc in <document> tags. Subsequent wraps diff in <diff> tags + full doc in <document>.

7.2 init

agent-doc init <FILE> [TITLE] [--agent NAME] — scaffolds frontmatter + ## User block. Fails if exists.

7.3 diff

agent-doc diff <FILE> — prints unified diff to stdout.

7.4 reset

agent-doc reset <FILE> — clears session ID, deletes snapshot.

7.5 clean

agent-doc clean <FILE> — squashes all agent-doc: commits for file into one via git reset --soft.

7.6 audit-docs

agent-doc audit-docs [--root DIR] — checks CLAUDE.md/AGENTS.md/README.md/SKILL.md for tree path accuracy, line budget (1000), staleness, and actionable content. Exit 1 on issues.

--root DIR overrides auto-detection of the project root directory. Without it, the root is resolved via project markers (Cargo.toml, package.json, etc.), then .git, then CWD fallback.

7.7 start

agent-doc start <FILE> — start Claude in a new tmux pane and register the session.

Ensure session UUID in frontmatter (generate if missing)
Read $TMUX_PANE (must be inside tmux)
Register session → pane in sessions.json
Exec claude (replaces process)

7.8 route

agent-doc route <FILE> [--pane P] — route a /agent-doc command to the correct tmux pane.

Prune stale entries from sessions.json
Ensure session UUID in frontmatter (generate if missing)
Look up pane in sessions.json
If pane alive → tmux send-keys /agent-doc <FILE> + Enter, focus pane
If pane dead or unregistered → lazy-claim to active pane in claude tmux session (or --pane P), register, send command, auto-sync layout for all files in the same window
If no active pane available → auto-start cascade (create session/window, start Claude, register)

7.9 claim

agent-doc claim <FILE> [--position left|right|top|bottom] [--window W] [--pane P] — claim a document for a tmux pane.

Ensure session UUID in frontmatter (generate if missing)
Resolve effective window (see Window Resolution below)
Determine pane: --pane P overrides, else --position resolves via tmux pane geometry, else $TMUX_PANE
Register session → pane in sessions.json, including window ID

Unlike start, does not launch Claude — the caller is already inside a Claude session. Last-call-wins: a subsequent claim for the same file overrides the previous pane mapping. --position is used by the JetBrains plugin to map editor split positions to tmux panes.

Window Resolution:

When --window W is provided:

Check if window W is alive (tmux list-panes -t W)
If alive → use W (no change)
If dead → scan sessions.json for entries with matching project cwd and non-empty window field. For each, check liveness. Use first alive match.
If no alive windows found → fall through to no-window behavior (position detection without window scoping)

This prevents the JetBrains plugin from hitting persistent error balloons when a tmux window dies. The same fallback pattern is used in sync.rs for dead --window handling.

Notifications:

tmux display-message — 3-second overlay on the target pane showing "Claimed {file} (pane {id})"
.agent-doc/claims.log — appends Claimed {file} for pane {id} for deferred display by the SKILL.md workflow on next invocation

7.10 focus

agent-doc focus <FILE> [--pane P] — focus the tmux pane for a session document.

Read session UUID from file's YAML frontmatter (or use --pane override)
Look up pane ID in sessions.json
Run tmux select-window -t <pane-id> then tmux select-pane -t <pane-id>

Exits with error if the pane is dead or no session is registered.

7.11 layout

agent-doc layout <FILE>... [--split h|v] [--window W] — arrange tmux panes to mirror editor split layout.

Resolve each file to its session pane via frontmatter → sessions.json
If --window given, filter to panes registered for that window only
Pick the target window (the one containing the most wanted panes; tiebreak: most total panes)
Break out only registered session panes that aren't wanted (shells and tool panes are left untouched)
Join remaining wanted panes into the target window (tmux join-pane)
Focus the first file's pane (the most recently selected file)

--split h (default): horizontal/side-by-side. --split v: vertical/stacked. Single file falls back to focus. Dead panes and files without sessions are skipped with warnings.

7.12 resync

agent-doc resync — validate sessions.json against live tmux panes.

Load sessions.json
For each entry, check if the pane is alive via tmux has-session
Remove entries with dead panes
Report removed entries and remaining active sessions

7.13 prompt

agent-doc prompt <FILE> — detect permission prompts from a Claude Code session.

Captures tmux pane content, strips ANSI, searches for numbered-option patterns
Returns JSON: { "active": bool, "question": str, "options": [...] }
--answer N navigates to option N and confirms
--all polls all live sessions, returns JSON array

7.14 commit

agent-doc commit <FILE> — git add + commit with auto-generated timestamp.

7.15 skill

agent-doc skill install — write the bundled SKILL.md to .claude/skills/agent-doc/SKILL.md in the current project. Idempotent (skips if content matches).

agent-doc skill check — compare installed skill vs bundled version. Exit 0 if up to date, exit 1 if outdated or missing.

The bundled SKILL.md contains an agent-doc-version frontmatter field set to the binary's version at build time. When the skill is invoked via Claude Code, the pre-flight step compares this field against the installed binary version (agent-doc --version). If the binary is newer, agent-doc skill install runs automatically to update the skill before proceeding.

7.16 outline

agent-doc outline <FILE> [--json] — display markdown section structure with line counts and approximate token counts.

Read file, skip YAML frontmatter
Parse #-prefixed headings into a section tree
For each section: heading text, depth, line number, content lines, approximate tokens (bytes/4)
Content before the first heading appears as (preamble)

Default output: indented text table. --json outputs a JSON array of section objects (heading, depth, line, lines, tokens).

7.17 upgrade

agent-doc upgrade — check crates.io for latest version, upgrade via GitHub Releases binary download → cargo install → pip install (cascade).

Startup version check: On every invocation (except upgrade itself), warn_if_outdated queries crates.io (with a 24h cache at ~/.cache/agent-doc/version-cache.json) and prints a one-line stderr warning if a newer version is available. Errors are silently ignored so normal operation is never blocked.

7.18 plugin

agent-doc plugin install <EDITOR> — download and install the editor plugin from the latest GitHub Release.

agent-doc plugin update <EDITOR> — update an installed plugin to the latest version.

agent-doc plugin list — list available editor plugins and their install status.

Supported editors: jetbrains, vscode. Downloads plugin assets from GitHub Releases (btakita/agent-doc). Prefers signed assets (*-signed.zip) when available, falling back to unsigned. Auto-detects standard plugin directories for each editor (e.g., JetBrains plugin dir via idea.plugins.path or platform defaults, VS Code ~/.vscode/extensions/).

7.19 sync

agent-doc sync --col <FILES>,... [--col <FILES>,...] [--window W] [--focus FILE] — declarative 2D layout sync.

Mirrors a columnar editor layout in tmux. Each --col is a comma-separated list of files. Columns arrange left-to-right; files stack top-to-bottom within each column.

Reconciliation algorithm (simple 2-step detach/attach):

SNAPSHOT — query current pane order in target window
FAST PATH — if current order matches desired, done
DETACH — break-pane unwanted panes out of target window (panes stay alive in solo windows)
ATTACH — join-pane missing desired panes into target window (isolate from shared windows first, then join with correct split direction: -h for columns, -v for stacking)
REORDER — if all panes present but wrong order, break non-first panes out and rejoin in order
VERIFY — confirm final layout matches desired order

7.20 patch

agent-doc patch <FILE> <COMPONENT> [CONTENT] — replace content in a named component.

Read the document and parse component markers (...)
Find the named component (error if not found)
Read replacement content from the positional argument or stdin
Load component config from .agent-doc/components.toml (if present)
Apply pre_patch hook (stdin: content, stdout: transformed content; receives COMPONENT and FILE env vars)
Apply mode: replace (default), append (add after existing), or prepend (add before existing)
If timestamp is true, prefix entry with ISO 8601 UTC timestamp
If max_entries > 0 (append/prepend only), trim to last N non-empty lines
Write updated document
Save snapshot relative to project root
Run post_patch hook (fire-and-forget; receives COMPONENT and FILE env vars)

Component markers: .... Names must match [a-zA-Z0-9][a-zA-Z0-9-]*.

Component config (.agent-doc/components.toml):

[component-name]
mode = "replace"       # "replace" (default), "append", "prepend"
timestamp = false      # Auto-prefix with ISO timestamp
max_entries = 0        # Trim old entries (0 = unlimited)
pre_patch = "cmd"      # Shell command: stdin→stdout transform
post_patch = "cmd"     # Shell command: fire-and-forget

7.21 watch

agent-doc watch [--stop] [--status] [--debounce MS] [--max-cycles N] — watch session files for changes and auto-submit.

Watches files registered in sessions.json for modifications (via notify crate)
On file change (after debounce), runs submit::run() on the changed file
Reactive mode: Stream-mode documents (agent_doc_mode: stream) are discovered with reactive: true and use zero debounce (Duration::ZERO) for instant re-submit on file change. Reactive paths are tracked in a HashSet<PathBuf>.
Loop prevention: changes within the debounce window after a submit are treated as agent-triggered; agent-triggered changes increment a cycle counter; if content hash matches previous submit, stop (convergence); hard cap at --max-cycles (default 3)
--stop sends SIGTERM to the running daemon (via .agent-doc/watch.pid)
--status reports whether the daemon is running
--debounce sets the debounce delay in milliseconds (default 500)

8. Session Routing

8.1 Registry

sessions.json maps document session UUIDs to tmux panes:

{
  "cf853a21-...": {
    "pane": "%4",
    "pid": 12345,
    "cwd": "/path/to/project",
    "started": "2026-02-25T21:24:46Z",
    "file": "tasks/plan.md",
    "window": "1"
  }
}

Multiple documents can map to the same pane (one Claude session, multiple files). The window field (optional) enables window-scoped routing — claim --window and layout --window use it to filter panes to the correct IDE window.

8.2 Use Cases

#	Scenario	Command	What Happens
U1	First session for a document	`agent-doc start plan.md`	Creates tmux pane, launches Claude, registers pane
U2	Submit from JetBrains plugin	Plugin `Ctrl+Shift+Alt+A`	Calls `agent-doc route <file>` → sends to registered pane
U3	Submit from Claude Code	`/agent-doc plan.md`	Skill invocation — diff, respond, write back
U4	Claim file for current session	`/agent-doc claim plan.md`	Skill delegates to `agent-doc claim` → updates sessions.json
U5	Claim after manual Claude start	`/agent-doc claim plan.md`	Fixes stale pane mapping without restarting
U6	Claim multiple files	`/agent-doc claim a.md` then `/agent-doc claim b.md`	Both files route to same pane
U7	Re-claim after reboot	`/agent-doc claim plan.md`	Overrides old pane mapping (last-call-wins)
U8	Pane dies, plugin submits	Plugin `Ctrl+Shift+Alt+A`	`route` detects dead pane → auto-start cascade
U9	Install skill in new project	`agent-doc skill install`	Writes bundled SKILL.md to `.claude/skills/agent-doc/`
U10	Check skill version after upgrade	`agent-doc skill check`	Reports "up to date" or "outdated"
U11	Permission prompt from plugin	PromptPoller polls `prompt --all`	Shows bottom bar with numbered hotkeys in IDE
U12	Claim notification in session	Skill reads `.agent-doc/claims.log`	Prints claim records, truncates log
U13	Clean up dead pane mappings	`agent-doc resync`	Removes stale entries from sessions.json

8.3 Claim Semantics

claim binds a document to a tmux pane, not a Claude session. The pane is the routing target — route sends keystrokes to the pane. Claude sessions come and go (restart, resume), but the pane persists. If Claude restarts on the same pane, routing still works without re-claiming.

Last-call-wins: any claim overwrites the previous mapping for that document's session UUID.

9. Git Integration

Commit: git add -f {file} (bypasses .gitignore) + git commit -m "agent-doc: {timestamp}" --no-verify
Branch: git checkout -b agent-doc/{filestem}
Squash: soft-reset to before first agent-doc: commit, recommit as one

FilesExpand file tree

SPEC.md

Latest commit

History