
Commit d0c8477

ourines and claude committed
feat: add auto-recovery monitor for self-healing task execution
- Add monitor.py: background process that detects errors (rate limit, API error, timeout) and auto-resumes
- Update launch.py: automatically starts monitor (use --no-monitor to disable)
- Update cleanup.py: kills monitor process on cleanup
- Update task-prompt-template.md: guide Claude to write checkpoints in .worktree-task/progress.json
- Fix on-session-start.py: handle installed_plugins.json array format
- Add CLAUDE.md for project documentation
- Bump version to 1.1.0

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
1 parent 0076d3d commit d0c8477

7 files changed, +600 -24 lines changed

.claude-plugin/plugin.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "worktree-task",
-  "version": "1.0.0",
+  "version": "1.2.0",
   "description": "Manage large coding tasks using git worktrees and background Claude Code sessions. Supports launching, monitoring, resuming, and cleanup of autonomous tasks with alert notifications.",
   "author": {
     "name": "ourines"

CLAUDE.md

Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+A Claude Code plugin that enables spawning autonomous agent sessions in separate git worktrees via tmux. Manages large coding tasks that would otherwise overflow context by delegating phases to subagents.
+
+## Architecture
+
+```
+├── .claude-plugin/plugin.json # Plugin manifest (name, version, repo)
+├── skills/worktree-task/SKILL.md # Auto-trigger skill definition
+├── commands/*.md # Slash command definitions (/worktree-task:*)
+├── hooks/
+│   ├── hooks.json # Hook registrations (SessionStart, Stop, SessionEnd)
+│   └── handlers/*.py # Hook implementations
+├── scripts/*.py # Core Python scripts for task management
+└── references/*.md # Prompt templates for spawned agents
+```
+
+### Core Flow
+
+1. **launch.py** - Creates worktree + tmux session, launches agent (Claude Code or Codex), sends task prompt
+2. **status.py** - Captures tmux pane output, shows git info from worktree
+3. **resume.py** - Detects error type (rate_limit, api_error, timeout), sends recovery message
+4. **cleanup.py** - Kills tmux session, optionally removes worktree
+5. **merge.py / rebase.py** - Spawns Claude in tmux to auto-resolve git conflicts
+
+### Key Design Patterns
+
+- **tmux as state**: Sessions track running tasks, pane capture provides status
+- **Worktrees for isolation**: Each task gets its own working directory
+- **Template substitution**: `$TASK_DESCRIPTION`, `$WORKTREE_DIR` in `references/*.md`
+- **Hook-based updates**: `on-session-start.py` checks GitHub Releases API for updates (24h cache)
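
Review note on the template-substitution pattern: the `$TASK_DESCRIPTION` and `$WORKTREE_DIR` placeholders happen to match the syntax of Python's `string.Template`. The sketch below shows one way such placeholders could be filled. It is an assumption for illustration, not the code in `launch.py` (whose `load_task_template` body is not part of this diff); the `render_prompt` name and the sample template text are hypothetical.

```python
from string import Template


def render_prompt(template_text: str, task_desc: str, worktree_dir: str) -> str:
    """Fill the two placeholders named in CLAUDE.md.

    safe_substitute() leaves any other $name in the template untouched
    instead of raising KeyError.
    """
    return Template(template_text).safe_substitute(
        TASK_DESCRIPTION=task_desc,
        WORKTREE_DIR=worktree_dir,
    )


# Hypothetical template text, not the contents of references/*.md.
sample = "Task: $TASK_DESCRIPTION\nWork only inside: $WORKTREE_DIR\nKeep $UNRELATED as-is."
rendered = render_prompt(sample, "Add login page", "/tmp/myapp-feature-login")
print(rendered)
```

`safe_substitute` is chosen here because a prompt template may legitimately contain other `$` sequences that should pass through unchanged.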
+
+## Common Commands
+
+```bash
+# Test a script directly
+python3 scripts/launch.py feature/test "Test task description"
+python3 scripts/status.py # List all sessions
+python3 scripts/status.py <session-name> # Detailed session status
+python3 scripts/resume.py <session> --check # Check without sending message
+python3 scripts/cleanup.py <session> --remove-worktree
+
+# Manual tmux operations
+tmux list-sessions
+tmux attach -t <session>
+tmux kill-session -t <session>
+
+# Git worktree operations
+git worktree list
+git worktree remove <path>
+```
+
+## Agent Command Options
+
+The plugin supports multiple agent backends via launch.py:
+
+| Flag | Command |
+|------|---------|
+| (default) | `claude --dangerously-skip-permissions` |
+| `--codex` | `codex --yolo -m gpt-5.1-codex-max -c model_reasoning_effort="high"` |
+| `--agent-cmd "<cmd>"` | Custom command |
+| `--env KEY=VALUE` | Add environment variables |
+
+## Conventions
+
+- Session names: `<project>-<branch>` with `/` and `.` replaced by `-`
+- Worktree paths: `../<project>-<branch-safe>` (parent directory)
+- Temp files: `/tmp/claude_task_prompt.txt`, `/tmp/claude_merge_prompt.txt`
+- Cache: `~/.claude/plugins/.worktree-task-update-cache.json`
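
Review note on the naming conventions: a minimal sketch of how the session name and worktree path could be derived from a branch name, assuming the `/` and `.` replacement applies to the branch part only. The helper names and the example project path are hypothetical; the actual derivation lives in `launch.py` and is not shown in this diff.

```python
from pathlib import Path


def safe_branch(branch: str) -> str:
    """Replace the characters the convention calls out: '/' and '.' become '-'."""
    return branch.replace("/", "-").replace(".", "-")


def session_name(project_dir: Path, branch: str) -> str:
    # <project>-<branch>, with the branch made tmux-friendly
    return f"{project_dir.name}-{safe_branch(branch)}"


def worktree_path(project_dir: Path, branch: str) -> Path:
    # Sibling of the project directory: ../<project>-<branch-safe>
    return project_dir.parent / f"{project_dir.name}-{safe_branch(branch)}"


project = Path("/home/dev/myapp")  # hypothetical checkout location
print(session_name(project, "feature/v1.2"))
print(worktree_path(project, "feature/v1.2"))
```

Under this reading the session name already contains the project name, which is worth keeping in mind when reading code that builds a worktree path from a project name plus a session name.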
+
+## Hook System
+
+Hooks are defined in `hooks/hooks.json` and triggered by Claude Code events:
+
+- **SessionStart** (startup): Checks for plugin updates via GitHub API
+- **Stop**: Task completion notification (macOS `osascript`)
+- **SessionEnd**: Session end handler
+
+Hook handlers receive JSON on stdin and output JSON with optional `systemMessage`.
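
Review note on the hook contract: the stdin-JSON in, stdout-JSON out shape described above can be sketched as follows. This is an illustration under assumptions, not one of the handlers in `hooks/handlers/`. The `source` field on the event and the message text are assumptions made for the example; only the optional `systemMessage` key comes from the text above.

```python
import io
import json
from typing import TextIO


def build_reply(event: dict) -> dict:
    """Return the hook's reply; `systemMessage` is only set when there is something to say."""
    reply: dict = {}
    # Assumption for illustration: the event carries a "source" field
    # and this handler only speaks up on "startup".
    if event.get("source") == "startup":
        reply["systemMessage"] = "worktree-task: session started"
    return reply


def run_hook(stdin: TextIO, stdout: TextIO) -> None:
    raw = stdin.read()
    try:
        event = json.loads(raw) if raw.strip() else {}
    except json.JSONDecodeError:
        event = {}  # malformed input: reply with an empty object rather than crash
    if not isinstance(event, dict):
        event = {}
    json.dump(build_reply(event), stdout)


# Demo with in-memory streams; a real handler would pass sys.stdin and sys.stdout.
out = io.StringIO()
run_hook(io.StringIO('{"source": "startup"}'), out)
print(out.getvalue())
```

Keeping the reply logic in a pure function makes the handler testable without a running Claude Code session.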

hooks/handlers/on-session-start.py

Lines changed: 13 additions & 2 deletions
@@ -100,8 +100,19 @@ def get_installed_plugin_info() -> dict:
         with open(plugins_file, "r") as f:
             data = json.load(f)
 
-        plugin_info = data.get("plugins", {}).get(PLUGIN_ID, {})
-        if plugin_info:
+        plugin_list = data.get("plugins", {}).get(PLUGIN_ID, [])
+        # plugin_list is an array (may have multiple scopes: user, local, etc.)
+        # Prefer "user" scope, fallback to first entry
+        plugin_info = None
+        if isinstance(plugin_list, list) and plugin_list:
+            for entry in plugin_list:
+                if entry.get("scope") == "user":
+                    plugin_info = entry
+                    break
+            if not plugin_info:
+                plugin_info = plugin_list[0]
+
+        if plugin_info and isinstance(plugin_info, dict):
             result["version"] = plugin_info.get("version", "")
             result["gitCommitSha"] = plugin_info.get("gitCommitSha", "")
             result["installPath"] = plugin_info.get("installPath", "")
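
Review note: with this change a value that is still a plain dict (the shape the removed lines handled) leaves `plugin_info` as `None`, and a non-dict list entry would raise on `entry.get`. If both shapes can occur in practice, the selection could be factored into a small helper along these lines. This is a sketch only; `pick_plugin_entry` is a hypothetical name, and whether older `installed_plugins.json` files still need support is an assumption.

```python
from typing import Any, Optional


def pick_plugin_entry(entries: Any) -> Optional[dict]:
    """Choose one plugin record from either the dict or the list format.

    List format: prefer the entry whose scope is "user", else the first dict entry.
    Dict format: return it as-is (None if empty).
    """
    if isinstance(entries, dict):
        return entries or None
    if not isinstance(entries, list):
        return None
    dict_entries = [e for e in entries if isinstance(e, dict)]
    for entry in dict_entries:
        if entry.get("scope") == "user":
            return entry
    return dict_entries[0] if dict_entries else None


print(pick_plugin_entry([
    {"scope": "local", "version": "1.0.0"},
    {"scope": "user", "version": "1.2.0"},
]))
```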

references/task-prompt-template.md

Lines changed: 52 additions & 7 deletions
@@ -44,6 +44,39 @@ Task tool:
 4. **Update TodoWrite** - Mark completed, add discovered tasks
 5. **Verify at end** - Build passes, tests pass
 
+## Checkpoint System
+
+A monitor process is watching this session and will auto-recover from errors (rate limits, API errors, etc.).
+To help recovery, maintain checkpoint state in `.worktree-task/progress.json`:
+
+### Update Checkpoint After Each Phase
+
+```python
+# Read current state
+import json
+from pathlib import Path
+
+progress_file = Path(".worktree-task/progress.json")
+progress = json.loads(progress_file.read_text()) if progress_file.exists() else {}
+
+# Update after completing a phase
+progress["current_phase"] = "phase-3"
+progress["completed_phases"] = ["phase-1", "phase-2", "phase-3"]
+progress["next_action"] = "Implement database models"
+progress["last_commit"] = "abc1234"
+
+progress_file.parent.mkdir(exist_ok=True)
+progress_file.write_text(json.dumps(progress, indent=2))
+```
+
+### Recovery Protocol
+
+If you see a message like "Continue from where you left off":
+1. Read `.worktree-task/progress.json` to understand current state
+2. Check your TodoWrite for remaining tasks
+3. Resume from `next_action` or the first incomplete todo
+4. Do NOT restart from the beginning
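
Review note on the recovery protocol: the read side of the checkpoint could look roughly like the sketch below, which returns the recorded `next_action` and falls back to `None` when the file is missing or unreadable (for example, if a crash interrupted a write). The `resume_point` helper is hypothetical; only the file path and the `next_action` key come from the template.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def resume_point(worktree: Path) -> Optional[str]:
    """Return the recorded next_action, or None if no usable checkpoint exists."""
    progress_file = worktree / ".worktree-task" / "progress.json"
    try:
        progress = json.loads(progress_file.read_text())
    except (OSError, json.JSONDecodeError):
        return None  # missing or half-written file: fall back to the todo list
    if not isinstance(progress, dict):
        return None
    action = progress.get("next_action")
    return action if isinstance(action, str) and action else None


# Demo in a throwaway directory standing in for a worktree.
with tempfile.TemporaryDirectory() as tmp:
    worktree = Path(tmp)
    before = resume_point(worktree)  # no checkpoint yet
    (worktree / ".worktree-task").mkdir()
    (worktree / ".worktree-task" / "progress.json").write_text(
        json.dumps({"current_phase": "phase-3", "next_action": "Implement database models"})
    )
    after = resume_point(worktree)

print(before, after)
```

Returning `None` rather than raising matches step 3 of the protocol: when `next_action` is unavailable, the first incomplete todo is the fallback.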
79+
4780
## Execution Mode
4881

4982
- **SILENT MODE**: You are in a worktree, user has pre-approved ALL operations
@@ -59,18 +92,30 @@ Use TodoWrite throughout:
 - Mark `completed` immediately when done (never batch)
 - Only ONE task should be `in_progress` at a time
 
+## Completion Signal
+
+When ALL tasks are done, output exactly:
+
+```
+ALL TASKS COMPLETED
+```
+
+This signals the monitor to stop and send completion notification.
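
Review note on the completion signal: `monitor.py` is not shown in this view, so how it detects the marker is unknown. One plausible approach, sketched here as an assumption, is to require that a whole line of captured pane text equals the marker, so that prose merely mentioning the phrase does not count. The `is_complete` helper and the sample pane text are hypothetical.

```python
COMPLETION_MARKER = "ALL TASKS COMPLETED"


def is_complete(pane_text: str) -> bool:
    """True if some line of the captured pane, ignoring surrounding whitespace, is exactly the marker."""
    return any(line.strip() == COMPLETION_MARKER for line in pane_text.splitlines())


# Hypothetical pane captures.
print(is_complete("Running tests...\nALL TASKS COMPLETED\n"))
print(is_complete("11. Output: ALL TASKS COMPLETED"))
```

One caveat under this approach: the prompt template itself contains the marker on a line of its own, so if the prompt is echoed into the pane, a real monitor would likely need to skip that echoed portion or only scan recent output. An agent UI that prefixes output lines with decoration would also defeat an exact-line match.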
+
 ## Example Workflow
 
 ```
 1. Read design docs / specs
 2. TodoWrite: Create 8 phase todos
-3. TodoWrite: Mark phase 1 as in_progress
-4. Task tool: Execute phase 1 (project setup)
-5. TodoWrite: Mark phase 1 completed, phase 2 in_progress
-6. Task tool: Execute phase 2 (database)
-7. ... continue for all phases ...
-8. Final verification
-9. Commit and report completion
+3. Update .worktree-task/progress.json with plan
+4. TodoWrite: Mark phase 1 as in_progress
+5. Task tool: Execute phase 1 (project setup)
+6. Update progress.json: completed_phases += "phase-1"
+7. TodoWrite: Mark phase 1 completed, phase 2 in_progress
+8. Task tool: Execute phase 2 (database)
+9. ... continue for all phases ...
+10. Final verification (build, tests)
+11. Output: ALL TASKS COMPLETED
 ```
 
 ## Begin Now

scripts/cleanup.py

Lines changed: 31 additions & 8 deletions
@@ -28,6 +28,18 @@ def get_git_root() -> Path:
     return Path(result.stdout.strip())
 
 
+def kill_monitor_process(session_name: str, worktree_dir: Path):
+    """Kill any monitor process associated with this session."""
+    # Find and kill monitor process by matching command line
+    result = run(f"pgrep -f 'monitor.py {session_name}'", check=False, capture=True)
+    if result.returncode == 0 and result.stdout.strip():
+        pids = result.stdout.strip().split('\n')
+        for pid in pids:
+            if pid:
+                run(f"kill {pid}", check=False)
+                print(f" ✓ Killed monitor process (PID: {pid})")
+
+
 def main():
     if len(sys.argv) < 2:
         print("Usage: cleanup.py <session-name> [--remove-worktree]")
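
Review note on `kill_monitor_process`: `pgrep -f` treats its pattern as a regular expression matched anywhere in the full command line, so `monitor.py proj-feat` would also match a monitor started for `proj-feat-2`. If that matters, candidate command lines could be filtered on exact arguments before killing anything. The sketch below is one possible check, not part of this commit; `is_monitor_for` is a hypothetical helper and the sample command lines are made up.

```python
import shlex


def is_monitor_for(cmdline: str, session_name: str) -> bool:
    """True if cmdline runs monitor.py with session_name as the very next argument."""
    try:
        argv = shlex.split(cmdline)
    except ValueError:
        return False  # unbalanced quotes: treat as not matching
    for i, arg in enumerate(argv[:-1]):
        if arg.endswith("monitor.py") and argv[i + 1] == session_name:
            return True
    return False


# Hypothetical command lines as a process listing might report them.
print(is_monitor_for("python3 /opt/plugin/scripts/monitor.py proj-feat /work/proj-feat", "proj-feat"))
print(is_monitor_for("python3 /opt/plugin/scripts/monitor.py proj-feat-2 /work/proj-feat-2", "proj-feat"))
```

Comparing whole arguments sidesteps both the prefix collision and the regex meaning of characters such as `.` in a session name.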
@@ -46,6 +58,16 @@ def main():
     print("=== Worktree Task Cleanup ===")
     print()
 
+    # Kill monitor process first
+    try:
+        project_dir = get_git_root()
+        project_name = project_dir.name
+        worktree_dir = project_dir.parent / f"{project_name}-{session_name}"
+        print("Killing monitor process...")
+        kill_monitor_process(session_name, worktree_dir)
+    except subprocess.CalledProcessError:
+        worktree_dir = None
+
     # Kill tmux session
     print(f"Killing tmux session: {session_name}")
     if session_exists(session_name):
@@ -55,14 +77,15 @@ def main():
         print(" ⚠ Session not found (may already be closed)")
     print()
 
-    # Find worktree path
-    try:
-        project_dir = get_git_root()
-        project_name = project_dir.name
-        worktree_dir = project_dir.parent / f"{project_name}-{session_name}"
-    except subprocess.CalledProcessError:
-        print("Warning: Not in a git repository, cannot determine worktree path")
-        worktree_dir = None
+    # Find worktree path (if not already set)
+    if worktree_dir is None:
+        try:
+            project_dir = get_git_root()
+            project_name = project_dir.name
+            worktree_dir = project_dir.parent / f"{project_name}-{session_name}"
+        except subprocess.CalledProcessError:
+            print("Warning: Not in a git repository, cannot determine worktree path")
+            worktree_dir = None
 
     # Show worktrees
     print("=== Git Worktrees ===")

scripts/launch.py

Lines changed: 39 additions & 6 deletions
@@ -98,13 +98,34 @@ def load_task_template(script_dir: Path, task_desc: str, worktree_dir: str) -> s
     return template
 
 
+def start_monitor(session_name: str, worktree_dir: Path, script_dir: Path):
+    """Start the monitor process in the background."""
+    monitor_script = script_dir / "monitor.py"
+    log_file = worktree_dir / ".worktree-task" / "monitor.log"
+
+    # Ensure log directory exists
+    log_file.parent.mkdir(exist_ok=True)
+
+    # Start monitor as a background process
+    cmd = f"nohup python3 \"{monitor_script}\" {session_name} \"{worktree_dir}\" > \"{log_file}\" 2>&1 &"
+    subprocess.Popen(cmd, shell=True, start_new_session=True)
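
Review note on `start_monitor`: the shell string leaves `{session_name}` unquoted and relies on `nohup` plus `&` for detaching. An alternative, sketched here as a suggestion rather than what this commit does, passes an argument list to `subprocess.Popen` and redirects output in Python, so no shell quoting is involved. The `monitor_argv` and `start_monitor_detached` names are hypothetical, and using `sys.executable` instead of `python3` is an assumption.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def monitor_argv(script_dir: Path, session_name: str, worktree_dir: Path) -> List[str]:
    """Build the monitor command as a list, so no shell parsing happens."""
    return [sys.executable, str(script_dir / "monitor.py"), session_name, str(worktree_dir)]


def start_monitor_detached(session_name: str, worktree_dir: Path, script_dir: Path) -> subprocess.Popen:
    log_file = worktree_dir / ".worktree-task" / "monitor.log"
    log_file.parent.mkdir(parents=True, exist_ok=True)
    # "wb" truncates the log, like the shell's ">" redirection.
    with open(log_file, "wb") as log:
        return subprocess.Popen(
            monitor_argv(script_dir, session_name, worktree_dir),
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # new session, detached from the launching terminal
        )


# Only the argument list is built here; nothing is spawned by this demo.
argv = monitor_argv(Path("/opt/plugin/scripts"), "proj-feat", Path("/work/proj-feat"))
print(argv[1:])
```

The returned `Popen` object also exposes the monitor's PID, which a cleanup step could record instead of searching the process table.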
+
+
 def main():
     if len(sys.argv) < 3:
-        print("Usage: launch.py <branch-name> \"<task-description>\" [--env KEY=VALUE ...] [--agent-cmd \"<agent command>\"] [--claude] [--codex]")
-        print("Example: launch.py feature/my-task \"Implement the new feature\"")
-        print("Example: launch.py feature/my-task \"Task\" --env ANTHROPIC_BASE_URL=http://api.codex.markets")
-        print("Example: launch.py feature/my-task \"Task\" --agent-cmd \"codex --yolo -m gpt-5.1-codex-max -c model_reasoning_effort=\\\"high\\\"\"")
-        print("Example: launch.py feature/my-task \"Task\" --codex")
+        print("Usage: launch.py <branch-name> \"<task-description>\" [options]")
+        print()
+        print("Options:")
+        print(" --env KEY=VALUE Set environment variable")
+        print(" --agent-cmd \"cmd\" Custom agent command")
+        print(" --claude Use Claude Code (default)")
+        print(" --codex Use Codex CLI")
+        print(" --no-monitor Disable auto-recovery monitor")
+        print()
+        print("Examples:")
+        print(" launch.py feature/my-task \"Implement the new feature\"")
+        print(" launch.py feature/my-task \"Task\" --codex")
+        print(" launch.py feature/my-task \"Task\" --no-monitor")
         sys.exit(1)
 
     branch_name = sys.argv[1]
@@ -114,6 +135,7 @@ def main():
     # Parse custom environment variables and agent command override
     custom_env = {}
     agent_cmd_override = None
+    enable_monitor = True # Default: monitor enabled
     i = 3
     while i < len(sys.argv):
         if sys.argv[i] == "--env" and i + 1 < len(sys.argv):
@@ -131,6 +153,9 @@ def main():
         elif sys.argv[i] == "--claude":
             agent_cmd_override = DEFAULT_AGENT_CMD
             i += 1
+        elif sys.argv[i] == "--no-monitor":
+            enable_monitor = False
+            i += 1
         else:
             i += 1
 
@@ -235,10 +260,18 @@ def main():
     # Cleanup temp file
     temp_file.unlink()
 
+    # Start monitor process (default: enabled)
+    if enable_monitor:
+        print("Starting auto-recovery monitor...")
+        start_monitor(session_name, worktree_dir, script_dir)
+        print(" ✓ Monitor running in background")
+
     print()
     print("=== Task Launched Successfully ===")
     print()
-    print(f"Monitor: {script_dir}/status.py {session_name}")
+    if enable_monitor:
+        print(f"Monitor: tail -f \"{worktree_dir}/.worktree-task/monitor.log\"")
+    print(f"Status: {script_dir}/status.py {session_name}")
     print(f"Attach: tmux attach -t {session_name}")
     print(f"Kill: tmux kill-session -t {session_name}")
     print(f"Cleanup: {script_dir}/cleanup.py {session_name} --remove-worktree")
