Tasks progress through a well-defined set of states. Every transition is recorded as an immutable event in data/<uuid>/traces/.
stateDiagram-v2
[*] --> backlog
backlog --> in_progress : drag / autopilot
in_progress --> in_progress : max_tokens / pause_turn (auto-continue)
in_progress --> waiting : end_turn / empty stop_reason
in_progress --> failed : error / timeout / budget
in_progress --> backlog : reset
in_progress --> cancelled : cancel
committing --> done : commit success
committing --> failed : commit failure
waiting --> in_progress : feedback
waiting --> in_progress : test (IsTestRun)
waiting --> committing : mark done
waiting --> cancelled : cancel
failed --> backlog : retry / auto_retry
failed --> cancelled : cancel
done --> cancelled : cancel
cancelled --> backlog : retry
note right of waiting
sync: rebase onto default branch
end note
note right of failed
sync: rebase onto default branch
end note
note right of done
archived flag can be set
end note
note right of cancelled
archived flag can be set
end note
| State | Description |
|---|---|
backlog |
Queued, not yet started |
in_progress |
Container running, agent executing |
waiting |
Claude paused mid-task, awaiting user feedback |
committing |
Transient: commit pipeline running after mark-done |
done |
Completed; changes committed and merged |
failed |
Container error, Claude error, timeout, or budget exceeded |
cancelled |
Explicitly cancelled; sandbox cleaned up, history preserved |
Note: archived is a boolean flag (Archived bool) on the task, not a separate state. Tasks in done or cancelled state can have Archived = true, which moves them to the Archived column in the UI. The state machine has exactly 7 states (backlog, in_progress, waiting, committing, done, failed, cancelled).
Each pass through the loop in runner.go Run():
- Increment turn counter
- Run container with current prompt and session ID
- Save raw stdout to
data/<uuid>/outputs/turn-NNNN.json; stderr (if any) toturn-NNNN.stderr.txt - Parse
stop_reasonfrom agent JSON output:
stop_reason |
is_error |
Result |
|---|---|---|
end_turn |
false | Exit loop → waiting (awaiting review; auto-submit or user marks done to trigger commit pipeline) |
max_tokens |
false | Auto-continue (next iteration, same session) |
pause_turn |
false | Auto-continue (next iteration, same session) |
| empty / unknown | false | Set waiting; block until user provides feedback |
| any | true | Set failed with classified FailureCategory |
- Accumulate token usage (
input_tokens,output_tokens, cache tokens,cost_usd) - Record per-turn usage as
TurnUsageRecord - Check budget limits (
MaxCostUSD,MaxInputTokens); if exceeded →failedwithFailureCategory = budget_exceeded
Claude Code supports --resume <session-id> for session continuity. The first turn creates a new session; subsequent turns (auto-continue or post-feedback) pass the same session ID, preserving the full conversation context.
Setting FreshStart = true on a task skips --resume, starting a brand-new session. This is what happens when a user retries a failed task.
When stop_reason is empty, Claude has asked a question or is blocked. The task enters waiting:
- Worktrees are not cleaned up — the git branch is preserved
- User submits feedback via
POST /api/tasks/{id}/feedback - Handler writes a
feedbackevent to the trace log, then launches a newrunner.Rungoroutine using the existing session ID - The task resumes from exactly where it paused, with the feedback message as the next prompt
Alternatively, the user can mark the task done from waiting, which skips further Claude turns and jumps straight to the commit pipeline.
Any task in backlog, in_progress, waiting, failed, or done can be cancelled via POST /api/tasks/{id}/cancel. The handler:
- Kills the container (if
in_progress) — sends<runtime> kill wallfacer-<uuid>. The running goroutine detects the cancelled status and exits without overwriting it tofailed. - Cleans up worktrees — removes the git worktree and deletes the task branch, discarding all prepared changes.
- Sets status to
cancelledand appends astate_changeevent. - Preserves history —
data/<uuid>/traces/anddata/<uuid>/outputs/are left intact so execution logs, token usage, and the event timeline remain visible.
From cancelled, the user can retry the task (moves it back to backlog) to restart from scratch.
When a task transitions to failed, the runner classifies the failure into one of these categories:
| Category | Description |
|---|---|
timeout |
Per-turn timeout exceeded |
budget_exceeded |
Cost or token budget limit reached |
worktree_setup |
Git worktree creation failed |
container_crash |
Container exited unexpectedly |
agent_error |
Agent reported an error in its output |
sync_error |
Rebase/sync operation failed |
unknown |
Unclassifiable failure |
The category is stored in Task.FailureCategory and included in RetryRecord when the task is reset for retry.
Tasks can have an AutoRetryBudget map[FailureCategory]int that specifies how many automatic retries are allowed for each failure category. When a task fails:
- The failure is classified into a
FailureCategory - If the budget for that category has remaining retries, the count is decremented
- The task is automatically reset to
backlogfor a fresh run AutoRetryCounttracks the total number of auto-retries consumed
A global cap (maxTotalAutoRetries) prevents infinite retry loops regardless of per-category budgets.
Each time a task is reset for retry (manual or automatic), a RetryRecord is appended to Task.RetryHistory:
RetryRecord {
RetiredAt time.Time
Prompt string
Status TaskStatus
Result string // truncated to 2000 chars
SessionID string
Turns int
CostUSD float64
FailureCategory FailureCategory
}
The list is capped at DefaultRetryHistoryLimit (10) entries. This allows operators to inspect the history of failed attempts.
When a task is created, a background goroutine (runner.GenerateTitle) launches a lightweight container to generate a short title from the prompt. Titles are stored on the task and displayed on the board cards instead of the full prompt text. POST /api/tasks/generate-titles can retroactively generate titles for older untitled tasks.
Before running a task, users can have an AI agent analyse the codebase and produce a detailed implementation spec (the refined prompt) plus a concise goal summary. Only backlog tasks can be refined.
sequenceDiagram
participant User
participant Handler
participant Runner
participant Container
User->>Handler: POST /api/tasks/{id}/refine
Handler->>Runner: RunRefinementBackground()
Handler-->>User: 202 Accepted
User->>Handler: GET /api/tasks/{id}/refine/logs (SSE)
Runner->>Container: Launch sandbox
Container-->>Handler: Stream output
Handler-->>User: SSE events
alt Container succeeds
Container-->>Runner: Result = goal + spec
Runner->>Runner: Status = "done"
else Container fails
Container-->>Runner: Error
Runner->>Runner: Status = "failed"
end
alt User applies
User->>Handler: POST /api/tasks/{id}/refine/apply
Handler->>Handler: Save RefinementSession
Handler->>Handler: Move Prompt to PromptHistory
Handler->>Handler: Set Prompt = spec, Goal = summary (unless manually set)
Handler->>Runner: Trigger title regeneration
else User dismisses
User->>Handler: POST /api/tasks/{id}/refine/dismiss
Handler->>Handler: Clear CurrentRefinement
else User cancels
User->>Handler: DELETE /api/tasks/{id}/refine
Handler->>Container: Kill container
Handler->>Handler: Status = "failed"
end
Both RefineSessions []RefinementSession (past history) and CurrentRefinement *RefinementJob (present job) live on the Task struct. RefineSessions grows over time as each refinement is applied (capped at DefaultRefineSessionsLimit = 5); CurrentRefinement is replaced on each new run and cleared on dismiss.
Once a task has reached waiting (Claude finished but the user hasn't committed yet), a test verification agent can be triggered to check whether the implementation meets acceptance criteria.
POST /api/tasks/{id}/test
body: { criteria?: string } // optional additional acceptance criteria
↓
Sets IsTestRun = true, clears LastTestResult.
Transitions waiting → in_progress.
Launches a fresh container (separate session, no --resume) with a test prompt.
Test agent runs (IsTestRun = true):
Container executes: inspect code, run tests, verify requirements.
Agent must end its response with **PASS** or **FAIL**.
On end_turn:
parseTestVerdict() extracts "pass", "fail", or "unknown" from the result.
Records verdict in LastTestResult.
Transitions in_progress → waiting (no commit).
Test output is shown separately from implementation output in the task detail panel.
The test verdict is displayed as a badge on the task card and in the task detail panel. Multiple test runs are allowed; each overwrites the previous verdict. The TestRunStartTurn field records which turn the test started so the UI can split implementation vs. test output.
After reviewing the verdict, the user can:
- Mark the task done (commit pipeline runs) if the verdict is PASS
- Provide feedback to fix issues, then re-test
- Cancel the task
See Automation.
Each container receives a read-only board.json at /workspace/.tasks/board.json containing a manifest of all non-archived tasks. The current task is marked "is_self": true. This gives agents cross-task awareness to avoid conflicting changes with sibling tasks. The manifest is refreshed before every turn.
When MountWorktrees is enabled on a task, eligible sibling worktrees are also mounted read-only at /workspace/.tasks/worktrees/<short-id>/<repo>/.
See Data & Storage for data model definitions.
See Data & Storage.
On startup, RecoverOrphanedTasks in runner/recovery.go reconciles tasks that were interrupted by a server restart. It first queries the container runtime to determine which containers are still running, then handles each interrupted task as follows:
| Previous status | Container state | Recovery action |
|---|---|---|
committing |
any | Inspect worktree: if commit landed after UpdatedAt → done; otherwise → failed |
in_progress |
still running | Stay in_progress; a monitor goroutine watches the container and transitions to waiting once it stops |
in_progress |
already stopped | → waiting — user can review partial output, provide feedback, or mark as done |
If worktrees are missing during recovery, the task is marked failed with FailureCategory = worktree_setup.
Why waiting instead of failed for stopped containers?
The task may have produced useful partial output. Moving to waiting lets the user inspect results and choose the next action (resume with feedback, mark as done, or cancel) rather than forcing a retry from scratch.
Monitor goroutine (monitorContainerUntilStopped):
When a container is found still running after a restart, a background goroutine polls podman/docker ps every 5 seconds. Once the container stops it moves the task from in_progress to waiting with an explanatory output event. If the task was already transitioned by another path (e.g. cancelled by the user) the goroutine exits cleanly.
See Automation.
See Automation.
Server-side output truncation is controlled by WALLFACER_MAX_TURN_OUTPUT_BYTES (default 8 MB). When a turn's stdout or stderr exceeds this limit, the output is truncated and a sentinel is appended. Truncated turn numbers are recorded in Task.TruncatedTurns so the UI can surface warnings.
See Data & Storage.
See Data & Storage.
See Data & Storage.
See Data & Storage.
Tasks can declare dependencies on other tasks via DependsOn []string (a list of task UUID strings). A task with unsatisfied dependencies is not eligible for auto-promotion, even when capacity is available.
POST /api/tasks/batch (Handler.BatchCreateTasks in internal/handler/tasks.go) performs comprehensive DAG validation before creating any tasks. The entire validation phase has zero side effects — if any check fails, no tasks are created.
-
Ref uniqueness: Each task in the batch can have a symbolic
refstring. Duplicate refs are rejected. -
Prompt validation: Every non-idea-agent task must have a non-empty prompt.
-
Sandbox validation: Each task's sandbox configuration is validated against available sandbox types.
-
Dependency reference resolution: Each
depends_on_refsentry must be either a known batch ref or a syntactically valid UUID. -
Batch-internal cycle detection (Kahn's algorithm): The batch-internal dependency edges are extracted into an adjacency list. Kahn's topological sort algorithm processes nodes with in-degree 0 iteratively:
inDegree[i] = number of batch tasks that task i depends on batchAdj[i] = indices of tasks that depend on task i queue = all tasks with inDegree == 0 while queue is not empty: curr = dequeue topoOrder.append(curr) for each next in batchAdj[curr]: inDegree[next]-- if inDegree[next] == 0: enqueue(next) if len(topoOrder) != n: cycle detectedWhen a cycle is detected, the response includes the refs of all tasks involved in the cycle.
-
UUID pre-assignment: UUIDs are pre-assigned for all batch tasks so that cross-reference resolution and full-graph cycle detection can reason about the complete post-creation dependency graph.
-
External dependency existence: Every UUID-format dependency that is not a batch ref is verified to exist in the store via
GetTask. -
Full combined-graph cycle detection: A combined adjacency map is built from all existing tasks plus the proposed batch tasks (using pre-assigned UUIDs). For each new task,
taskReachableInAdjperforms a DFS to check whether the task can reach itself through its dependencies:func taskReachableInAdj(adj map[uuid.UUID][]uuid.UUID, start, target uuid.UUID) bool { // DFS from start; returns true if target is reachable }
This catches cycles that span both existing tasks and new batch tasks.
Tasks are created in topological order (from the Kahn's sort output), so each task's dependencies already exist in the store by the time it is created. The response returns tasks in the original input order with a ref_to_id mapping.
PATCH /api/tasks/{id} with a depends_on field calls UpdateTaskDependsOn, which stores the list via mutateTask. No cycle detection is performed on single-task updates — the caller is responsible for ensuring validity.
AreDependenciesSatisfied (in internal/store/tasks_update.go) is the authoritative dependency gate:
func (s *Store) AreDependenciesSatisfied(ctx context.Context, id uuid.UUID) (bool, error) {
// Under RLock:
for each depStr in task.DependsOn:
parse UUID (malformed → unsatisfied)
lookup in s.tasks (missing → unsatisfied, conservative)
if dep.Status != TaskStatusDone → unsatisfied
return true (all deps done)
}Conservative semantics: a malformed UUID or a deleted dependency is treated as unsatisfied to avoid silent unblocking.
The auto-promoter (tryAutoPromote in internal/handler/tasks_autopilot.go) checks dependencies as part of candidate selection:
- Lists all backlog tasks sorted by position.
- For each candidate, calls
AreDependenciesSatisfied— skips if any dependency is notdone. - Also checks
ScheduledAt— skips if the scheduled time is in the future. - Promotes the highest-priority eligible candidate.
Board context gives each running agent visibility into sibling tasks on the board, enabling cross-task coordination and conflict avoidance.
The board context is a BoardManifest JSON file written to /workspace/.tasks/board.json inside each task container:
type BoardManifest struct {
GeneratedAt time.Time `json:"generated_at"`
SelfTaskID string `json:"self_task_id"`
Tasks []BoardTask `json:"tasks"`
}Each BoardTask entry includes:
| Field | Description |
|---|---|
id |
Full UUID string |
short_id |
First 8 characters of the UUID |
title |
Task display title |
prompt |
Task prompt (truncated to 500 chars for siblings) |
status |
Current task status |
is_self |
true for the task's own entry |
turns |
Turn count (0 for siblings to reduce noise) |
result |
Last result text (truncated to 1000 chars for siblings) |
stop_reason |
Last stop reason |
usage |
Token/cost usage |
branch_name |
Git branch name |
worktree_mount |
Container-side mount path (if eligible) |
created_at / updated_at |
Timestamps |
SessionID is deliberately excluded from BoardTask to prevent session hijacking between tasks.
generateBoardContextAndMounts in internal/runner/board.go produces both the board.json bytes and the sibling worktree mount map:
- Lists all non-archived tasks via
store.ListTasks. - Builds the set of workspace repo paths the self task operates on.
- Filters siblings to only those sharing at least one workspace.
- For the self task: includes full
Prompt,Result, andTurns. - For siblings: truncates
Promptto 500 chars,Resultto 1000 chars, zerosTurns. - Determines eligible sibling worktree mounts (tasks in
waiting,failed, ordonewith extant worktree directories). - Marshals to JSON with indent; logs a warning if the manifest exceeds 64 KB.
Results are cached by (boardChangeSeq, selfTaskID). The cache is invalidated when any store mutation increments the change sequence. Per-turn calls cost nearly nothing when no task has changed since the last generation.
When MountWorktrees is enabled, eligible sibling worktrees are mounted read-only inside the container at /workspace/.tasks/worktrees/<short-id>/<repo-basename>/. Mount eligibility is determined by canMountWorktree:
waitingorfailedtasks: always eligible.donetasks: eligible only if at least one worktree directory still exists on disk.backlog,in_progress,cancelled: not eligible (no worktree, actively modified, or cleaned up).
GET /api/tasks/{id}/board— Returns theBoardManifestas it appears to a specific task (with that task markedis_self: true).GET /api/debug/board— Returns theBoardManifestas seen by a hypothetical new task (no task marked as self,selfTaskID=uuid.Nil).
See Data & Storage.
- Internals Index — Overview and reading order for all internals docs
- Architecture — System design, component overview, and design decisions
- Data & Storage — Data models, persistence, storage layer, migrations, search index, soft delete, task summaries
- Automation — Autopilot, oversight generation, ideation agent
- Git Worktrees — Worktree lifecycle, branch naming, cleanup
- API & Transport — HTTP routes, SSE, metrics, middleware