feat: wire dormant orchestration 2.0 components into production flow (#669)
Connect six previously-dormant Orch 2.0 components to the execution
pipeline:
- File Lock Registry: bridge claims file locks before instance creation
and releases on all exit paths; uses gate.Release (not Fail) for lock
conflicts to avoid burning retries under concurrent scaling
- Context Propagation: injects prior discoveries into task prompts and
shares completion info for cross-instance awareness
- Mailbox Event Publishing: adds WithBus functional option so all
inter-instance messages publish MailboxMessageEvent to the event bus
- Adaptive Lead Observability: logs scaling signal recommendations in
the pipeline executor
- Approval Auto-Approve: immediately approves gated tasks in the bridge
to prevent stuck states while preserving gate infrastructure
- Debate Protocol: identifies conflicting task outcomes between execution
and review phases and records structured debate sessions for reviewer
context (opt-in via WithDebate pipeline option)
AGENTS.md (+1 line)
@@ -340,6 +340,7 @@ These are real issues agents have encountered in this codebase. Package-specific
- **Release locks before blocking on Stop()** — When stopping a component that holds a mutex, copy shared state (e.g., a slice of bridges) under the lock, release the lock, then perform blocking cleanup. Holding a lock while calling `bridge.Stop()` (which calls `wg.Wait()`) blocks goroutines that need the same lock. See `PipelineExecutor.Stop()` in `bridgewire/executor.go`.
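A minimal Go sketch of this copy-then-unlock pattern (the `bridge` and `executor` types here are illustrative stand-ins, not the real bridgewire API):

```go
package main

import (
	"fmt"
	"sync"
)

// bridge is a hypothetical stand-in for the bridges PipelineExecutor manages.
type bridge struct{ wg sync.WaitGroup }

// Stop blocks until the bridge's goroutines finish (like wg.Wait in the note).
func (b *bridge) Stop() { b.wg.Wait() }

type executor struct {
	mu      sync.Mutex
	bridges []*bridge
}

// Stop copies the bridge slice under the lock, releases the lock, and only
// then performs the blocking Stop() calls. Goroutines that need e.mu while
// winding down are therefore never blocked behind a held lock.
func (e *executor) Stop() {
	e.mu.Lock()
	snapshot := make([]*bridge, len(e.bridges))
	copy(snapshot, e.bridges)
	e.mu.Unlock()

	for _, b := range snapshot {
		b.Stop() // blocking, but safe: no lock is held here
	}
}

func main() {
	e := &executor{bridges: []*bridge{{}, {}}}
	e.Stop()
	fmt.Println("stopped cleanly")
}
```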
- **Two-phase event publishing for cascading state changes** — When an event handler (`onTeamCompleted`) modifies state that triggers further events of the same type, use a two-phase approach: (1) collect state changes under the lock, (2) publish events outside the lock. Repeat until no new transitions occur. Publishing `TeamCompletedEvent` from within the `onTeamCompleted` handler would re-enter the handler via the synchronous bus, deadlocking on `m.mu`. See `team.Manager.checkBlockedTeamsLocked`.
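A condensed Go sketch of the two-phase approach with a synchronous bus (toy types, not the real `team.Manager`; the cascade here recurses via re-publication rather than the looped variant described above):

```go
package main

import (
	"fmt"
	"sync"
)

// bus runs handlers inline, like the synchronous event bus in the note.
type bus struct{ handlers []func(team string) }

func (b *bus) Publish(team string) {
	for _, h := range b.handlers {
		h(team)
	}
}

type manager struct {
	mu      sync.Mutex
	blocked map[string][]string // completed team -> teams it unblocks
	done    []string
	b       *bus
}

// onTeamCompleted: phase 1 collects newly-unblocked teams under the lock;
// phase 2 publishes their events after unlocking. Publishing while holding
// m.mu would re-enter this handler inline and deadlock.
func (m *manager) onTeamCompleted(team string) {
	m.mu.Lock()
	unblocked := m.blocked[team]
	delete(m.blocked, team)
	m.done = append(m.done, team)
	m.mu.Unlock()

	for _, t := range unblocked {
		m.b.Publish(t) // outside the lock; cascades safely
	}
}

func main() {
	m := &manager{blocked: map[string][]string{"a": {"b"}, "b": {"c"}}, b: &bus{}}
	m.b.handlers = append(m.b.handlers, m.onTeamCompleted)
	m.onTeamCompleted("a") // completes a, cascades to b, then c
	fmt.Println(m.done)    // [a b c]
}
```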
- **Semaphore slot lifecycle in bridge** — When the bridge acquires a semaphore slot before `ClaimNext`, it must release on every non-monitor path (claim error, nil task, create/start failure). The monitor goroutine takes ownership of the slot via `defer b.sem.Release()`. Missing a release on any early-return path causes a permanent slot leak that eventually deadlocks the claim loop.
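The ownership hand-off can be sketched as below (a channel-based semaphore and hypothetical `runOne`/`claim`/`start`/`monitor` names, not the real bridge code):

```go
package main

import "fmt"

// sem is a counting semaphore, standing in for the bridge's slot guard.
type sem chan struct{}

func (s sem) Acquire() { s <- struct{}{} }
func (s sem) Release() { <-s }

type task struct{ id string }

// runOne acquires a slot, claims a task, and hands the slot to the monitor
// goroutine. Every early-return path releases the slot itself; only the
// monitor path transfers ownership (via its own defer).
func runOne(s sem, claim func() (*task, error), start func(*task) error, monitor func(*task)) {
	s.Acquire()

	t, err := claim()
	if err != nil || t == nil {
		s.Release() // claim error or empty queue: give the slot back
		return
	}
	if err := start(t); err != nil {
		s.Release() // create/start failure: give the slot back
		return
	}

	go func() {
		defer s.Release() // monitor owns the slot from here on
		monitor(t)
	}()
}

func main() {
	s := make(sem, 1)
	done := make(chan struct{})
	runOne(s,
		func() (*task, error) { return &task{id: "t1"}, nil },
		func(*task) error { return nil },
		func(*task) { close(done) },
	)
	<-done
	s.Acquire() // succeeds only because the monitor released the slot
	fmt.Println("slot released")
}
```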
- **Release vs Fail for scheduling conflicts** — When a task fails due to a scheduling conflict (file lock contention), use `gate.Release()` to return it to pending instead of `gate.Fail()`. `Fail` decrements the retry counter; with scaling enabled, multiple tasks competing for the same resource can exhaust all retries and permanently fail. `Release` puts the task back without consuming retries. Always pair Release with `waitForWake` to prevent hot retry loops.
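A toy Go model of why the distinction matters (a single-task `queue` with illustrative `Fail`/`Release` methods; the real `gate` API is not reproduced here):

```go
package main

import "fmt"

// queue is a toy stand-in for the task gate: it tracks the retry budget
// and state of one task.
type queue struct {
	retries int
	state   string // "pending", "running", or "failed"
}

// Fail consumes a retry; once the budget is exhausted the task fails for good.
func (q *queue) Fail() {
	if q.retries == 0 {
		q.state = "failed"
		return
	}
	q.retries--
	q.state = "pending"
}

// Release returns the task to pending without touching the retry budget —
// appropriate for scheduling conflicts like file lock contention.
func (q *queue) Release() { q.state = "pending" }

func main() {
	// Three lock conflicts handled with Fail exhaust a 2-retry budget...
	q := &queue{retries: 2, state: "running"}
	q.Fail()
	q.Fail()
	q.Fail()
	fmt.Println(q.state, q.retries) // failed 0

	// ...whereas Release survives any number of conflicts.
	q2 := &queue{retries: 2, state: "running"}
	for i := 0; i < 3; i++ {
		q2.Release()
	}
	fmt.Println(q2.state, q2.retries) // pending 2
}
```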
CHANGELOG.md (+2 lines)
@@ -9,6 +9,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Added
- **Wire Dormant Orchestration 2.0 Components** - Connected six previously-dormant Orch 2.0 components to the production execution flow: (1) **File Lock Registry** in Bridge prevents concurrent file edits by claiming locks before instance creation and releasing on all exit paths, using Release instead of Fail for lock conflicts to avoid burning retries; (2) **Context Propagation** injects prior discoveries into task prompts and shares completion info for cross-instance awareness; (3) **Mailbox Event Publishing** makes all inter-instance messages visible to the event bus via `MailboxMessageEvent` using a `WithBus` functional option; (4) **Adaptive Lead Observability** logs scaling signal recommendations in the pipeline executor; (5) **Approval Auto-Approve** immediately approves gated tasks to prevent stuck states while preserving gate infrastructure for future interactive use; (6) **Debate Protocol Integration** identifies conflicting task outcomes between execution and review phases and records structured debate sessions for reviewer context (opt-in via `WithDebate()`).
- **Orchestration 2.0 Default Execution** - Made Orch 2.0 the default for both UltraPlan and TripleShot. UltraPlan flips `UsePipeline` default to `true`. TripleShot uses `teamwire.TeamCoordinator` with callback-driven execution (replacing file polling), falling back to legacy for adversarial mode or `tripleshot.use_legacy` config. Added `tripleshot.Runner` interface for dual-coordinator coexistence, channel bridge for teamwire callbacks into Bubble Tea, and `NewTripleShotAdapters()` factory to avoid import cycles.
- **Pipeline Execution Path** - Wired the Orchestration 2.0 pipeline stack into `Coordinator.StartExecution()`. Added `ExecutionRunner` interface in `orchestrator` (implemented by `bridgewire.PipelineRunner`) with factory-based injection to avoid import cycles. When `UsePipeline` config is enabled, the Coordinator delegates execution to the pipeline backend instead of the legacy `ExecutionOrchestrator`. Subscribes to `pipeline.completed` events for synthesis/failure handling. Guards legacy-only methods (`RetryFailedTasks`, `RetriggerGroup`, `ResumeWithPartialWork`) when pipeline is active.
@@ -43,6 +46,9 @@ These interfaces are implemented by adapters in `internal/orchestrator/bridgewir
- **Retry limit on completion check errors** — The monitor gives up after `maxCheckErrors` (10) consecutive `CheckCompletion` failures and fails the task. Without this, a bad worktree path would cause indefinite retries.
- **TaskQueue retry interacts with bridge claim loop** — `TaskQueue.Fail()` has retry logic (`defaultMaxRetries=2`). When the bridge monitor calls `gate.Fail()`, the task may return to `TaskPending` (not permanently failed), and the claim loop re-claims it. Tests that assert on `Running()` after failure must either disable retries via `SetMaxRetries(taskID, 0)` or account for the re-claim cycle.
- **Always log gate.Fail errors** — `gate.Fail()` can fail if the task has already transitioned. Always check and log the return error rather than discarding with `_ =`.
- **File lock conflicts use Release, not Fail** — When `ClaimMultiple` returns `ErrAlreadyClaimed`, use `gate.Release` to return the task to pending without burning retries. Using `gate.Fail` would consume retry attempts, and with scaling enabled (semaphore > 1), multiple tasks competing for the same file lock would exhaust retries and permanently fail. After releasing, call `waitForWake` to avoid a hot retry loop.
- **Record completion/failure before file lock release** — `recorder.RecordCompletion`/`RecordFailure` must be called immediately after `gate.Complete`/`gate.Fail`, before `reg.ReleaseAll` and `shareCompletion`. The gate transition triggers a synchronous event cascade that can complete the pipeline before the monitor goroutine reaches subsequent lines. If the recorder call comes after file lock I/O, tests (and observers) see the pipeline complete before the recorder fires.
- **Scaling monitor increases semaphore concurrency** — The hub's `ScalingMonitor` reacts to `QueueDepthChangedEvent` and may increase the bridge's semaphore limit via the `OnDecision` callback. Code that assumes semaphore=1 (sequential task execution) is incorrect when scaling is active. File lock claims are the safety net for concurrent access to the same files.
internal/mailbox/AGENTS.md (+1 line)
@@ -10,6 +10,7 @@ See `doc.go` for package overview and API usage.
- **O_APPEND atomicity** — File writes use `O_APPEND` which is atomic for writes smaller than `PIPE_BUF` (4096 bytes on most systems), but is not crash-safe without `fsync`. This is an accepted trade-off — messages may be lost on hard crash but won't be corrupted or interleaved.
- **Message ID uniqueness** — `time.UnixNano()` alone is not unique under concurrent access. IDs are generated using an atomic counter combined with PID and timestamp. If you modify ID generation, ensure uniqueness under parallel `Send()` calls.
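A sketch of the counter+PID+timestamp scheme (the `nextID` name and format string are illustrative; only the ingredients match the note above):

```go
package main

import (
	"fmt"
	"os"
	"sync/atomic"
	"time"
)

var idCounter atomic.Uint64

// nextID combines timestamp, PID, and an atomic counter. The counter makes
// concurrent Send() calls within one process unique even when
// time.UnixNano() collides; the PID separates concurrent processes.
func nextID() string {
	n := idCounter.Add(1)
	return fmt.Sprintf("%d-%d-%d", time.Now().UnixNano(), os.Getpid(), n)
}

func main() {
	fmt.Println(nextID())
	fmt.Println(nextID())
}
```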
- **Store mutex scope** — The `Store` holds a `sync.Mutex` for in-process thread safety. Any method that reads or writes the JSONL file must hold the lock for the entire operation, including the JSON marshal/unmarshal step — not just the file I/O.
- **WithBus event publishing is synchronous** — When a `Mailbox` is created with `WithBus(bus)`, every successful `Send()` publishes a `MailboxMessageEvent` on the event bus synchronously. Since `event.Bus.Publish` runs handlers inline, callers of `Send` should be aware that handlers may execute significant work in their goroutine. The Hub passes its bus to `NewMailbox` automatically.
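The shape of the `WithBus` functional option and its synchronous dispatch can be sketched as follows (toy `Bus`/`Mailbox` types with a string payload, not the real mailbox API or `MailboxMessageEvent` struct):

```go
package main

import "fmt"

// Bus dispatches handlers inline, matching the synchronous behavior above.
type Bus struct{ handlers []func(event string) }

func (b *Bus) Subscribe(h func(string)) { b.handlers = append(b.handlers, h) }
func (b *Bus) Publish(e string) {
	for _, h := range b.handlers {
		h(e) // runs in the caller's goroutine, before Publish returns
	}
}

type Mailbox struct{ bus *Bus }

type Option func(*Mailbox)

// WithBus wires an event bus into the mailbox; without it, Send stays silent.
func WithBus(b *Bus) Option { return func(m *Mailbox) { m.bus = b } }

func NewMailbox(opts ...Option) *Mailbox {
	m := &Mailbox{}
	for _, o := range opts {
		o(m)
	}
	return m
}

// Send publishes a mailbox-message notification synchronously after a
// successful send, so subscribers observe it inline.
func (m *Mailbox) Send(to, body string) {
	// ... append the message to the recipient's store here ...
	if m.bus != nil {
		m.bus.Publish("mailbox.message:" + to)
	}
}

func main() {
	bus := &Bus{}
	bus.Subscribe(func(e string) { fmt.Println("observed", e) })
	NewMailbox(WithBus(bus)).Send("worker-1", "hi")
}
```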