-
Notifications
You must be signed in to change notification settings - Fork 1.2k
fix: Improve background agent stability and completion detection #715
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Improve background agent stability and completion detection #715
Conversation
- Add max_steps limit to explore (25) and librarian (30) agents - Block sisyphus_task/call_omo_agent tools in librarian to prevent spawning - Add global 15-minute timeout for background tasks (MAX_RUN_TIME_MS) - Simplify session.idle handler - remove validateSessionHasOutput/checkSessionTodos guards - Add JSONC config file support (.jsonc checked before .json) - Fix categoryModel passing in sisyphus_task sync mode Reference: PR code-yeongyu#655, kdcokenny/opencode-background-agents
- Reset startedAt when resuming tasks to prevent immediate completion (MIN_IDLE_TIME_MS check was passing immediately for resumed tasks) - Previous commit already fixed timeout.unref() and double-release prevention
- Set concurrencyKey = undefined after every release() call to prevent double-release when multiple code paths try to release the same key - Add 15-minute timeout timer for resumed tasks (was missing) - Fixes: promptAsync error, session.deleted, pruneStaleTasksAndNotifications
Address P2 feedback: resume() error handler now properly: - Clears the timeout timer created for the resumed task - Releases concurrency key to unblock queued tasks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 issues found across 6 files
Confidence score: 3/5
- Late prompt rejections in
src/tools/sisyphus-task/tools.tsare dropped after the 100 ms fire-and-forget check, so users can see a misleading “No assistant response found” after 10 minutes of polling when the prompt never succeeded. - Early returns on prompt errors leave
subagentSessionsentries uncleared insrc/tools/sisyphus-task/tools.ts, risking stale session tracking and incorrect future runs. - Pay close attention to
src/tools/sisyphus-task/tools.ts– prompt error handling needs to avoid losing failures and leaking session entries.
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/tools/sisyphus-task/tools.ts">
<violation number="1" location="src/tools/sisyphus-task/tools.ts:426">
P2: Late prompt rejections are dropped: fire-and-forget prompt is only checked for 100 ms, so failures after that window lead to 10‑minute polling and a misleading "No assistant response found" when the prompt was never sent.</violation>
<violation number="2" location="src/tools/sisyphus-task/tools.ts:445">
P2: Prompt error early return leaves subagentSessions entry uncleared, causing stale session tracking</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| // For category-based tasks, pass the model from category config | ||
| // For agent-based tasks, use agent's configured model (don't pass model in body) | ||
| let promptError: Error | undefined | ||
| client.session.prompt({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: Late prompt rejections are dropped: fire-and-forget prompt is only checked for 100 ms, so failures after that window lead to 10‑minute polling and a misleading "No assistant response found" when the prompt was never sent.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/tools/sisyphus-task/tools.ts, line 426:
<comment>Late prompt rejections are dropped: fire-and-forget prompt is only checked for 100 ms, so failures after that window lead to 10‑minute polling and a misleading "No assistant response found" when the prompt was never sent.</comment>
<file context>
@@ -419,21 +419,30 @@ System notifies on completion. Use \`background_output\` with task_id="${task.id
+ // For category-based tasks, pass the model from category config
+ // For agent-based tasks, use agent's configured model (don't pass model in body)
+ let promptError: Error | undefined
+ client.session.prompt({
+ path: { id: sessionID },
+ body: {
</file context>
| if (promptError) { | ||
| if (toastManager && taskId !== undefined) { | ||
| toastManager.removeTask(taskId) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: Prompt error early return leaves subagentSessions entry uncleared, causing stale session tracking
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/tools/sisyphus-task/tools.ts, line 445:
<comment>Prompt error early return leaves subagentSessions entry uncleared, causing stale session tracking</comment>
<file context>
@@ -419,21 +419,30 @@ System notifies on completion. Use \`background_output\` with task_id="${task.id
+ // Small delay to let the prompt start
+ await new Promise(resolve => setTimeout(resolve, 100))
+
+ if (promptError) {
if (toastManager && taskId !== undefined) {
toastManager.removeTask(taskId)
</file context>
|
adding dynamic approach for "noreply", updating tormorrow in new pr |
if you're making a new pr please close this one |
Summary
Complete overhaul of background agent stability, combining improvements from closed PRs #697, #655, and #714.
Changes
Agent Safety Guards (from #697)
max_steps: 25to prevent infinite loopsmax_steps: 30+ blockstask,sisyphus_task,call_omo_agenttoolsBackground Completion Detection (from #655, #697)
prompt()instead of awaiting (fixes JSON parse errors with thinking models)MIN_IDLE_TIME_MS(5s): Minimum time before considering session idleMAX_RUN_TIME_MS(15min): Global timeout to prevent stuck tasksvalidateSessionHasOutputandcheckSessionTodosguards that were causing stuck taskstimeoutTimeron task completionMemory Leak Prevention (from #714)
session.idlecompletion (not just onsession.deleted)concurrencyKey = undefinedafter releaseOther Fixes
categoryModelfor category-based tasks in sync mode.jsoncbefore.jsonstartedAton resume: Proper timing for resumed tasksTesting
Related PRs
References
Summary by cubic
Improves background agent stability and reliable completion by adding safety limits, simplifying idle handling, and fixing cleanup. Tasks now complete predictably, avoid leaks, and subagents can’t spiral.
Written for commit d59f890. Summary will update on new commits.