fix(debate): enforce 240s timeout on external tool invocations (#233)#240
fix(debate): enforce 240s timeout on external tool invocations (#233)#240
Conversation
Debate rounds could stall indefinitely waiting for Codex or other tool responses. Add explicit 240s timeout enforcement: - Add timeout constraint to command files (plugins + adapters) - Add timeout notes after each Skill:consult invocation in Phase 3b - Add "Tool invocation timeout (>240s)" row to all error handling tables - Update agent timeout from 120s to 240s - Add test coverage for timeout handling
…late, adapter tests - Add inline 240s timeout instruction at step 3b/3d in both orchestrator agents (plugins and opencode adapter), matching the pattern already used in command files - Restore missing Challenger Follow-up template in adapters/opencode/skills/debate/SKILL.md (was replaced with a non-functional stub comment) - Add 'All rounds timeout' error row to both command error tables (was in skill but not commands) - Add adapter file loading and consistency tests to debate-command.test.js: codex skill 240s, opencode command 240s + all-rounds-timeout, opencode agent inline timeout, opencode skill Challenger Follow-up completeness
…ives on natural language The gen-adapters transformer was replacing fenced code blocks containing 'let ' with '*(JavaScript reference - not executable in OpenCode)*' even when the block was plain prose (e.g. 'Do NOT let the proposer reframe'). The regex matched any occurrence of 'let ' including natural language usage. Fix: use /^\s*let\s+[a-zA-Z_$]/m to only match 'let' at line start followed by an identifier character — the pattern for actual JS variable declarations. This restores the full Challenger Follow-up template in the OpenCode debate skill and a knowledge guide template in the learn agent adapter.
…hestrator agents - Reword 'Set a 240-second timeout' to 'Track invocation start time. If >240s, treat as failure' to reflect what an agent can actually detect (post-hoc) vs aspirationally cancel - Add WHY to constraint: 'external tools can hang indefinitely, blocking remaining rounds' to improve LLM compliance with the timeout rule
Summary of ChangesHello @avifenesh, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the robustness of the debate workflow by introducing a mandatory 240-second timeout for all external tool interactions. This change addresses a critical issue where external tools could stall indefinitely, blocking the debate process. Alongside this, the PR refines timeout documentation, fixes a subtle regex bug affecting adapter templates, and adds comprehensive tests to ensure consistent timeout behavior across various components, ultimately improving the reliability and predictability of the debate system. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6d163b11ac
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Tests expected old exact phrases after enhance rewording: - 'MUST enforce 240s timeout per invocation' now matches /240s? timeout/i - Updated inline test to match 'Track invocation start time' pattern
There was a problem hiding this comment.
Pull request overview
This PR updates the debate workflow documentation/templates and adapter generation logic to (a) standardize a 240s per-invocation timeout policy for external tool calls and (b) prevent OpenCode adapter transforms from incorrectly treating natural-language “let …” as JavaScript, plus it adds tests to keep adapters consistent.
Changes:
- Add/standardize 240s timeout handling guidance across debate command/skill/agents and adapters, including “all rounds timeout” documentation.
- Fix OpenCode adapter transform false-positive by narrowing
letdetection to JS-style declarations only. - Add/extend
__tests__/debate-command.test.jsto validate adapter consistency and the restored Challenger Follow-up template.
Reviewed changes
Copilot reviewed 22 out of 23 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
lib/adapter-transforms.js |
Narrows let detection to avoid dropping natural-language templates during OpenCode transform. |
plugins/sync-docs/lib/adapter-transforms.js |
Mirrors the let detection fix in plugin-local lib copy. |
plugins/ship/lib/adapter-transforms.js |
Mirrors the let detection fix in plugin-local lib copy. |
plugins/repo-map/lib/adapter-transforms.js |
Mirrors the let detection fix in plugin-local lib copy. |
plugins/perf/lib/adapter-transforms.js |
Mirrors the let detection fix in plugin-local lib copy. |
plugins/next-task/lib/adapter-transforms.js |
Mirrors the let detection fix in plugin-local lib copy. |
plugins/learn/lib/adapter-transforms.js |
Mirrors the let detection fix in plugin-local lib copy. |
plugins/enhance/lib/adapter-transforms.js |
Mirrors the let detection fix in plugin-local lib copy. |
plugins/drift-detect/lib/adapter-transforms.js |
Mirrors the let detection fix in plugin-local lib copy. |
plugins/deslop/lib/adapter-transforms.js |
Mirrors the let detection fix in plugin-local lib copy. |
plugins/debate/lib/adapter-transforms.js |
Mirrors the let detection fix in plugin-local lib copy. |
plugins/audit-project/lib/adapter-transforms.js |
Mirrors the let detection fix in plugin-local lib copy. |
plugins/debate/skills/debate/SKILL.md |
Documents tool-invocation timeout behavior in the error-handling table. |
plugins/debate/commands/debate.md |
Adds explicit 240s timeout enforcement guidance + error-table rows (incl. all-rounds-timeout). |
plugins/debate/agents/debate-orchestrator.md |
Updates orchestrator constraints/guidance to 240s per-invocation handling. |
adapters/opencode/skills/debate/SKILL.md |
Restores the full “Round 2+: Challenger Follow-up” template in the OpenCode adapter. |
adapters/opencode/commands/debate.md |
Brings OpenCode adapter command docs in line with 240s timeout + error-table updates. |
adapters/opencode/agents/debate-orchestrator.md |
Brings OpenCode adapter agent guidance in line with 240s timeout handling. |
adapters/opencode/agents/learn-agent.md |
Replaces an OpenCode stub with a concrete markdown template example (but introduces a fencing issue). |
adapters/codex/skills/debate/SKILL.md |
Brings Codex adapter skill docs in line with 240s timeout + all-rounds-timeout error. |
__tests__/debate-command.test.js |
Adds adapter consistency assertions for timeout + all-rounds-timeout + template completeness. |
CHANGELOG.md |
Records the debate timeout enforcement + OpenCode template restoration. |
.gitignore |
Ignores workflow-status.json. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Code Review
The pull request effectively implements the 240s timeout enforcement across the debate workflow and fixes a critical bug in the adapter transformation logic that was causing valid templates to be incorrectly replaced by placeholders. The transition to a regex-based detection for the let keyword is a significant improvement. However, there are a few areas where the implementation could be more robust, specifically regarding the detection of JavaScript destructuring and the consistency of the detection logic for other keywords like const. Additionally, a markdown syntax issue was identified in the learn-agent.md template, and the extensive code duplication across multiple adapter-transforms.js files remains a maintenance concern.
…ted markdown fences in learn-agent
- Extend adapter-transforms regex to also match const/let destructuring patterns
({ } and [ ] immediately after keyword) per Gemini review suggestion
- Fix nested markdown fences in plugins/learn/agents/learn-agent.md: use 4-backtick
outer fence around the guide template that contains inner 3-backtick code blocks
(fixes malformed rendering reported by Copilot and Gemini)
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3c3c0c3a68
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| Args: "{proposer_prompt}" --tool=[proposer] --effort=[effort] [--model=[model_proposer]] [--context=[context]] | ||
| ``` | ||
|
|
||
| Set a 240-second timeout on this invocation. If it exceeds 240s, treat as a tool failure for this round. |
There was a problem hiding this comment.
Keep debate timeout aligned with consult skill limit
This change upgrades debate calls to a 240s timeout, but debate executions are still delegated to the consult skill, whose timeout path remains "Timeout after 120s" (plugins/consult/skills/consult/SKILL.md, Error Handling). In practice, any proposer/challenger call that runs between 120s and 240s will still fail early, so the new 240s behavior is not actually achievable through the current integration and can cause unexpected mid-round abort/synthesis in longer debates.
Useful? React with 👍 / 👎.
Summary
lib/adapter-transforms.jsfalse-positive: regexletmatched natural language (e.g. "Do NOT let the proposer") causing Challenger Follow-up template to be silently dropped from OpenCode adapter; narrowed to^\s*let\s+[a-zA-Z_$](JS declarations only)__tests__/debate-command.test.jsTest Plan
npm test)npm test -- --testPathPattern=debate(78 tests including new adapter consistency tests)npx agentsys-dev validatepasses all validatorsnpx agentsys-dev gen-adapters --checkpasses (adapters in sync)Related Issues
Closes #233