fix(debate): enforce 240s timeout on external tool invocations (#233) by avifenesh · Pull Request #240 · agent-sh/agentsys

avifenesh · 2026-02-20T10:12:28Z

Summary

Add 240s timeout enforcement to all debate entry points (command, adapters, agents) — previously the command had no timeout at all, allowing Codex and other external tools to stall indefinitely
Clarify timeout wording to be detectable/post-hoc rather than aspirational ("track invocation start time, if >240s treat as failure") and add WHY rationale to constraint
Fix a false positive in lib/adapter-transforms.js: the check for `let ` matched natural language (e.g. "Do NOT let the proposer"), causing the Challenger Follow-up template to be silently dropped from the OpenCode adapter; narrowed to `^\s*let\s+[a-zA-Z_$]` (JS declarations only)
Add "All rounds timeout" error row to command error tables (was in skill but missing from commands)
Add adapter consistency tests to __tests__/debate-command.test.js

Test Plan

All 4041 tests pass (npm test)
Debate tests: npm test -- --testPathPattern=debate (78 tests including new adapter consistency tests)
New adapter tests verify: codex skill 240s, opencode command 240s + all-rounds-timeout, opencode agent inline timeout, opencode skill Challenger Follow-up completeness
npx agentsys-dev validate passes all validators
npx agentsys-dev gen-adapters --check passes (adapters in sync)

Related Issues

Closes #233

Debate rounds could stall indefinitely waiting for Codex or other tool responses. Add explicit 240s timeout enforcement:

- Add timeout constraint to command files (plugins + adapters)
- Add timeout notes after each Skill:consult invocation in Phase 3b
- Add "Tool invocation timeout (>240s)" row to all error handling tables
- Update agent timeout from 120s to 240s
- Add test coverage for timeout handling

…late, adapter tests

- Add inline 240s timeout instruction at step 3b/3d in both orchestrator agents (plugins and opencode adapter), matching the pattern already used in command files
- Restore missing Challenger Follow-up template in adapters/opencode/skills/debate/SKILL.md (was replaced with a non-functional stub comment)
- Add 'All rounds timeout' error row to both command error tables (was in skill but not commands)
- Add adapter file loading and consistency tests to debate-command.test.js: codex skill 240s, opencode command 240s + all-rounds-timeout, opencode agent inline timeout, opencode skill Challenger Follow-up completeness
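The difference between the per-invocation timeout row and the 'All rounds timeout' row can be sketched as a small classifier. This is an illustration only: the function name `classifyTimeouts`, the result strings, and the `elapsedSeconds` field are hypothetical, and only the 240s threshold comes from the PR.

```javascript
const LIMIT_SECONDS = 240; // threshold from the PR

// Hypothetical classifier: distinguishes "some rounds timed out" from
// "every round timed out", mirroring the two error-table rows.
function classifyTimeouts(rounds) {
  const timedOut = rounds.filter((r) => r.elapsedSeconds > LIMIT_SECONDS).length;
  if (rounds.length > 0 && timedOut === rounds.length) return 'all-rounds-timeout';
  if (timedOut > 0) return 'tool-invocation-timeout';
  return 'ok';
}

console.log(classifyTimeouts([{ elapsedSeconds: 30 }, { elapsedSeconds: 300 }])); // tool-invocation-timeout
console.log(classifyTimeouts([{ elapsedSeconds: 250 }, { elapsedSeconds: 300 }])); // all-rounds-timeout
```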

…low-status.json

…ives on natural language

The gen-adapters transformer was replacing fenced code blocks containing 'let ' with '*(JavaScript reference - not executable in OpenCode)*' even when the block was plain prose (e.g. 'Do NOT let the proposer reframe'). The regex matched any occurrence of 'let ' including natural language usage.

Fix: use `/^\s*let\s+[a-zA-Z_$]/m` to only match 'let' at line start followed by an identifier character, the pattern for actual JS variable declarations. This restores the full Challenger Follow-up template in the OpenCode debate skill and a knowledge guide template in the learn agent adapter.
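The false positive and the narrowed pattern can be reproduced in a few lines. This is a minimal sketch: the helper names `looksLikeJsOld` and `looksLikeJsNew` are hypothetical, and the old check is assumed to be a plain substring test for `let `, as the commit message describes. Only the regex itself is taken from the PR.

```javascript
// Hypothetical stand-in for the old check: any occurrence of 'let ' counts as JS.
function looksLikeJsOld(block) {
  return block.includes('let ');
}

// Narrowed check from the PR: 'let' at line start followed by an identifier start character.
function looksLikeJsNew(block) {
  return /^\s*let\s+[a-zA-Z_$]/m.test(block);
}

const prose = 'You are the challenger.\nDo NOT let the proposer reframe the question.';
const code = 'const a = 1;\n  let count = 0;\ncount += a;';

console.log(looksLikeJsOld(prose), looksLikeJsNew(prose)); // true false
console.log(looksLikeJsOld(code), looksLikeJsNew(code));   // true true
```

Note that a prose line beginning with a lowercase `let` followed by a word would still match, so the narrowed pattern reduces false positives rather than ruling them out.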

…hestrator agents

- Reword 'Set a 240-second timeout' to 'Track invocation start time. If >240s, treat as failure' to reflect what an agent can actually detect (post-hoc) vs aspirationally cancel
- Add WHY to constraint: 'external tools can hang indefinitely, blocking remaining rounds' to improve LLM compliance with the timeout rule
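The post-hoc wording ('Track invocation start time. If >240s, treat as failure') can be illustrated in JavaScript. This is a sketch under assumptions, not how the agents work: they follow prose instructions, and `invokeWithDeadline`, `TIMEOUT_MS`, and the result shape are hypothetical names.

```javascript
const TIMEOUT_MS = 240_000; // 240s limit from the PR

// Hypothetical wrapper: records the start time, awaits the tool, then checks
// the elapsed time after the fact. It does not cancel a running invocation.
async function invokeWithDeadline(invoke, limitMs = TIMEOUT_MS, now = Date.now) {
  const startedAt = now();
  let output;
  try {
    output = await invoke();
  } catch (err) {
    return { ok: false, reason: 'tool-error', elapsedMs: now() - startedAt };
  }
  const elapsedMs = now() - startedAt;
  if (elapsedMs > limitMs) {
    // A late response is treated as a failed round even though output arrived.
    return { ok: false, reason: 'timeout', elapsedMs };
  }
  return { ok: true, output, elapsedMs };
}
```

Because the check runs only after `invoke()` settles, a tool that never returns would still block; detecting that case needs a real cancellation mechanism, which is outside what this sketch shows.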

gemini-code-assist · 2026-02-20T10:12:56Z

Summary of Changes

Hello @avifenesh, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness of the debate workflow by introducing a mandatory 240-second timeout for all external tool interactions. This change addresses a critical issue where external tools could stall indefinitely, blocking the debate process. Alongside this, the PR refines timeout documentation, fixes a subtle regex bug affecting adapter templates, and adds comprehensive tests to ensure consistent timeout behavior across various components, ultimately improving the reliability and predictability of the debate system.

Highlights

Timeout Enforcement: Implemented a hard 240-second timeout for all external tool invocations within the debate workflow, including commands, adapters, and agents. This prevents indefinite stalls from tools like Codex.
Timeout Wording Clarification: Updated documentation to clarify timeout handling as detectable/post-hoc ('track invocation start time, if >240s treat as failure') and added rationale for the constraint.
Regex Fix for let Declarations: Corrected a false-positive regex in lib/adapter-transforms.js that incorrectly matched natural language containing 'let', which was causing the Challenger Follow-up template to be dropped in the OpenCode adapter. The regex is now narrowed to match only JavaScript let declarations.
New Error Row for All Rounds Timeout: Added an 'All rounds timeout' error row to command error tables, ensuring this specific failure mode is properly documented and handled.
Adapter Consistency Tests: Introduced new tests to __tests__/debate-command.test.js to verify adapter consistency, specifically for timeout enforcement and template completeness across different adapters.

Changelog

CHANGELOG.md
- Added a new entry under 'Fixed' for the 240s timeout enforcement in the /debate workflow, detailing its impact on different debate stages and tools, and noting the restoration of the Challenger Follow-up template in the OpenCode adapter.

Activity

The pull request author provided a detailed summary of changes, including timeout enforcement, regex fix, and new tests.
A comprehensive test plan was outlined, covering existing tests, new adapter consistency tests, and validation steps.
The pull request addresses and closes issue #233 ("bug: debate rounds can stall waiting for Codex response, timeout handling is insufficient").


Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6d163b11ac

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

__tests__/debate-command.test.js

Tests expected old exact phrases after enhance rewording:

- 'MUST enforce 240s timeout per invocation' now matches /240s? timeout/i
- Updated inline test to match 'Track invocation start time' pattern
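The loosened pattern can be checked directly. A minimal sketch: the first sample is the old phrase named in the commit message, the second is an illustrative rewording invented here, and the third is the wording that appears in the command diff.

```javascript
const timeoutPattern = /240s? timeout/i;

const samples = [
  'MUST enforce 240s timeout per invocation',    // old exact phrase
  'Enforce a hard 240s Timeout per tool call',   // illustrative rewording (hypothetical)
  'Set a 240-second timeout on this invocation', // hyphenated form
];

for (const text of samples) {
  console.log(timeoutPattern.test(text), text);
}
// true  MUST enforce 240s timeout per invocation
// true  Enforce a hard 240s Timeout per tool call
// false Set a 240-second timeout on this invocation
```

The pattern tolerates case changes and a missing "s", but not the hyphenated "240-second" form, so a test relying on it would need a separate assertion for that wording.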

Copilot

Pull request overview

This PR updates the debate workflow documentation/templates and adapter generation logic to (a) standardize a 240s per-invocation timeout policy for external tool calls and (b) prevent OpenCode adapter transforms from incorrectly treating natural-language “let …” as JavaScript, plus it adds tests to keep adapters consistent.

Changes:

Add/standardize 240s timeout handling guidance across debate command/skill/agents and adapters, including “all rounds timeout” documentation.
Fix OpenCode adapter transform false-positive by narrowing let detection to JS-style declarations only.
Add/extend __tests__/debate-command.test.js to validate adapter consistency and the restored Challenger Follow-up template.

Reviewed changes

Copilot reviewed 22 out of 23 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `lib/adapter-transforms.js` | Narrows `let` detection to avoid dropping natural-language templates during OpenCode transform. |
| `plugins/sync-docs/lib/adapter-transforms.js` | Mirrors the `let` detection fix in plugin-local lib copy. |
| `plugins/ship/lib/adapter-transforms.js` | Mirrors the `let` detection fix in plugin-local lib copy. |
| `plugins/repo-map/lib/adapter-transforms.js` | Mirrors the `let` detection fix in plugin-local lib copy. |
| `plugins/perf/lib/adapter-transforms.js` | Mirrors the `let` detection fix in plugin-local lib copy. |
| `plugins/next-task/lib/adapter-transforms.js` | Mirrors the `let` detection fix in plugin-local lib copy. |
| `plugins/learn/lib/adapter-transforms.js` | Mirrors the `let` detection fix in plugin-local lib copy. |
| `plugins/enhance/lib/adapter-transforms.js` | Mirrors the `let` detection fix in plugin-local lib copy. |
| `plugins/drift-detect/lib/adapter-transforms.js` | Mirrors the `let` detection fix in plugin-local lib copy. |
| `plugins/deslop/lib/adapter-transforms.js` | Mirrors the `let` detection fix in plugin-local lib copy. |
| `plugins/debate/lib/adapter-transforms.js` | Mirrors the `let` detection fix in plugin-local lib copy. |
| `plugins/audit-project/lib/adapter-transforms.js` | Mirrors the `let` detection fix in plugin-local lib copy. |
| `plugins/debate/skills/debate/SKILL.md` | Documents tool-invocation timeout behavior in the error-handling table. |
| `plugins/debate/commands/debate.md` | Adds explicit 240s timeout enforcement guidance + error-table rows (incl. all-rounds-timeout). |
| `plugins/debate/agents/debate-orchestrator.md` | Updates orchestrator constraints/guidance to 240s per-invocation handling. |
| `adapters/opencode/skills/debate/SKILL.md` | Restores the full “Round 2+: Challenger Follow-up” template in the OpenCode adapter. |
| `adapters/opencode/commands/debate.md` | Brings OpenCode adapter command docs in line with 240s timeout + error-table updates. |
| `adapters/opencode/agents/debate-orchestrator.md` | Brings OpenCode adapter agent guidance in line with 240s timeout handling. |
| `adapters/opencode/agents/learn-agent.md` | Replaces an OpenCode stub with a concrete markdown template example (but introduces a fencing issue). |
| `adapters/codex/skills/debate/SKILL.md` | Brings Codex adapter skill docs in line with 240s timeout + all-rounds-timeout error. |
| `__tests__/debate-command.test.js` | Adds adapter consistency assertions for timeout + all-rounds-timeout + template completeness. |
| `CHANGELOG.md` | Records the debate timeout enforcement + OpenCode template restoration. |
| `.gitignore` | Ignores `workflow-status.json`. |


adapters/opencode/agents/learn-agent.md

__tests__/debate-command.test.js

gemini-code-assist

Code Review

The pull request effectively implements the 240s timeout enforcement across the debate workflow and fixes a critical bug in the adapter transformation logic that was causing valid templates to be incorrectly replaced by placeholders. The transition to a regex-based detection for the let keyword is a significant improvement. However, there are a few areas where the implementation could be more robust, specifically regarding the detection of JavaScript destructuring and the consistency of the detection logic for other keywords like const. Additionally, a markdown syntax issue was identified in the learn-agent.md template, and the extensive code duplication across multiple adapter-transforms.js files remains a maintenance concern.

lib/adapter-transforms.js

adapters/opencode/agents/learn-agent.md

…ted markdown fences in learn-agent

- Extend adapter-transforms regex to also match const/let destructuring patterns ({ } and [ ] immediately after keyword) per Gemini review suggestion
- Fix nested markdown fences in plugins/learn/agents/learn-agent.md: use 4-backtick outer fence around the guide template that contains inner 3-backtick code blocks (fixes malformed rendering reported by Copilot and Gemini)
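One way such an extension could look is sketched below. The regex merged in the PR is not shown on this page, so the pattern `jsDeclaration` here is a hypothetical reconstruction from the commit message, not the actual code in lib/adapter-transforms.js.

```javascript
// Hypothetical pattern: 'const' or 'let' at line start, followed either by
// whitespace and an identifier start, or directly by '{' or '[' (destructuring).
const jsDeclaration = /^\s*(?:const|let)(?:\s+[a-zA-Z_$]|\s*[{\[])/m;

const cases = [
  'const { a, b } = obj;',           // object destructuring
  'let [first] = list;',             // array destructuring
  'let{a}=obj',                      // no space after keyword
  'Do NOT let the proposer reframe', // prose
  'constant values are listed here', // prose starting with "const..."
];

for (const text of cases) {
  console.log(jsDeclaration.test(text), text);
}
// true, true, true, false, false (in order)
```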

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3c3c0c3a68


chatgpt-codex-connector · 2026-02-20T10:30:41Z

plugins/debate/commands/debate.md

 Args: "{proposer_prompt}" --tool=[proposer] --effort=[effort] [--model=[model_proposer]] [--context=[context]]
 ```

+Set a 240-second timeout on this invocation. If it exceeds 240s, treat as a tool failure for this round.


Keep debate timeout aligned with consult skill limit

This change upgrades debate calls to a 240s timeout, but debate executions are still delegated to the consult skill, whose timeout path remains "Timeout after 120s" (plugins/consult/skills/consult/SKILL.md, Error Handling). In practice, any proposer/challenger call that runs between 120s and 240s will still fail early, so the new 240s behavior is not actually achievable through the current integration and can cause unexpected mid-round abort/synthesis in longer debates.
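The concern above amounts to nested limits: when one layer delegates to another, the smaller limit decides when the call fails. A minimal sketch, assuming each layer fails the call as soon as its own limit is exceeded; the function name `effectiveTimeoutSeconds` is hypothetical, and the 120s figure is the consult skill limit as quoted in this review comment.

```javascript
// Hypothetical model: the first (smallest) limit reached decides the outcome.
function effectiveTimeoutSeconds(...layerLimits) {
  return Math.min(...layerLimits);
}

const debateLimit = 240;  // limit added by this PR
const consultLimit = 120; // "Timeout after 120s" in the consult skill, per the review

const effective = effectiveTimeoutSeconds(debateLimit, consultLimit);
console.log(effective); // 120

// A call that takes 180s is within the debate limit but past the consult limit.
const callSeconds = 180;
console.log(callSeconds <= debateLimit, callSeconds <= effective); // true false
```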


avifenesh added 6 commits February 20, 2026 11:37

fix(debate): add all-rounds-timeout to codex adapter, gitignore workf…

3ddf1f0

…low-status.json

docs: update CHANGELOG for debate timeout fix (#233)

d0bfab8

Copilot AI review requested due to automatic review settings February 20, 2026 10:12

Copilot started reviewing on behalf of avifenesh February 20, 2026 10:13 View session

chatgpt-codex-connector bot reviewed Feb 20, 2026

__tests__/debate-command.test.js (resolved)

__tests__/debate-command.test.js (outdated, resolved)

fix(tests): align test patterns with enhanced timeout wording

54c1090


Copilot AI reviewed Feb 20, 2026

adapters/opencode/agents/learn-agent.md (resolved)

__tests__/debate-command.test.js (resolved)

__tests__/debate-command.test.js (resolved)

gemini-code-assist bot reviewed Feb 20, 2026

lib/adapter-transforms.js (outdated, resolved)

adapters/opencode/agents/learn-agent.md (outdated, resolved)

avifenesh merged commit d8505f5 into main Feb 20, 2026
20 checks passed

avifenesh deleted the fix/debate-timeout-233 branch February 20, 2026 10:29

avifenesh mentioned this pull request Feb 20, 2026

bug: debate rounds can stall waiting for Codex response, timeout handling is insufficient #233

Closed

chatgpt-codex-connector bot reviewed Feb 20, 2026

avifenesh merged 8 commits into main from fix/debate-timeout-233


2 participants
