Skip to content

fix(debate): enforce 240s timeout on external tool invocations (#233)#240

Merged
avifenesh merged 8 commits intomainfrom
fix/debate-timeout-233
Feb 20, 2026
Merged

fix(debate): enforce 240s timeout on external tool invocations (#233)#240
avifenesh merged 8 commits intomainfrom
fix/debate-timeout-233

Conversation

@avifenesh
Copy link
Copy Markdown
Collaborator

Summary

  • Add 240s timeout enforcement to all debate entry points (command, adapters, agents) — previously the command had no timeout at all, allowing Codex and other external tools to stall indefinitely
  • Clarify timeout wording to be detectable/post-hoc rather than aspirational ("track invocation start time, if >240s treat as failure") and add WHY rationale to constraint
  • Fix lib/adapter-transforms.js false-positive: regex let matched natural language (e.g. "Do NOT let the proposer") causing Challenger Follow-up template to be silently dropped from OpenCode adapter; narrowed to ^\s*let\s+[a-zA-Z_$] (JS declarations only)
  • Add "All rounds timeout" error row to command error tables (was in skill but missing from commands)
  • Add adapter consistency tests to __tests__/debate-command.test.js

Test Plan

  • All 4041 tests pass (npm test)
  • Debate tests: npm test -- --testPathPattern=debate (78 tests including new adapter consistency tests)
  • New adapter tests verify: codex skill 240s, opencode command 240s + all-rounds-timeout, opencode agent inline timeout, opencode skill Challenger Follow-up completeness
  • npx agentsys-dev validate passes all validators
  • npx agentsys-dev gen-adapters --check passes (adapters in sync)

Related Issues

Closes #233

Debate rounds could stall indefinitely waiting for Codex or other tool
responses. Add explicit 240s timeout enforcement:

- Add timeout constraint to command files (plugins + adapters)
- Add timeout notes after each Skill:consult invocation in Phase 3b
- Add "Tool invocation timeout (>240s)" row to all error handling tables
- Update agent timeout from 120s to 240s
- Add test coverage for timeout handling
…late, adapter tests

- Add inline 240s timeout instruction at step 3b/3d in both orchestrator agents
  (plugins and opencode adapter), matching the pattern already used in command files
- Restore missing Challenger Follow-up template in adapters/opencode/skills/debate/SKILL.md
  (was replaced with a non-functional stub comment)
- Add 'All rounds timeout' error row to both command error tables (was in skill but not commands)
- Add adapter file loading and consistency tests to debate-command.test.js:
  codex skill 240s, opencode command 240s + all-rounds-timeout, opencode agent inline timeout,
  opencode skill Challenger Follow-up completeness
…ives on natural language

The gen-adapters transformer was replacing fenced code blocks containing 'let '
with '*(JavaScript reference - not executable in OpenCode)*' even when the block
was plain prose (e.g. 'Do NOT let the proposer reframe'). The regex matched any
occurrence of 'let ' including natural language usage.

Fix: use /^\s*let\s+[a-zA-Z_$]/m to only match 'let' at line start followed by
an identifier character — the pattern for actual JS variable declarations.

This restores the full Challenger Follow-up template in the OpenCode debate skill
and a knowledge guide template in the learn agent adapter.
…hestrator agents

- Reword 'Set a 240-second timeout' to 'Track invocation start time. If >240s, treat as failure'
  to reflect what an agent can actually detect (post-hoc) vs aspirationally cancel
- Add WHY to constraint: 'external tools can hang indefinitely, blocking remaining rounds'
  to improve LLM compliance with the timeout rule
Copilot AI review requested due to automatic review settings February 20, 2026 10:12
@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello @avifenesh, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness of the debate workflow by introducing a mandatory 240-second timeout for all external tool interactions. This change addresses a critical issue where external tools could stall indefinitely, blocking the debate process. Alongside this, the PR refines timeout documentation, fixes a subtle regex bug affecting adapter templates, and adds comprehensive tests to ensure consistent timeout behavior across various components, ultimately improving the reliability and predictability of the debate system.

Highlights

  • Timeout Enforcement: Implemented a hard 240-second timeout for all external tool invocations within the debate workflow, including commands, adapters, and agents. This prevents indefinite stalls from tools like Codex.
  • Timeout Wording Clarification: Updated documentation to clarify timeout handling as detectable/post-hoc ('track invocation start time, if >240s treat as failure') and added rationale for the constraint.
  • Regex Fix for let Declarations: Corrected a false-positive regex in lib/adapter-transforms.js that incorrectly matched natural language containing 'let', which was causing the Challenger Follow-up template to be dropped in the OpenCode adapter. The regex is now narrowed to match only JavaScript let declarations.
  • New Error Row for All Rounds Timeout: Added an 'All rounds timeout' error row to command error tables, ensuring this specific failure mode is properly documented and handled.
  • Adapter Consistency Tests: Introduced new tests to __tests__/debate-command.test.js to verify adapter consistency, specifically for timeout enforcement and template completeness across different adapters.
Changelog
  • CHANGELOG.md
    • Added a new entry under 'Fixed' for the 240s timeout enforcement in the /debate workflow, detailing its impact on different debate stages and tools, and noting the restoration of the Challenger Follow-up template in the OpenCode adapter.
Activity
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6d163b11ac

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Tests expected old exact phrases after enhance rewording:
- 'MUST enforce 240s timeout per invocation' now matches /240s? timeout/i
- Updated inline test to match 'Track invocation start time' pattern
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the debate workflow documentation/templates and adapter generation logic to (a) standardize a 240s per-invocation timeout policy for external tool calls and (b) prevent OpenCode adapter transforms from incorrectly treating natural-language “let …” as JavaScript, plus it adds tests to keep adapters consistent.

Changes:

  • Add/standardize 240s timeout handling guidance across debate command/skill/agents and adapters, including “all rounds timeout” documentation.
  • Fix OpenCode adapter transform false-positive by narrowing let detection to JS-style declarations only.
  • Add/extend __tests__/debate-command.test.js to validate adapter consistency and the restored Challenger Follow-up template.

Reviewed changes

Copilot reviewed 22 out of 23 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
lib/adapter-transforms.js Narrows let detection to avoid dropping natural-language templates during OpenCode transform.
plugins/sync-docs/lib/adapter-transforms.js Mirrors the let detection fix in plugin-local lib copy.
plugins/ship/lib/adapter-transforms.js Mirrors the let detection fix in plugin-local lib copy.
plugins/repo-map/lib/adapter-transforms.js Mirrors the let detection fix in plugin-local lib copy.
plugins/perf/lib/adapter-transforms.js Mirrors the let detection fix in plugin-local lib copy.
plugins/next-task/lib/adapter-transforms.js Mirrors the let detection fix in plugin-local lib copy.
plugins/learn/lib/adapter-transforms.js Mirrors the let detection fix in plugin-local lib copy.
plugins/enhance/lib/adapter-transforms.js Mirrors the let detection fix in plugin-local lib copy.
plugins/drift-detect/lib/adapter-transforms.js Mirrors the let detection fix in plugin-local lib copy.
plugins/deslop/lib/adapter-transforms.js Mirrors the let detection fix in plugin-local lib copy.
plugins/debate/lib/adapter-transforms.js Mirrors the let detection fix in plugin-local lib copy.
plugins/audit-project/lib/adapter-transforms.js Mirrors the let detection fix in plugin-local lib copy.
plugins/debate/skills/debate/SKILL.md Documents tool-invocation timeout behavior in the error-handling table.
plugins/debate/commands/debate.md Adds explicit 240s timeout enforcement guidance + error-table rows (incl. all-rounds-timeout).
plugins/debate/agents/debate-orchestrator.md Updates orchestrator constraints/guidance to 240s per-invocation handling.
adapters/opencode/skills/debate/SKILL.md Restores the full “Round 2+: Challenger Follow-up” template in the OpenCode adapter.
adapters/opencode/commands/debate.md Brings OpenCode adapter command docs in line with 240s timeout + error-table updates.
adapters/opencode/agents/debate-orchestrator.md Brings OpenCode adapter agent guidance in line with 240s timeout handling.
adapters/opencode/agents/learn-agent.md Replaces an OpenCode stub with a concrete markdown template example (but introduces a fencing issue).
adapters/codex/skills/debate/SKILL.md Brings Codex adapter skill docs in line with 240s timeout + all-rounds-timeout error.
__tests__/debate-command.test.js Adds adapter consistency assertions for timeout + all-rounds-timeout + template completeness.
CHANGELOG.md Records the debate timeout enforcement + OpenCode template restoration.
.gitignore Ignores workflow-status.json.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request effectively implements the 240s timeout enforcement across the debate workflow and fixes a critical bug in the adapter transformation logic that was causing valid templates to be incorrectly replaced by placeholders. The transition to a regex-based detection for the let keyword is a significant improvement. However, there are a few areas where the implementation could be more robust, specifically regarding the detection of JavaScript destructuring and the consistency of the detection logic for other keywords like const. Additionally, a markdown syntax issue was identified in the learn-agent.md template, and the extensive code duplication across multiple adapter-transforms.js files remains a maintenance concern.

…ted markdown fences in learn-agent

- Extend adapter-transforms regex to also match const/let destructuring patterns
  ({ } and [ ] immediately after keyword) per Gemini review suggestion
- Fix nested markdown fences in plugins/learn/agents/learn-agent.md: use 4-backtick
  outer fence around the guide template that contains inner 3-backtick code blocks
  (fixes malformed rendering reported by Copilot and Gemini)
@avifenesh avifenesh merged commit d8505f5 into main Feb 20, 2026
20 checks passed
@avifenesh avifenesh deleted the fix/debate-timeout-233 branch February 20, 2026 10:29
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3c3c0c3a68

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Args: "{proposer_prompt}" --tool=[proposer] --effort=[effort] [--model=[model_proposer]] [--context=[context]]
```

Set a 240-second timeout on this invocation. If it exceeds 240s, treat as a tool failure for this round.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep debate timeout aligned with consult skill limit

This change upgrades debate calls to a 240s timeout, but debate executions are still delegated to the consult skill, whose timeout path remains "Timeout after 120s" (plugins/consult/skills/consult/SKILL.md, Error Handling). In practice, any proposer/challenger call that runs between 120s and 240s will still fail early, so the new 240s behavior is not actually achievable through the current integration and can cause unexpected mid-round abort/synthesis in longer debates.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: debate rounds can stall waiting for Codex response, timeout handling is insufficient

2 participants