fix(runtime): answer every native tool call in a batch before the next completion by OBenner · Pull Request #336 · OBenner/Auto-Coding

OBenner · 2026-06-12T18:20:46Z

Находка живой сборки №5 (баг A — протокол tool-calls)

Когда модель шлёт несколько native tool_calls в одном assistant-сообщении и ранний вызов падает (или прилетает cancel посреди батча), хвост вызовов оставался без tool-ответов. История сессии нарушала протокол провайдера, и следующий комплишен падал с 400: assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id' — сессия умирала по самонанесённой причине и зря дёргала recovery (наблюдалось в run-5: call_rfdJ… без ответа).

Фикс

_execute_native_tool_calls ведёт учёт отвеченных id и в finally бэкфиллит каждый неотвеченный вызов структурным not-executed результатом (ok=false, error_code=tool_call_not_executed). Инвариант держится на всех выходах: батч-стоп по фейлу, cancel, исключение. Пропущенные вызовы отвечаются, но не исполняются.

Tests

Регрессионный тест: батч из двух вызовов, первый падает → второй получает бэкфилл-ответ, его экшен не исполнялся. 241 passed (полная runtime-сьюта), ruff clean.

Завершает четвёрку фиксов живой сборки: #333 (трансляция уровня), #334 (provider-aware fallback), #335 (пин провайдера + CI-подсказка), этот — протокол. После мержа перегоню живую сборку начисто.

🤖 Generated with Claude Code

Summary by CodeRabbit

Bug Fixes
- Fixed tool call result tracking to ensure all tool calls receive corresponding completion results, even when execution is interrupted or fails early.
Tests
- Added test coverage for tool result handling in batch execution scenarios.

…t completion Live-build finding (2026-06-12): when the model emitted several native tool_calls in one assistant message and an early call failed (or the run was cancelled mid-batch), the remaining calls never received a tool response. The session history then violated the provider protocol and the NEXT completion failed with 400 "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'" — killing the session and triggering recovery for a self-inflicted reason. _execute_native_tool_calls now tracks answered tool_call ids and, in a finally block, backfills every unanswered call with a structured not-executed result (ok=false, error_code=tool_call_not_executed). The invariant holds across all exits: batch-stopping failures, cancellation, and exceptions. Skipped calls are answered, never executed. Regression test: a two-call batch whose first call fails asserts the second call is answered with the backfill payload and its action never ran. 241 passed on the runtime suite. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Oleg Miagkov <mrobenner@gmail.com>

coderabbitai · 2026-06-12T18:21:01Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3e6fe035-c966-471c-a7f4-fbd101415406

📥 Commits

Reviewing files that changed from the base of the PR and between f960903 and 3685b63.

📒 Files selected for processing (2)

apps/backend/agents/runtime/adapters/generic_edit.py
tests/test_agent_runtime.py

📝 Walkthrough

Walkthrough

This PR adds a backfill mechanism to the native tool-call execution path. The runtime now tracks answered tool-call IDs and automatically emits "not executed" results for any unanswered calls in the batch before the next completion, ensuring the protocol invariant holds even when early exits occur.

Changes

Tool-call result backfill for native tool execution

Layer / File(s)	Summary
Tool-call backfill mechanism `apps/backend/agents/runtime/adapters/generic_edit.py`	`_execute_native_tool_actions` initializes `answered_tool_call_ids` set, delegates tool loop execution, and uses a finally block to backfill `add_tool_result` entries for unanswered IDs. During execution in `_run_native_tool_call_loop`, non-empty `tool_call_id` values are added to the set as they are answered.
Backfill verification test `tests/test_agent_runtime.py`	New test `test_generic_edit_native_tools_backfill_unanswered_batch_calls` confirms the runtime backfills results for unanswered tool calls when earlier calls fail, marking them with `ok=False` and `error_code="tool_call_not_executed"`.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

A rabbit hops through tool-call land,
Where answers must be close at hand—
No tool left waiting, lost, alone,
The backfill ensures each one's known! 🐰✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main fix: ensuring every native tool call in a batch receives a response before the next completion.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/native-tool-call-backfill

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

sonarqubecloud · 2026-06-12T18:22:46Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

github-actions Bot added area/backend bug Something isn't working size/M labels Jun 12, 2026

OBenner merged commit fd96b5c into develop Jun 12, 2026
20 checks passed

OBenner mentioned this pull request Jun 12, 2026

fix(coder): replace claude-family phase default with the provider's own model #337

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(runtime): answer every native tool call in a batch before the next completion#336

fix(runtime): answer every native tool call in a batch before the next completion#336
OBenner merged 1 commit into
developfrom
fix/native-tool-call-backfill

OBenner commented Jun 12, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 12, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

sonarqubecloud Bot commented Jun 12, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

OBenner commented Jun 12, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Находка живой сборки №5 (баг A — протокол tool-calls)

Фикс

Tests

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

sonarqubecloud Bot commented Jun 12, 2026

Quality Gate passed

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

OBenner commented Jun 12, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 12, 2026 •

edited

Loading