Skip to content

fix(runtime): answer every native tool call in a batch before the next completion#336

Merged
OBenner merged 1 commit into
developfrom
fix/native-tool-call-backfill
Jun 12, 2026
Merged

fix(runtime): answer every native tool call in a batch before the next completion#336
OBenner merged 1 commit into
developfrom
fix/native-tool-call-backfill

Conversation

@OBenner

@OBenner OBenner commented Jun 12, 2026

Copy link
Copy Markdown
Owner

Находка живой сборки №5 (баг A — протокол tool-calls)

Когда модель шлёт несколько native tool_calls в одном assistant-сообщении и ранний вызов падает (или прилетает cancel посреди батча), хвост вызовов оставался без tool-ответов. История сессии нарушала протокол провайдера, и следующий комплишен падал с 400: assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id' — сессия умирала по самонанесённой причине и зря дёргала recovery (наблюдалось в run-5: call_rfdJ… без ответа).

Фикс

_execute_native_tool_calls ведёт учёт отвеченных id и в finally бэкфиллит каждый неотвеченный вызов структурным not-executed результатом (ok=false, error_code=tool_call_not_executed). Инвариант держится на всех выходах: батч-стоп по фейлу, cancel, исключение. Пропущенные вызовы отвечаются, но не исполняются.

Tests

Регрессионный тест: батч из двух вызовов, первый падает → второй получает бэкфилл-ответ, его экшен не исполнялся. 241 passed (полная runtime-сьюта), ruff clean.

Завершает четвёрку фиксов живой сборки: #333 (трансляция уровня), #334 (provider-aware fallback), #335 (пин провайдера + CI-подсказка), этот — протокол. После мержа перегоню живую сборку начисто.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Fixed tool call result tracking to ensure all tool calls receive corresponding completion results, even when execution is interrupted or fails early.
  • Tests

    • Added test coverage for tool result handling in batch execution scenarios.

…t completion

Live-build finding (2026-06-12): when the model emitted several native
tool_calls in one assistant message and an early call failed (or the run
was cancelled mid-batch), the remaining calls never received a tool
response. The session history then violated the provider protocol and
the NEXT completion failed with 400 "An assistant message with
'tool_calls' must be followed by tool messages responding to each
'tool_call_id'" — killing the session and triggering recovery for a
self-inflicted reason.

_execute_native_tool_calls now tracks answered tool_call ids and, in a
finally block, backfills every unanswered call with a structured
not-executed result (ok=false, error_code=tool_call_not_executed). The
invariant holds across all exits: batch-stopping failures, cancellation,
and exceptions. Skipped calls are answered, never executed.

Regression test: a two-call batch whose first call fails asserts the
second call is answered with the backfill payload and its action never
ran. 241 passed on the runtime suite.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Oleg Miagkov <mrobenner@gmail.com>
@github-actions github-actions Bot added area/backend bug Something isn't working size/M labels Jun 12, 2026
@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3e6fe035-c966-471c-a7f4-fbd101415406

📥 Commits

Reviewing files that changed from the base of the PR and between f960903 and 3685b63.

📒 Files selected for processing (2)
  • apps/backend/agents/runtime/adapters/generic_edit.py
  • tests/test_agent_runtime.py

📝 Walkthrough

Walkthrough

This PR adds a backfill mechanism to the native tool-call execution path. The runtime now tracks answered tool-call IDs and automatically emits "not executed" results for any unanswered calls in the batch before the next completion, ensuring the protocol invariant holds even when early exits occur.

Changes

Tool-call result backfill for native tool execution

Layer / File(s) Summary
Tool-call backfill mechanism
apps/backend/agents/runtime/adapters/generic_edit.py
_execute_native_tool_actions initializes answered_tool_call_ids set, delegates tool loop execution, and uses a finally block to backfill add_tool_result entries for unanswered IDs. During execution in _run_native_tool_call_loop, non-empty tool_call_id values are added to the set as they are answered.
Backfill verification test
tests/test_agent_runtime.py
New test test_generic_edit_native_tools_backfill_unanswered_batch_calls confirms the runtime backfills results for unanswered tool calls when earlier calls fail, marking them with ok=False and error_code="tool_call_not_executed".

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

A rabbit hops through tool-call land,
Where answers must be close at hand—
No tool left waiting, lost, alone,
The backfill ensures each one's known! 🐰✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main fix: ensuring every native tool call in a batch receives a response before the next completion.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/native-tool-call-backfill

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@sonarqubecloud

Copy link
Copy Markdown

@OBenner OBenner merged commit fd96b5c into develop Jun 12, 2026
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/backend bug Something isn't working size/M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant