fix(runtime): answer every native tool call in a batch before the next completion#336
Conversation
…t completion Live-build finding (2026-06-12): when the model emitted several native tool_calls in one assistant message and an early call failed (or the run was cancelled mid-batch), the remaining calls never received a tool response. The session history then violated the provider protocol and the NEXT completion failed with 400 "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'" — killing the session and triggering recovery for a self-inflicted reason. _execute_native_tool_calls now tracks answered tool_call ids and, in a finally block, backfills every unanswered call with a structured not-executed result (ok=false, error_code=tool_call_not_executed). The invariant holds across all exits: batch-stopping failures, cancellation, and exceptions. Skipped calls are answered, never executed. Regression test: a two-call batch whose first call fails asserts the second call is answered with the backfill payload and its action never ran. 241 passed on the runtime suite. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Oleg Miagkov <mrobenner@gmail.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR adds a backfill mechanism to the native tool-call execution path. The runtime now tracks answered tool-call IDs and automatically emits "not executed" results for any unanswered calls in the batch before the next completion, ensuring the protocol invariant holds even when early exits occur. ChangesTool-call result backfill for native tool execution
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|



Находка живой сборки №5 (баг A — протокол tool-calls)
Когда модель шлёт несколько native tool_calls в одном assistant-сообщении и ранний вызов падает (или прилетает cancel посреди батча), хвост вызовов оставался без tool-ответов. История сессии нарушала протокол провайдера, и следующий комплишен падал с 400:
assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'— сессия умирала по самонанесённой причине и зря дёргала recovery (наблюдалось в run-5:call_rfdJ…без ответа).Фикс
_execute_native_tool_callsведёт учёт отвеченных id и вfinallyбэкфиллит каждый неотвеченный вызов структурным not-executed результатом (ok=false,error_code=tool_call_not_executed). Инвариант держится на всех выходах: батч-стоп по фейлу, cancel, исключение. Пропущенные вызовы отвечаются, но не исполняются.Tests
Регрессионный тест: батч из двух вызовов, первый падает → второй получает бэкфилл-ответ, его экшен не исполнялся. 241 passed (полная runtime-сьюта), ruff clean.
Завершает четвёрку фиксов живой сборки: #333 (трансляция уровня), #334 (provider-aware fallback), #335 (пин провайдера + CI-подсказка), этот — протокол. После мержа перегоню живую сборку начисто.
🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Tests