feat(forwarder): query SQL tool + OpenAI tool-use loop (#415) #426
Open

jonathaneoliver wants to merge 1 commit into
Adds the SQL `query` tool and the multi-step tool-use loop on top of the LLM client from #414. The LLM authors SQL via the tool, the forwarder runs it as the locked-down `llm_reader` ClickHouse user (#413), and the loop drives multi-iteration interactions: the model emits tool_calls, the forwarder runs them, appends the results as role:"tool" messages, and re-invokes until the model produces a final answer.

Files:

- `llm_tool_query.go` — `QueryTool` struct + `Run` + `OpenAITool` schema. The forwarder enforces a hard 10k-row / 10MB cap on the response read because ClickHouse 24.8's `max_result_rows` is cooperative for streaming SELECTs (the real fence is `max_rows_to_read` on the input side).
- `llm_chat.go` — `RunChatTurn(ctx, profile, msgs, opts)`. Iteration cap (`LLM_MAX_TOOL_CALLS_PER_TURN` env var, default 20), per-iteration timeout 30s, aggregate turn timeout 120s. When the budget is exhausted, forces `ToolChoice="none"` on the next request so the model is required to produce a final answer instead of looping indefinitely.
- `llm_chat_test.go` — 11 tests, including a body-inspection test that verifies `tool_choice="none"` actually fires on the wrap-up iteration.
- `llm_tool_query_test.go` — 7 tests, including 12k-row truncation, ctx cancellation propagation, and malformed JSON handling.

Verified: `go test ./...` — 33/33 pass.

Notable design choice — tool budget exhaustion: when the model exceeds maxCalls tool calls, the forwarder synthesizes a "budget exhausted" tool-result message AND sets `ToolChoice="none"` on the next request. Without `ToolChoice="none"` the model can keep emitting tool_calls right up to the iteration cap and never produce text; with it, the model is forced to give a final answer using what it has. A defensive guard was added for empty assistant messages (empty content + empty tool_calls + finish_reason="tool_calls" — a known provider bug).

Part of epic #412. Closes #415.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Summary
Adds the SQL `query` tool that the LLM uses to read archived sessions, and the multi-step tool-use loop that drives multi-iteration interactions. Closes #415.
## What's in the box

- `llm_tool_query.go` — `QueryTool` struct + `Run(ctx, sql) *QueryToolResult` + `OpenAITool()` schema. Hits ClickHouse via the same HTTP transport the forwarder already uses for inserts, but as the constrained `llm_reader` identity. The forwarder enforces a hard 10k-row / 10MB cap on the response read because ClickHouse 24.8's `max_result_rows` is cooperative for streaming SELECTs (the real server fence is `max_rows_to_read` on the input side).
- `llm_chat.go` — `RunChatTurn(ctx, profile, messages, opts) (*ChatTurnResult, error)`. Implements the OpenAI tool-use loop: the model emits `tool_calls` → the forwarder runs each via the dispatcher → appends `role:"tool"` results → re-invokes → stops when the model returns no tool_calls (the final answer). Iteration cap from the `LLM_MAX_TOOL_CALLS_PER_TURN` env var (default 20); per-iteration timeout 30s, aggregate turn timeout 120s.
- `MakeQueryDispatcher(qt)` — maps the `query` tool name through to QueryTool and rejects unknown names. The same dispatcher pattern will pick up the read-only session control tools (get_session_state, get_failure_settings, …) in #424.

## Notable design choice — tool-budget exhaustion

The initial implementation just synthesized a "budget exhausted" tool-result message and let the loop continue. The pre-commit code reviewer caught that the model could then keep emitting `tool_calls` right up to the iteration cap and never produce text — the wrap-up was aspirational, not enforced.

Fix: when the budget is blown, the forwarder sets `ToolChoice: "none"` on the next request, forcing the model to produce final text. If the model still ignores the directive (some providers don't honor it), we bail out with `StoppedReason="tool_budget"` and surface whatever content arrived. A new body-inspection test (`TestRunChatTurn_ToolBudgetForcesToolChoiceNone`) verifies the gate actually fires by checking the JSON the scripted LLM received.
## Other pre-commit-review fixes

- `finish_reason: "tool_calls"` with an empty `ToolCalls` array (a known provider bug) — returns `StoppedReason="error"` with an "empty assistant message" detail.
- `MarshalForTool`'s overflow path produced invalid JSON (`...\"]}`); replaced with a clean `{"error":"result exceeded N bytes","truncated":true,"elapsed_ms":N}` literal. It should never fire in practice (the 10k-row cap sits upstream), but if it does, the LLM gets parseable data.
- Replaced `'0'+i` (which silently breaks past i=9) with `strconv.Itoa(i)`.
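The first two fixes can be illustrated with a small sketch. `Choice`, `checkAssistant`, and `overflowJSON` are hypothetical names chosen here for illustration; only the guard condition and the replacement JSON literal come from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// Choice is a hypothetical, pared-down response shape.
type Choice struct {
	FinishReason string
	Content      string
	ToolCalls    []string
}

// checkAssistant guards the known provider bug: finish_reason claims
// "tool_calls" but both the content and the tool_calls array are empty.
func checkAssistant(c Choice) error {
	if c.FinishReason == "tool_calls" && c.Content == "" && len(c.ToolCalls) == 0 {
		return errors.New("empty assistant message")
	}
	return nil
}

// overflowJSON replaces the old invalid-JSON truncation tail with a
// parseable literal, per the review fix.
func overflowJSON(limitBytes, elapsedMS int) string {
	return fmt.Sprintf(`{"error":"result exceeded %d bytes","truncated":true,"elapsed_ms":%d}`,
		limitBytes, elapsedMS)
}

func main() {
	fmt.Println(checkAssistant(Choice{FinishReason: "tool_calls"})) // empty assistant message
	fmt.Println(overflowJSON(100, 5))
}
```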
## Test plan

- `go test ./...` — 33/33 pass.
- Budget exhaustion still yields `FinalText` (verified by a behavior test).
- `tool_choice: "none"` actually fires on the wrap-up request (verified by the body-inspection test).
- Context cancellation propagates into QueryTool (sub-second response).

🤖 Generated with Claude Code