feat(forwarder): query SQL tool + OpenAI tool-use loop (#415)#426

Open
jonathaneoliver wants to merge 1 commit into feat/llm-client-414 from feat/llm-query-tool-415

Conversation

@jonathaneoliver

Summary

What's in the box

  • llm_tool_query.go — QueryTool struct + Run(ctx, sql) *QueryToolResult + OpenAITool() schema. Hits ClickHouse via the same HTTP transport the forwarder already uses for inserts, but as the constrained llm_reader identity. The forwarder enforces a hard 10k-row / 10MB cap on the response read because ClickHouse 24.8's max_result_rows is cooperative for streaming SELECTs (the real server fence is max_rows_to_read on the input side).
  • llm_chat.go — RunChatTurn(ctx, profile, messages, opts) (*ChatTurnResult, error). Implements the OpenAI tool-use loop: the model emits tool_calls → the forwarder runs each via the dispatcher → appends role:"tool" results → re-invokes → stops when the model returns no tool_calls (final answer). Iteration cap comes from the LLM_MAX_TOOL_CALLS_PER_TURN env var (default 20); per-iteration timeout is 30s, aggregate turn timeout 120s.
  • MakeQueryDispatcher(qt) — maps the query tool's name through to QueryTool; rejects unknown names. The same dispatcher pattern will pick up the read-only control tools from #424 (forwarder + dashboard: read-only session control tools — get_session_state, get_failure_settings, …).

Notable design choice — tool-budget exhaustion

The initial implementation just synthesized a "budget exhausted" tool-result message and let the loop continue. The pre-commit code reviewer caught that the model could then keep emitting tool_calls right up to the iteration cap and never produce text — the wrap-up was aspirational, not enforced.

Fix: when the budget is blown, the forwarder sets ToolChoice: "none" on the next request, forcing the model to produce final text. If the model still ignores the directive (some providers don't honor it), we bail out with StoppedReason="tool_budget" and surface whatever content arrived. New body-inspection test (TestRunChatTurn_ToolBudgetForcesToolChoiceNone) verifies the gate actually fires by checking the JSON the scripted LLM received.
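The gate itself is small. A sketch, assuming hypothetical names (nextToolChoice, callsUsed); only the tool_choice:"none" wire value and the force-final-text behavior come from the PR:

```go
package main

// ChatRequest shows the relevant wire field from the OpenAI chat-completions
// format; everything else is omitted for brevity.
type ChatRequest struct {
	ToolChoice string `json:"tool_choice,omitempty"`
}

// nextToolChoice: once the per-turn tool budget is spent, force
// tool_choice:"none" so the model must answer in text rather than keep
// emitting tool_calls up to the iteration cap.
func nextToolChoice(callsUsed, maxCalls int) string {
	if callsUsed >= maxCalls {
		return "none"
	}
	return "" // omitted on the wire: the model may still call tools
}
```

If the provider ignores tool_choice:"none", the caller still needs the StoppedReason="tool_budget" bail-out described above — the gate is a request-side nudge, not a guarantee.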

Other pre-commit-review fixes

  • Defensive guard for finish_reason: "tool_calls" with an empty ToolCalls array (known provider bug) — returns StoppedReason="error" with "empty assistant message" detail.
  • MarshalForTool's overflow path produced invalid JSON (...\"]}); replaced with a clean {"error":"result exceeded N bytes","truncated":true,"elapsed_ms":N} literal. Should never fire in practice (10k-row cap upstream), but if it does the LLM gets parseable data.
  • Trimmed several "what" comments per the CLAUDE.md / memory rule. Kept the "why" comments and the cross-issue pointers.
  • Test ID generator switched from '0'+i (silently breaks past i=9) to strconv.Itoa(i).

Test plan

  • go test ./... — 33/33 pass.
  • Tool-budget loop terminates with FinalText after exhaustion (verified by behavior test).
  • tool_choice: "none" actually fires on the wrap-up request (verified by body-inspection test).
  • QueryTool truncates at exactly 10000 rows when ClickHouse returns more.
  • Outer context cancellation propagates into QueryTool (sub-second response).
  • Dispatcher errors round-trip back to the model without stopping the loop.
  • Live-network smoke against a real ClickHouse + real LLM — manual / deploy-time, not in unit-test scope.
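The body-inspection technique used by the tool_choice test generalizes: assert on the JSON the scripted LLM actually received over the wire, not on internal state. A sketch with illustrative helper names (capturedToolChoice, postJSON) — not the PR's test code:

```go
package main

import (
	"bytes"
	"encoding/json"
	"io"
	"net/http"
	"net/http/httptest"
)

// capturedToolChoice stands up a scripted LLM endpoint, records the raw
// request body, and reports the tool_choice field seen on the wire.
func capturedToolChoice(send func(url string)) (string, bool) {
	var body map[string]any
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		raw, _ := io.ReadAll(r.Body)
		_ = json.Unmarshal(raw, &body)
		_, _ = w.Write([]byte(`{}`))
	}))
	defer srv.Close()
	send(srv.URL)
	tc, ok := body["tool_choice"].(string)
	return tc, ok
}

// postJSON stands in for the client under test issuing the wrap-up request.
func postJSON(url, payload string) {
	_, _ = http.Post(url, "application/json", bytes.NewBufferString(payload))
}
```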

🤖 Generated with Claude Code

Adds the SQL `query` tool and the multi-step tool-use loop on top of
the LLM client from #414. The LLM authors SQL via the tool, the
forwarder runs it as the locked-down `llm_reader` ClickHouse user
(#413), and the loop drives multi-iteration interactions where the
model emits tool_calls, the forwarder runs them, appends results as
role:"tool" messages, and re-invokes until the model produces a
final answer.

Files:
  - llm_tool_query.go      — QueryTool struct + Run + OpenAITool schema.
                             Forwarder enforces hard 10k-row / 10MB
                             cap on response read because ClickHouse
                             24.8's max_result_rows is cooperative
                             for streaming SELECTs (real fence is
                             max_rows_to_read on input side).
  - llm_chat.go            — RunChatTurn(ctx, profile, msgs, opts).
                             Iteration cap (LLM_MAX_TOOL_CALLS_PER_TURN
                             env, default 20), per-iteration timeout
                             30s, aggregate turn timeout 120s. When
                             the budget is exhausted, forces
                             ToolChoice="none" on the next request
                             so the model is required to produce a
                             final answer instead of looping
                             indefinitely.
  - llm_chat_test.go       — 11 tests including a body-inspection
                             test that verifies tool_choice="none"
                             actually fires on the wrap-up iteration.
  - llm_tool_query_test.go — 7 tests including 12k-row truncation,
                             ctx cancellation propagation, malformed
                             JSON handling.

Verified: go test ./... 33/33 pass.

Notable design choice — tool budget exhaustion: when the model
exceeds maxCalls tool calls, the forwarder synthesizes a "budget
exhausted" tool-result message AND sets ToolChoice="none" on the
next request. Without ToolChoice="none" the model can keep emitting
tool_calls right up to the iteration cap and never produce text;
with it, the model is forced to give a final answer using what it
has. Defensive guard added for empty assistant messages
(empty content + empty tool_calls + finish_reason="tool_calls" — a
known provider bug).

Part of epic #412.
Closes #415.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Labels

enhancement New feature or request
