[FEATURE] Option to suppress intermediate text chunks during tool call streaming rounds #725

@Halvanhelv

Description

Scope check

  • This is core LLM communication (not application logic)
  • This benefits most users (not just my use case)
  • This can't be solved in application code with current RubyLLM (see workaround limitations below)
  • I read the Contributing Guide

Due diligence

  • I searched existing issues
  • I checked the documentation

What problem does this solve?

When using streaming with tools, Claude models (and sometimes others) generate intermediate "thinking" text before calling a tool. For example:

chat = RubyLLM.chat(model: 'claude-opus-4-6').with_tool(MyTool)

chat.ask("What's in this document?") do |chunk|
  print chunk.content if chunk.content
end

Output:

Let me retrieve the content of the document.    ← intermediate text (unwanted)
[tool executes]
Based on the document, here is the summary...  ← final answer (wanted)

The streaming block receives chunks from all rounds — including the intermediate text from rounds that end with tool calls. The Streaming with Tools docs describe this as expected behavior.

In a web application (Rails + Turbo Streams), this intermediate text gets broadcast to the browser. For chat UIs, users see "Let me retrieve..." flash on screen before the final answer.

Workaround limitations

One workaround is to use the on_tool_call callback to detect the tool call and reset the client-side stream:

chat.on_tool_call { stream_handler.reset }

chat.ask(messages) do |chunk|
  stream_handler.call(chunk.content) if chunk.content
end

This has inherent limitations:

  • Flicker: on_tool_call fires after streaming completes for that round, so intermediate text is already broadcast to the client before the reset
  • No way to distinguish rounds: the streaming block has no context about which round it's in
  • on_end_message is post-hoc: by the time it fires with response.tool_call?, the text has already been yielded through the block

LiteLLM has the same issue.
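The least-bad version of this workaround is to buffer each round's chunks and only flush rounds that do not end in a tool call. A minimal sketch of that buffering logic, in plain Ruby (the RoundBuffer class and its chunk/end_round methods are illustrative names, not RubyLLM API; the wiring via the streaming block and on_end_message follows the snippets above):

```ruby
# Buffer each streaming round; flush to the sink only when the round
# ends WITHOUT a tool call. This avoids broadcasting intermediate text,
# but the final answer no longer streams incrementally -- it appears all
# at once when its round completes, which defeats much of the point of
# streaming. That tradeoff is why this can't be fixed in app code.
class RoundBuffer
  def initialize(&sink)
    @sink = sink
    @buffer = +''
  end

  # Called from the streaming block for every chunk in the current round.
  def chunk(text)
    @buffer << text if text
  end

  # Called when a round completes (e.g. from on_end_message).
  # Flush only for rounds that did not end in a tool call.
  def end_round(tool_call:)
    @sink.call(@buffer) unless tool_call
    @buffer = +''
  end
end
```

Hypothetical wiring: `chat.ask(messages) { |chunk| buffer.chunk(chunk.content) }` plus `chat.on_end_message { |msg| buffer.end_round(tool_call: msg.tool_call?) }`.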

Why this belongs in RubyLLM

  • This is a streaming behavior concern, not application logic
  • Every developer building a chat UI with tools faces this problem
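One possible shape for the feature (purely illustrative; `suppress_intermediate:` is a hypothetical option name, not existing RubyLLM API):

```ruby
chat = RubyLLM.chat(model: 'claude-opus-4-6').with_tool(MyTool)

# Hypothetical flag: drop chunks from rounds that end in a tool call,
# so the streaming block only ever receives the final round's text.
chat.ask("What's in this document?", suppress_intermediate: true) do |chunk|
  print chunk.content if chunk.content
end
```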
