
Stream responses #29

Merged
sonic182 merged 14 commits into master from feature/stream_responses
Jul 31, 2025

Conversation

@sonic182
Member

sonic182 commented Jul 29, 2025

Stream responses from providers

Requires the Finch adapter; other Tesla adapters may also work.

Supported providers for now (a usage sketch follows the list):

  • OpenAI
  • OpenRouter
  • Ollama
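
A hypothetical usage sketch of what streaming looks like from the caller's side (the `MyApp.LLM` module, the `:stream` option, and the chunk shape are all illustrative placeholders, not this library's actual API):

```elixir
# Hypothetical sketch -- MyApp.LLM, the :stream option, and the chunk shape
# are illustrative placeholders, not this library's actual API.
# Assumes the client is configured with the Finch-backed Tesla adapter.
{:ok, stream} =
  MyApp.LLM.chat(
    provider: :openai,
    messages: [%{role: "user", content: "Hello!"}],
    stream: true
  )

# Print each chunk as it arrives; enumeration halts on the first error.
stream
|> Stream.each(&IO.write/1)
|> Stream.run()
```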

Contributor

@dmoralesl left a comment


The implementation looks clean and solid, but I have a couple of questions about how this works:

  • Are the stream option and the structured_output feature compatible? How can they work together? If they are not, how does this feature help the AIAssistant provide real-time outputs?
  • What happens when a batch of the streaming response fails? Will the response have a missing part, or will the whole request fail? Is the case I'm describing even possible, or do the providers (Ollama, OpenRouter...) already handle it for us?

@sonic182
Member Author

  • Are the stream option and the structured_output feature compatible? How can they work together? If they are not, how does this feature help the AIAssistant provide real-time outputs?

Yes, it is compatible. It is a bit tricky to handle, but it is allowed (see https://openrouter.ai/docs/features/structured-outputs#streaming-with-structured-outputs).
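
A minimal sketch of why it is tricky (the chunk values are a stand-in for a provider's streamed deltas; Jason is an assumed JSON dependency): with structured outputs, each delta is a fragment of one JSON document, so nothing can be decoded until the stream completes.

```elixir
# The streamed deltas are fragments of one JSON document, so decoding can
# only happen after the stream finishes. Jason is an assumed dependency.
chunks = ["{\"answ", "er\": 4", "2}"]   # stand-in for the provider's deltas

decoded =
  chunks
  |> Enum.join()
  |> Jason.decode!()

IO.inspect(decoded)                      # => %{"answer" => 42}
```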

  • What happens when a batch of the streaming response fails? Will the response have a missing part, or will the whole request fail? Is the case I'm describing even possible, or do the providers (Ollama, OpenRouter...) already handle it for us?

I don't understand "batch of the streaming response". Do you mean that the connection may get lost while receiving data? If so, the chat will keep the text it managed to read.

@dmoralesl
Contributor

I don't understand "batch of the streaming response". Do you mean that the connection may get lost while receiving data? If so, the chat will keep the text it managed to read.

I mean, each "batch", "iteration", "part" (call it what you like) of the stream contains a piece of the response. Let's imagine the response is delivered in 3 iterations, ["This is", "a full", "response"], and the second one fails for some reason. Will the whole stream fail, or will the final response be "This is response"?
Maybe it is a dumb question; in that case just ignore this.

@sonic182
Member Author

I don't understand "batch of the streaming response". Do you mean that the connection may get lost while receiving data? If so, the chat will keep the text it managed to read.

I mean, each "batch", "iteration", "part" (call it what you like) of the stream contains a piece of the response. Let's imagine the response is delivered in 3 iterations, ["This is", "a full", "response"], and the second one fails for some reason. Will the whole stream fail, or will the final response be "This is response"? Maybe it is a dumb question; in that case just ignore this.

Whenever it fails, it will stop; the case you describe is not possible.
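
In plain Elixir terms, a toy illustration of that semantics (not this library's code): a failure mid-enumeration aborts the whole pipeline, so the consumer keeps whatever arrived before the failing chunk and can never end up with a hole in the middle of the text.

```elixir
# Toy illustration: the second "chunk" raises, enumeration stops there, and
# only the text accumulated before the failure survives.
chunks = ["This is ", :boom, "a full response"]

result =
  try do
    Enum.reduce(chunks, "", fn
      chunk, acc when is_binary(chunk) -> acc <> chunk
      :boom, _acc -> raise "chunk failed"
    end)
  rescue
    _ -> :aborted
  end

IO.inspect(result) # => :aborted -- "This is response" can never be produced
```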

dmoralesl self-requested a review July 29, 2025 15:12
Contributor

@mmacia left a comment


You're going to need to upgrade the Tesla dependency because of this PR I submitted to Tesla: elixir-tesla/tesla#767

@sonic182
Member Author

sonic182 commented Jul 29, 2025

You're going to need to upgrade the Tesla dependency because of this PR I submitted to Tesla: elixir-tesla/tesla#767

This only happens if the provider is on HTTP/2; OK, that's interesting to know.

Library users could enforce HTTP/1.1 (https://hexdocs.pm/finch/Finch.html#start_link/1-pool-configuration-options) or just pin a newer Tesla release with your fix whenever it arrives.
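
Per the Finch docs linked above, enforcing HTTP/1 looks roughly like this (the pool name `MyApp.Finch` is illustrative, and the option spelling depends on the Finch version: recent releases use `:protocols`, older ones `:protocol`):

```elixir
# Start Finch with HTTP/1-only pools so provider connections avoid the
# HTTP/2 streaming path. Option name varies by Finch version
# (:protocols in recent releases, :protocol in older ones).
children = [
  {Finch,
   name: MyApp.Finch,
   pools: %{
     :default => [protocols: [:http1]]
   }}
]

Supervisor.start_link(children, strategy: :one_for_one)
```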

@sonic182
Member Author

You're going to need to upgrade the Tesla dependency because of this PR I submitted to Tesla: elixir-tesla/tesla#767

I updated the minimum Tesla version in mix.exs.
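
For reference, the shape of that change in mix.exs (the version constraints below are placeholders, not the actual minimums pinned by this PR):

```elixir
defp deps do
  [
    # Placeholder constraints -- check this PR's mix.exs diff for the real
    # minimums; the point is to pull in a Tesla release containing the fix.
    {:tesla, "~> 1.14"},
    {:finch, "~> 0.19"}
  ]
end
```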

mmacia self-requested a review July 29, 2025 15:35
Contributor

@hectorperez left a comment


Awesome, Johanderson!

Two minor suggestions below and 🚀

sonic182 and others added 2 commits July 31, 2025 09:36
Co-authored-by: Hector Perez <hecpeare@gmail.com>
Co-authored-by: Hector Perez <hecpeare@gmail.com>
sonic182 merged commit 2c219e0 into master Jul 31, 2025
6 checks passed
sonic182 deleted the feature/stream_responses branch July 31, 2025 07:42
