Conversation
dmoralesl
left a comment
The implementation looks clean. But I have a couple of questions about how this works:
- Are the `stream` option and the `structured_output` feature compatible? How do they work together? If they are not, how does this feature help the AIAssistant provide real-time outputs?
- What happens when a batch of the streaming response fails? Will the response have a missing part, or will the whole request fail? Is the case I'm describing possible, or do the providers (Ollama, OpenRouter...) already handle it for us?
Yes, it is compatible. It is a bit tricky to handle, but it is allowed (see https://openrouter.ai/docs/features/structured-outputs#streaming-with-structured-outputs).
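The general idea is that with structured outputs the streamed chunks are fragments of a single JSON document, so a client has to buffer them and only decode once the stream completes. A minimal sketch (module and function names are illustrative, not from this PR):

```elixir
defmodule StructuredStreamSketch do
  # `chunks` is any enumerable of binary fragments emitted by the stream.
  def collect_and_decode(chunks) do
    chunks
    |> Enum.join()       # concatenate the JSON fragments in arrival order
    |> Jason.decode()    # the JSON is only valid once the stream has finished
  end
end
```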
I don't understand the "batch of streaming response" part. Do you mean that the connection may get lost while receiving data? If so, the chat will keep the text it managed to read.
I mean that each "batch", "iteration", or "part" (call it what you like) of the stream contains a portion of the response. Let's imagine the response is delivered in 3 iterations.
Whenever it fails, it will stop, so the case you describe is not possible.
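In other words, each chunk is handed to the chat as it arrives, and a mid-stream failure simply stops iteration. An illustrative sketch (`ChatUI.append/1` is a hypothetical UI call, and `stream` stands for the provider's response stream):

```elixir
try do
  stream
  |> Stream.each(fn chunk -> ChatUI.append(chunk) end)
  |> Stream.run()
rescue
  # e.g. a transport error from the adapter; the text already appended
  # to the chat stays visible, nothing is silently skipped
  _error -> :stream_interrupted
end
```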
mmacia
left a comment
You're going to have to upgrade the Tesla dep because of this PR I submitted to Tesla: elixir-tesla/tesla#767
This happens if the provider is on HTTP/2; OK, that is interesting to know. Library users could enforce HTTP/1.1 (https://hexdocs.pm/finch/Finch.html#start_link/1-pool-configuration-options) or just specify a newer Finch version with your fix whenever it arrives.
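For reference, a sketch of pinning a Finch pool to HTTP/1, per the Finch docs linked above (`MyFinch` is an assumed name; depending on the Finch release the pool option is `:protocols` in newer versions or `:protocol` in older ones):

```elixir
children = [
  {Finch,
   name: MyFinch,
   pools: %{
     # force HTTP/1 for all hosts handled by the default pool
     default: [protocols: [:http1]]
   }}
]
```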
I did update the minimum Tesla version in mix.exs.
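Something along these lines in mix.exs (the version constraints here are illustrative, not the exact ones from this PR; the minimum should be whichever Tesla release includes elixir-tesla/tesla#767):

```elixir
defp deps do
  [
    {:tesla, "~> 1.11"},  # minimum bumped to pull in the streaming fix
    {:finch, "~> 0.18"}
  ]
end
```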
hectorperez
left a comment
Awesome Johanderson!
Two minor suggestions below and 🚀
Co-authored-by: Hector Perez <hecpeare@gmail.com>
Stream responses from providers
Requires the use of the Finch adapter; other Tesla adapters may work. For now:
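A usage sketch of the required adapter setup (`MyApp.Finch` and `MyApp.LLMClient` are assumed names): a client module points Tesla at a running Finch instance like this.

```elixir
defmodule MyApp.LLMClient do
  use Tesla

  # streaming requires the Finch adapter; :name must match a started
  # Finch instance in the application's supervision tree
  adapter Tesla.Adapter.Finch, name: MyApp.Finch
end
```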