* Stream responses for OpenAI and OpenRouter
* Default config
* Fix Credo warnings
* Stream read response from Ollama
* README updates and a new function in the LlmComposer module
* More info in the README
* Added a note
* Fixed an example
* A bit more documentation
* Lowered the minimum Finch version
* Updated the minimum Tesla version (because of a fix in Finch)
* Fix in OpenRouter

Co-authored-by: Hector Perez <hecpeare@gmail.com>
- **OpenRouter** offers the most comprehensive feature set, including unique capabilities like fallback models and provider routing
- **Bedrock** support is provided via AWS ExAws integration and requires proper AWS configuration
- **Ollama** requires a running Ollama server instance
- **Function Calls** require the provider to support the OpenAI-compatible function calling format
- **Streaming** is **not** compatible with Tesla **retries** (see the adapter sketch below).
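
Because streaming keeps the HTTP connection open while chunks arrive, retry middleware would have to re-issue the whole request. A minimal sketch of a streaming-friendly Tesla setup, assuming a global adapter configuration (the config shown is standard Tesla, not something specific to this library):

```elixir
# config/config.exs
import Config

# Use Finch as the Tesla adapter; Finch can stream response bodies.
config :tesla, adapter: {Tesla.Adapter.Finch, name: MyApp.Finch}

# `{Finch, name: MyApp.Finch}` must also be started under the application's
# supervision tree. Avoid `Tesla.Middleware.Retry` on clients that stream:
# a retried request cannot resume a partially consumed stream.
```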
## Usage
### Simple Bot Definition
...

```elixir
LlmComposer.Message.new(
  ...
)
```

**Note:** Ollama does not provide token usage information, so `input_tokens` and `output_tokens` will always be empty in debug logs and response metadata. Function calls are also not supported with Ollama.
### Streaming Responses
LlmComposer supports streaming responses for real-time output, which is particularly useful for long-form content generation. This feature works with providers that support streaming (such as Ollama, OpenRouter, and OpenAI).
```elixir
# Make sure to configure the Tesla adapter for streaming (Finch recommended)

# ... (full example elided; see the sketch below) ...

# Example output:
#
#   Once upon a time, in the vast expanse of space, a brave astronaut embarked
#   on a journey to explore distant galaxies. The stars shimmered as the
#   spaceship soared beyond the known universe, uncovering secrets of the
#   cosmos...
#
#   --- Stream complete ---
```
**Note:** The `stream_response: true` setting enables streaming mode, and `parse_stream_response/1` filters and parses the raw stream data into usable content chunks.
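
As a rough sketch of a complete streaming call: the `Settings` struct fields, provider module name, and `simple_chat/2` entry point below are assumptions for illustration; `stream_response: true` and `parse_stream_response/1` are the documented pieces.

```elixir
# Sketch only: module and field names are assumptions, not the verified API.
defmodule StoryStreamer do
  @settings %LlmComposer.Settings{
    provider: LlmComposer.Providers.OpenAI,  # assumed provider module
    provider_opts: [stream_response: true],  # documented streaming switch
    model: "gpt-4o-mini"                     # illustrative model id
  }

  def run do
    {:ok, response} = LlmComposer.simple_chat(@settings, "Tell me a story")

    response.stream
    |> LlmComposer.parse_stream_response()  # documented: parses raw chunks
    |> Enum.each(&IO.write/1)

    IO.puts("\n--- Stream complete ---")
  end
end
```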
**Important:** When reading a streamed chat completion, LlmComposer does not track input/output/cache/thinking tokens. There are two ways to handle token counting in this mode:
1. Calculate tokens client-side with a library such as `tiktoken` for the OpenAI provider (a sketch follows below).
2. Read token data from the last stream object if the provider supplies it (currently only OpenRouter supports this).
With the Ollama provider, tokens are not tracked at all.
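
A sketch of approach 1, assuming the `tiktoken` Hex package; the exact function name and return shape are assumptions, so check the package docs:

```elixir
defmodule TokenCounter do
  # Sketch only: `Tiktoken.encode/2` and its `{:ok, ids}` return shape are
  # assumptions about the `tiktoken` Hex package, not a verified API.
  def count(text, model \\ "gpt-4o-mini") do
    case Tiktoken.encode(model, text) do
      {:ok, token_ids} -> length(token_ids)
      {:error, reason} -> {:error, reason}
    end
  end
end

# Join the streamed chunks first, then count once:
# chunks |> Enum.join() |> TokenCounter.count()
```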
### Using OpenRouter
...

In this example, the bot first calls OpenAI to understand the user's intent and ...
Documentation can be generated with [ExDoc](https://github.com/elixir-lang/ex_doc)
and published on [HexDocs](https://hexdocs.pm). Once published, the docs can
be found at <https://hexdocs.pm/llm_composer>.