Add an example and type enhancement for TextStreamer (#1066)
* typing: GenerationConfig option for TextStreamer
* docs: streaming example with following the style
* docs: streaming description from @xenova's suggestion
Co-authored-by: Joshua Lochner <[email protected]>
* fix: streaming example from @xenova's suggestion
Co-authored-by: Joshua Lochner <[email protected]>
* fix: <pre> tag by wrapping it in a <detail> tag
* fix: remove newlines for proper rendering
---------
Co-authored-by: Joshua Lochner <[email protected]>
docs/source/pipelines.md (64 additions, 0 deletions)
@@ -148,6 +148,70 @@ Cheddar is my go-to for any occasion or mood;
It adds depth and richness without being overpowering its taste buds alone
```

### Streaming

Some pipelines, such as `text-generation` and `automatic-speech-recognition`, support streaming output through the `TextStreamer` class. For example, when using a chat model like `Qwen2.5-Coder-0.5B-Instruct`, you can specify a callback function that is called with the text of each newly generated token (if unset, new tokens are printed to the console).
- **Base Case**: If the array has less than or equal to one element (i.e., `len(arr)` is less than or equal to `1`), it is already sorted and can be returned as is.
- **Pivot Selection**: The pivot is chosen as the middle element of the array.
- **Partitioning**: The array is partitioned into three parts: elements less than the pivot (`left`), elements equal to the pivot (`middle`), and elements greater than the pivot (`right`). These partitions are then recursively sorted.
- **Recursive Sorting**: The subarrays are sorted recursively using `quick_sort`.
This approach ensures that each recursive call reduces the problem size by half until it reaches a base case.
</pre>
</details>

This streaming feature allows you to process the output as it is generated, rather than waiting for the entire output to be generated before processing it.
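
To make the callback idea concrete, here is a minimal sketch of the pattern in plain JavaScript. It does not use Transformers.js: `simulateStreaming` and `createCollector` are hypothetical helpers introduced only for illustration, and the hard-coded chunks stand in for tokens that a real model would produce. In the library itself, a callback of this shape would be supplied to `TextStreamer`; see the API reference for the exact options.

```javascript
// Hypothetical stand-in for a streaming generator; NOT part of Transformers.js.
// It calls `onText` once per chunk, in order, and returns the full text.
function simulateStreaming(chunks, onText) {
  let full = "";
  for (const chunk of chunks) {
    if (onText) {
      onText(chunk); // hand each piece to the caller as soon as it exists
    } else {
      console.log(chunk); // assumed fallback, loosely mirroring "printed to the console"
    }
    full += chunk;
  }
  return full;
}

// Hypothetical helper: collects streamed chunks so that the partial
// output can be read at any point while generation is still running.
function createCollector() {
  const parts = [];
  return {
    callback: (text) => {
      parts.push(text);
    },
    partial: () => parts.join(""),
    count: () => parts.length,
  };
}

const collector = createCollector();
const finalText = simulateStreaming(
  ["def ", "quick_sort", "(arr):"],
  collector.callback,
);
```

The collector receives every chunk before the final text is returned, which is what lets a UI render partial output while the remaining tokens are still being generated.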

For more information on the available options for each pipeline, refer to the [API Reference](./api/pipelines).
If you would like more control over the inference process, you can use the [`AutoModel`](./api/models), [`AutoTokenizer`](./api/tokenizers), or [`AutoProcessor`](./api/processors) classes instead.