Commit be8ffd1
[llm] Add generate_from_pos API to LLM runner (#11570)
As titled, this API allows us to support multi-turn conversation by
passing in a `start_pos` argument to `generate_from_pos`.
This pull request introduces a new feature to support text generation
from a specific starting position (`generate_from_pos`) and includes
updates to ensure proper error handling and functionality when
`max_new_tokens` is negative. The changes primarily focus on extending
the `TextLLMRunner` class and its associated methods to accommodate this
new feature while maintaining backward compatibility.
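To make the multi-turn idea concrete, here is a toy sketch of how a caller might thread a position through successive turns. `ToyRunner`, its `kv_cache` vector, the placeholder token `-1`, and the choice to return the next free position are all hypothetical and are not the `TextLLMRunner` API; truncating the cache at `start_pos` is likewise an assumption made only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a runner that owns a KV cache.
// This is NOT the ExecuTorch TextLLMRunner; it only models positions.
struct ToyRunner {
  std::vector<int32_t> kv_cache;  // one entry per cached token position

  // Writes `prompt_tokens` into the cache starting at `start_pos`, then
  // appends `max_new_tokens` placeholder tokens (-1) as "generated" output.
  // Returns the next free cache position, or -1 if `start_pos` is out of range.
  int64_t generate_from_pos(const std::vector<int32_t>& prompt_tokens,
                            int64_t start_pos,
                            int32_t max_new_tokens) {
    if (start_pos < 0 ||
        static_cast<std::size_t>(start_pos) > kv_cache.size()) {
      return -1;
    }
    // Assumption for this sketch: entries at or past start_pos are dropped.
    kv_cache.resize(static_cast<std::size_t>(start_pos));
    for (int32_t token : prompt_tokens) {
      kv_cache.push_back(token);
    }
    for (int32_t i = 0; i < max_new_tokens; ++i) {
      kv_cache.push_back(-1);
    }
    return static_cast<int64_t>(kv_cache.size());
  }
};
```

In this sketch a second turn passes the position returned by the first turn as its `start_pos`, so the earlier turn's cache entries are reused and not recomputed.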
### New Feature: Text Generation from a Specific Starting Position
* **Added `generate_from_pos` Method**: Introduced a new method
`generate_from_pos` in `TextLLMRunner` to allow text generation starting
from a specified position in the KV cache. This includes updates to the
method signature, logic, and error handling.
(`extension/llm/runner/text_llm_runner.cpp`
[[1]](diffhunk://#diff-9b3bd38c0b1ad81b18afab15784634e2b394fda448f5e2dae03de58870751440L76-R78)
[[2]](diffhunk://#diff-9b3bd38c0b1ad81b18afab15784634e2b394fda448f5e2dae03de58870751440R129-R156)
[[3]](diffhunk://#diff-9b3bd38c0b1ad81b18afab15784634e2b394fda448f5e2dae03de58870751440L150-R165)
[[4]](diffhunk://#diff-9b3bd38c0b1ad81b18afab15784634e2b394fda448f5e2dae03de58870751440R219-R225);
`extension/llm/runner/text_llm_runner.h`
[[5]](diffhunk://#diff-d1aa44a87ea9b7ec51250c2002466cb9bd57db153c1c8b58ffdf73e8f231a89bR98-R122))
* **Updated Documentation**: Enhanced method documentation in
`TextLLMRunner` to describe the new functionality, including parameters
like `start_pos` and the expected behavior.
(`extension/llm/runner/text_llm_runner.h`
[[1]](diffhunk://#diff-d1aa44a87ea9b7ec51250c2002466cb9bd57db153c1c8b58ffdf73e8f231a89bL81-R83)
[[2]](diffhunk://#diff-d1aa44a87ea9b7ec51250c2002466cb9bd57db153c1c8b58ffdf73e8f231a89bR98-R122))
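One way an existing entry point could stay backward compatible with the new method is to forward to it with a starting position of 0. The sketch below assumes that arrangement; the description above does not confirm it. `SketchRunner`, its simplified signatures, and the `last_start_pos` field are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical runner with simplified signatures; not the real class.
struct SketchRunner {
  int64_t last_start_pos = -1;  // records the position used by the last call

  // New entry point: generation begins at `start_pos` in the KV cache.
  // Returns 0 on success in this sketch.
  int generate_from_pos(const std::string& prompt, int64_t start_pos) {
    (void)prompt;  // the toy does not actually run a model
    last_start_pos = start_pos;
    return 0;
  }

  // Existing entry point: assumed here to be equivalent to starting at 0,
  // so callers that never pass a position see unchanged behavior.
  int generate(const std::string& prompt) {
    return generate_from_pos(prompt, 0);
  }
};
```

Under this assumption, single-turn callers keep calling `generate`, and only multi-turn callers need to track a position.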
### Error Handling Improvements
* **Validation for `max_new_tokens`**: Added checks to ensure
`max_new_tokens` is positive. If it is not, an `InvalidArgument` error
is returned. This prevents invalid configurations during text
generation. (`extension/llm/runner/text_llm_runner.cpp`
[extension/llm/runner/text_llm_runner.cppR129-R156](diffhunk://#diff-9b3bd38c0b1ad81b18afab15784634e2b394fda448f5e2dae03de58870751440R129-R156))
* **Unit Test for Negative `max_new_tokens`**: Created a new test case
(`GenerateFromPosErrorsWithNegativeMaxNewTokens`) to verify that the
`generate_from_pos` method correctly handles scenarios where
`max_new_tokens` is negative.
(`extension/llm/runner/test/test_text_llm_runner.cpp`
[extension/llm/runner/test/test_text_llm_runner.cppR325-R379](diffhunk://#diff-0a1e69b4182878ccad887c4f4ba3929ef24082a26623e26a871d73f4e6cea503R325-R379))

1 parent 1309849 commit be8ffd1
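The `max_new_tokens` check described in the error-handling notes can be sketched as a small guard. `SketchError` and `validate_max_new_tokens` are hypothetical names, not the library's error type or helpers. Rejecting zero follows the "is positive" wording and is an assumption, since the unit test named in the description only mentions the negative case.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical error codes for this sketch only.
enum class SketchError { Ok, InvalidArgument };

// Returns InvalidArgument when the token budget is not positive,
// so the caller can bail out before any generation work starts.
SketchError validate_max_new_tokens(int32_t max_new_tokens) {
  if (max_new_tokens <= 0) {
    return SketchError::InvalidArgument;
  }
  return SketchError::Ok;
}
```

A test in the spirit of `GenerateFromPosErrorsWithNegativeMaxNewTokens` would pass a negative budget and assert that the invalid-argument code comes back.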
4 files changed, +134 -14 lines changed, in `extension/llm/runner` and its `test` subdirectory.

Diff 1 (file name not preserved): 17 lines added at new lines 124-140, none removed. Diff content not preserved.
Diff 2 (line range matches the `extension/llm/runner/test/test_text_llm_runner.cpp` hunk R325-R379 cited in the description): 55 lines added at new lines 325-379, none removed. Diff content not preserved.
Diff 3 (line ranges match the `extension/llm/runner/text_llm_runner.cpp` hunks cited in the description): 34 lines added and 12 removed, in hunks at new lines 76-78, 129-156, 165, and 219-225. Diff content not preserved.
Diff 4 (line ranges match the `extension/llm/runner/text_llm_runner.h` hunks cited in the description): 28 lines added and 2 removed, in hunks at new lines 81-83 and 98-122. Diff content not preserved.