Commit 353f556
Feature/intermediates cache prefetch (#2392)
Optional prefetch was added to the intermediates cache and wired into
AWQ when offloading.
**IntermediatesCache**
New method iter_prefetch() iterates over batches like iter() but
prefetches the next batch in a background thread so onload from the
offload device overlaps with use of the current batch, reducing
wall‑clock time when offloading to CPU.
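
The overlap described above can be illustrated with a minimal, generic sketch. This is not the llm-compressor implementation: the free function form, the `fetch` callable (standing in for "onload batch i from the offload device"), and the `num_batches` parameter are all hypothetical names chosen for illustration. It assumes a single worker thread is enough to keep one batch in flight.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def iter_prefetch(fetch: Callable[[int], T], num_batches: int) -> Iterator[T]:
    """Yield fetch(0), ..., fetch(num_batches - 1) in order, loading batch
    i + 1 on a worker thread while the caller is still using batch i.

    `fetch` and `num_batches` are hypothetical stand-ins, not library API.
    """
    if num_batches <= 0:
        return  # empty cache: yield nothing
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fetch, 0)
        for next_index in range(1, num_batches):
            current = future.result()  # wait for the batch loaded in the background
            future = pool.submit(fetch, next_index)  # start loading the next batch
            yield current  # caller works on `current` while the next one loads
        yield future.result()  # last batch has nothing left to prefetch
```

Only one batch is fetched ahead at a time, so at most two batches are resident on the target device at once; that bounds the extra memory cost of the overlap.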
**AWQ**
When offload_device is set, _run_samples() uses cache.iter_prefetch()
instead of the cache iterator so CPU→device onload overlaps with the
forward pass over cached parent args during smoothing.
**Tests**
Two tests were added: one checks that prefetch yields the same batches
as iter(), and the other checks that prefetch on an empty cache yields
nothing. No new public API; prefetch is used automatically when AWQ offloads.
Fix: #2374
---------
Signed-off-by: Avishek Goswami <avishek.goswami@ibm.com>
Co-authored-by: Avishek Goswami <avishek.goswami@ibm.com>
Co-authored-by: HDCharles <39544797+HDCharles@users.noreply.github.com>
1 parent 822668a commit 353f556
File tree
5 files changed, +114 -27 lines changed
- src/llmcompressor
  - core
  - modifiers/awq
  - pipelines
    - sequential
- tests/llmcompressor/pipelines