Commit 2e469aa
[Bugfix] Fix circular references when activation offload device is cuda (#2387)
## Background ##
#2366 introduced a `WeakKeyDictionary` layer which caches shared
tensors. This is a good approach, but it has an edge case: if the value of
an entry is the same object as its key, the dictionary's strong reference
to the value keeps the key alive, so the key is never garbage collected.
This can occur if the user specifies `sequential_offload_device="cuda"`,
or if the AWQ offload device is "cuda" (the default in most cases).
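
The edge case can be reproduced without any GPU. The following is a minimal sketch, not the project's code: `FakeTensor` and `key_is_collected` are hypothetical names, and a plain object stands in for a tensor. It relies only on the documented behavior that `weakref.WeakKeyDictionary` holds keys weakly and values strongly.

```python
import gc
import weakref


class FakeTensor:
    """Hypothetical stand-in for a tensor; plain objects are weak-referenceable."""


def key_is_collected(value_is_key: bool) -> bool:
    """Cache one entry, drop our own reference, report whether the key was freed."""
    cache = weakref.WeakKeyDictionary()
    key = FakeTensor()
    probe = weakref.ref(key)  # lets us observe the key without keeping it alive

    # Values are held strongly; only keys are held weakly.
    cache[key] = key if value_is_key else FakeTensor()

    del key
    gc.collect()
    return probe() is None  # evaluated while `cache` is still alive


print(key_is_collected(value_is_key=False))  # True: the entry is dropped
print(key_is_collected(value_is_key=True))   # False: the value pins its own key
```

When the value is the key itself, the entry can only disappear once the whole dictionary is destroyed, which is presumably why memory grew when the offload device matched the device the tensors were already on.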
## Purpose ##
* Fix memory leak in AWQ which led to very high CUDA memory usage
## Changes ##
* Guard against entries into the `WeakKeyDictionary` where the key and
value are identical
* Misc
  * Move `OverrideEqMode` to the bottom of `pipelines/cache.py`
  * Remove `_fp16_baseline_cache`, which was not being used
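
The guard described in the first bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `pipelines/cache.py`: `SharedCache`, `offload`, and the `move` callback are hypothetical names, and `FakeTensor` stands in for a real tensor.

```python
import gc
import weakref


class FakeTensor:
    """Hypothetical stand-in for a tensor."""


class SharedCache:
    """Hypothetical cache that maps an original object to its offloaded copy."""

    def __init__(self):
        self._offloaded = weakref.WeakKeyDictionary()

    def offload(self, tensor, move):
        """Return the offloaded copy of `tensor`, reusing a cached copy if present."""
        if tensor in self._offloaded:
            return self._offloaded[tensor]

        moved = move(tensor)
        # Guard: if the move was a no-op, caching would make the value pin its
        # own key, so the entry is skipped and the object is returned directly.
        if moved is not tensor:
            self._offloaded[tensor] = moved
        return moved


cache = SharedCache()
t = FakeTensor()
probe = weakref.ref(t)

cache.offload(t, move=lambda x: x)  # no-op move, e.g. already on the target device
del t
gc.collect()
print(probe() is None)  # True: nothing in the cache kept the object alive
```

Skipping the entry costs nothing in this sketch, because a no-op move produces no separate copy that would need to be shared.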
## Testing ##
| Before Changes | After Changes |
| - | - |
| <img width="640" height="480" alt="awq_before" src="https://github.com/user-attachments/assets/07714321-4b2f-49b7-aa2b-5c745a60d2f4" /> | <img width="640" height="480" alt="awq_after" src="https://github.com/user-attachments/assets/336b0e98-c24c-4e0c-a873-3166effc32b7" /> |
---------
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: HDCharles <39544797+HDCharles@users.noreply.github.com>

1 parent 9979e98 commit 2e469aa
2 files changed, 33 additions and 36 deletions