Commit 6ddd036
[AWQ] Add activation_hook_target field for custom activation cache hooking (vllm-project#2346)
## Summary
- Adds an optional `activation_hook_target` field to `AWQMapping` that
lets users specify which submodule (relative to the parent/LCA) to hook
for activation caching, replacing the hardcoded `hasattr(parent, 'mlp')`
workaround for MoE models with parallel transformer blocks.
- When `activation_hook_target` is `None` (default), behavior is
unchanged: the hook is placed on `balance_layers[0]`. When set (e.g.
`"mlp"`), it resolves to the corresponding submodule on the parent via
`getattr_chain`.
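
The resolution rule described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `MappingSketch`, `resolve_hook_module`, and the toy `parent` object are hypothetical names, and the local `getattr_chain` is a simplified stand-in for the library helper of the same name, assumed here to resolve a dotted attribute path.

```python
from dataclasses import dataclass
from functools import reduce
from types import SimpleNamespace
from typing import Any, List, Optional


def getattr_chain(obj: Any, chain: str) -> Any:
    """Simplified stand-in: resolve a dotted attribute path such as "mlp.gate_proj"."""
    return reduce(getattr, chain.split("."), obj)


@dataclass
class MappingSketch:
    """Hypothetical, stripped-down stand-in for an AWQ mapping."""

    smooth_layer: str
    balance_layers: List[str]
    # None keeps the default behavior (hook balance_layers[0]).
    activation_hook_target: Optional[str] = None


def resolve_hook_module(
    parent: Any, balance_modules: List[Any], mapping: MappingSketch
) -> Any:
    """Pick the module whose inputs would be cached for this mapping."""
    if mapping.activation_hook_target is None:
        # Default: hook the first balance layer.
        return balance_modules[0]
    # Custom: resolve the target relative to the parent (LCA) module.
    return getattr_chain(parent, mapping.activation_hook_target)


# Toy parent with parallel attention and MLP branches (plain objects, not torch modules).
attn = SimpleNamespace(q_proj=object())
mlp = SimpleNamespace(gate_proj=object())
parent = SimpleNamespace(self_attn=attn, mlp=mlp)
balance_modules = [attn.q_proj, mlp.gate_proj]

default_mapping = MappingSketch("input_layernorm", ["q_proj", "gate_proj"])
custom_mapping = MappingSketch(
    "input_layernorm", ["q_proj", "gate_proj"], activation_hook_target="mlp"
)

default_hook = resolve_hook_module(parent, balance_modules, default_mapping)
custom_hook = resolve_hook_module(parent, balance_modules, custom_mapping)
```

With the default mapping the hook lands on the first balance layer (`attn.q_proj` in this toy setup); with `activation_hook_target="mlp"` it lands on `parent.mlp` instead.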
## Motivation
In parallel transformer architectures, attention and MLP run in parallel
from the same input. The existing code always hooks `balance_layers[0]`
for activation caching, which captures the wrong activations when
balance layers span both attention and MLP branches. There was a
commented-out `hasattr(parent, 'mlp')` workaround, but it was brittle
and not generalizable. This change makes the hook target explicitly
configurable per mapping.
## Test
I've tested this change with our internal models, and it aligns with
previous results.
---------
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
Co-authored-by: HDCharles <39544797+HDCharles@users.noreply.github.com>

1 parent b0463d1 commit 6ddd036
2 files changed, +68 -9 lines changed