Commit 881dd46
[Bugfix] Reduce device movement while checking layer divisibility (#2385)
## Purpose ##
* Improve runtime and memory usage by checking the shape of the
offloaded weight, not the onloaded weight
## Changes ##
* Wrap all calls to `_layer_indivisible` with the `disable_onloading`
context
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
1 parent 70b610a
File tree
1 file changed: 14 additions and 11 deletions
- src/llmcompressor/modifiers/quantization
Diff hunks (line content not captured):
* Hunk 1, original lines 25-35: one line added at new line 28, one line removed at original line 32
* Hunk 2, original lines 77-92: original lines 80-89 (10 lines) replaced by new lines 80-92 (13 lines)