Commit 22a9e78
Decompose after export in export_llama (#15951)
Summary:
`unwrap_tensor_subclass` was not unwrapping nested LoRA linears. This meant qdata/scale/zero were bundled together in the subclass and only separated when decompositions ran inside to_edge_transform_and_lower. That happens after nodes are tagged, so the scales were never tagged and remained in the PTE file after the rest of the weights were moved to a PTD file.
It's recommended to move away from `unwrap_tensor_subclass` and rely on export + decompositions instead. This PR adds a decomposition step after export in export_llama and removes several uses of `unwrap_tensor_subclass`.
TODO: remove all cases of `unwrap_tensor_subclass` in ET.
Test Plan:
Add a check that quantized weights land in the PTD file (not the PTE file) after quantization. This is a simple check; nested linears appear to be the real issue that decomposing resolves. TODO: add a test for that (probably an e2e test with stories in a subsequent PR).
```
python -m unittest executorch.backends.xnnpack.test.passes.test_propagate_custom_meta_pass
```
Reviewed By: metascroy
Differential Revision: D87826410
Pulled By: lucylq
File tree (3 files changed, +34 −13):
- backends/xnnpack/test/passes
- examples/models/llama/source_transformation
- extension/llm/export