Commit 817c459
Remove layer norm from the default quantizer, add one that has it
Summary:
Layer norm does not perform well in quantized mode, and it currently uses a split scheme (weights are quantized, activations are not). In most cases it is actually much faster to keep it in fp32, so this diff removes it from the default quantizer.
We add a CadenceWithLayerNormQuantizer for easy access to the previous behavior, which can still be useful in some cases (mostly when quantizing layer norm helps extend the quantized liveness).
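Opting back into the old behavior might look roughly like the sketch below. The import path, the name of the default quantizer class, and the pick_quantizer helper are assumptions for illustration; only CadenceWithLayerNormQuantizer is named by this diff.

```python
# Hypothetical sketch -- the import path and CadenceDefaultQuantizer are assumed;
# only CadenceWithLayerNormQuantizer is introduced by this diff.
from executorch.backends.cadence.aot.quantizer.quantizer import (
    CadenceDefaultQuantizer,        # assumed name of the default (layer-norm-free) quantizer
    CadenceWithLayerNormQuantizer,  # opt-in quantizer that also quantizes layer norm
)


def pick_quantizer(quantize_layer_norm: bool = False):
    """Pick the default quantizer (layer norm stays fp32) or the opt-in
    variant, e.g. when quantizing layer norm extends quantized liveness."""
    if quantize_layer_norm:
        return CadenceWithLayerNormQuantizer()
    return CadenceDefaultQuantizer()
```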
Differential Revision: D729417901
Parent: 82ff404
1 file changed: +13 −2 lines

[Diff table not captured in this export; the surviving line markers indicate one deletion near original line 196, a block of additions at diff lines 238–249, and one line rewritten at 241→252.]