Commit fb29875
ssjia
Update on "[ET-VK] Enable IntxWeightOnlyConfig"
## Motivation
Allow Vulkan lowering to be tested via optimum-executorch.
## Context
Similar to the PR below: Int4 weight-only quantization is currently enabled in Vulkan via a custom source transform quantizer that replaces linear layers with a custom linear module, which in turn calls a custom weight-only quantized linear op.
This diff adds a fusion pattern for weight-only quantized linear so that no Vulkan-specific source transforms need to be applied.
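For context, the existing source-transform approach amounts to module swapping along the lines of the sketch below. This is an illustrative approximation, not the actual Vulkan quantizer: the class and function names are hypothetical, and the forward pass is left as a plain float linear rather than the real custom weight-only quantized op.

```python
import torch
import torch.nn as nn

class WeightOnlyQuantizedLinear(nn.Module):
    """Hypothetical stand-in for the custom linear module; the real version
    would store an int4-packed weight and call a custom quantized linear op."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        # The real transform would quantize and pack the weight here.
        self.weight = linear.weight
        self.bias = linear.bias

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Placeholder: the real module dispatches to the custom weight-only
        # quantized linear op instead of a plain float linear.
        return nn.functional.linear(x, self.weight, self.bias)

def apply_source_transform(model: nn.Module) -> nn.Module:
    """Recursively replace every nn.Linear with the custom module."""
    for name, child in model.named_children():
        if isinstance(child, nn.Linear):
            setattr(model, name, WeightOnlyQuantizedLinear(child))
        else:
            apply_source_transform(child)
    return model
```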
## Changes
* Introduce a fusable graph pattern for weight-only quantized linear (see the sketch after this list)
* Add fusion logic for weight-only quantized linear in the fuse patterns pass
* Add `4w` qmode to the export llama script
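As a rough illustration of what the fusion logic matches, the sketch below walks a torch.fx graph and collects linear nodes whose weight argument is produced by a dequantize op, i.e. the weight-only quantized linear pattern. This is a simplified assumption of the pattern, not the actual ExecuTorch pass; the function name and the string-based dequantize check are illustrative only.

```python
import torch

def find_weight_only_quantized_linears(gm: torch.fx.GraphModule):
    """Collect (linear_node, dequant_node) pairs that a fusion pass could
    replace with a single backend-specific weight-only quantized linear op."""
    matches = []
    for node in gm.graph.nodes:
        # Match aten.linear calls...
        if node.op != "call_function" or node.target != torch.ops.aten.linear.default:
            continue
        weight = node.args[1] if len(node.args) > 1 else None
        # ...whose weight comes from a dequantize op (weight-only quantization).
        if (
            isinstance(weight, torch.fx.Node)
            and weight.op == "call_function"
            and "dequantize" in str(weight.target)
        ):
            matches.append((node, weight))
    return matches
```

A real fusion pass would then replace each matched pair with a call to the custom Vulkan weight-only quantized linear operator, passing the still-quantized weight and its quantization parameters directly.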
Differential Revision: [D80293302](https://our.internmc.facebook.com/intern/diff/D80293302/)
[ghstack-poisoned]