
Commit a76485f

nathanaelsee authored and facebook-github-bot committed
move import of VK 4-bit source quantizer into function (#8744)
Summary: Since this import is at the global scope of the module, it only works when the llama/source_transformation module is used in libraries where the Vulkan dependency is already included. In some cases this dependency can fail to be linked, causing export script failures. Move the import to where it's actually needed.

Reviewed By: derekxu, SS-JIA

Differential Revision: D70268708
1 parent 1caa0a1

File tree

1 file changed: +2 −2 lines changed


examples/models/llama/source_transformation/quantize.py

Lines changed: 2 additions & 2 deletions
@@ -14,8 +14,6 @@
 import torch.nn as nn
 import torch.nn.functional as F
 
-from executorch.backends.vulkan._passes import VkInt4WeightOnlyQuantizer
-
 from executorch.extension.llm.export.builder import DType
 
 from sentencepiece import SentencePieceProcessor
@@ -180,6 +178,8 @@ def quantize(  # noqa C901
         model = gptq_quantizer.quantize(model, inputs)
         return model
     elif qmode == "vulkan_4w":
+        from executorch.backends.vulkan._passes import VkInt4WeightOnlyQuantizer
+
         q_group_size = 256 if group_size is None else group_size
         model = VkInt4WeightOnlyQuantizer(groupsize=q_group_size).quantize(model)
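For context, here is a minimal sketch of the deferred-import pattern this commit applies. The module name heavy_backend and the class HeavyQuantizer are hypothetical stand-ins for the real executorch.backends.vulkan._passes dependency:

# Hypothetical names: heavy_backend and HeavyQuantizer stand in for
# the real Vulkan quantizer dependency in this commit.

# Module-scope import (before): every consumer of this module must be
# able to resolve heavy_backend at import time, even if it never uses
# the corresponding qmode.
#
#     from heavy_backend import HeavyQuantizer

def quantize(model, qmode, group_size=None):
    if qmode == "heavy_4w":
        # Function-scope import (after): the dependency is resolved only
        # when this branch actually runs, so callers that never request
        # this mode never need the dependency linked.
        from heavy_backend import HeavyQuantizer

        q_group_size = 256 if group_size is None else group_size
        return HeavyQuantizer(groupsize=q_group_size).quantize(model)
    return model

The trade-off is that a missing dependency now surfaces as an ImportError at call time rather than at module import, which is the intended behavior here since the Vulkan backend is optional for most users of this module.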

0 commit comments
