
Commit 8a4dde9

nathanaelsee authored and facebook-github-bot committed
move import of VK 4-bit source quantizer into function
Summary: Since this import is at the global scope of the module, it only works when the llama/source_transformation module is used in libraries where the Vulkan dependency is already included. In some cases this dep can fail to be linked, causing export-script failures. Moving the import to where it's actually needed.

Differential Revision: D70268708
1 parent 5b32a80 · commit 8a4dde9

1 file changed (+2, -2 lines)


examples/models/llama/source_transformation/quantize.py

Lines changed: 2 additions & 2 deletions
@@ -14,8 +14,6 @@
 import torch.nn as nn
 import torch.nn.functional as F

-from executorch.backends.vulkan._passes import VkInt4WeightOnlyQuantizer
-
 from executorch.extension.llm.export.builder import DType

 from sentencepiece import SentencePieceProcessor
@@ -180,6 +178,8 @@ def quantize( # noqa C901
         model = gptq_quantizer.quantize(model, inputs)
         return model
     elif qmode == "vulkan_4w":
+        from executorch.backends.vulkan._passes import VkInt4WeightOnlyQuantizer
+
         q_group_size = 256 if group_size is None else group_size
         model = VkInt4WeightOnlyQuantizer(groupsize=q_group_size).quantize(model)
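For context, the fix is the standard deferred-import pattern: an optional dependency is imported inside the only branch that uses it, so the module itself can be imported even when that dependency is not linked. A minimal sketch of the pattern under assumed names (`heavy_backend`, `Int4Quantizer`, and `quantize_model` below are hypothetical illustrations, not ExecuTorch APIs):

def quantize_model(model, qmode):
    """Quantize `model` according to `qmode`.

    The optional backend import is deferred into the branch that
    needs it, so importing this module never requires the backend
    to be present.
    """
    if qmode == "heavy_4w":
        # Hypothetical optional dependency: resolved only when this
        # branch actually runs, so an ImportError surfaces only for
        # callers that request "heavy_4w", not for every importer
        # of this module.
        from heavy_backend import Int4Quantizer

        return Int4Quantizer(groupsize=256).quantize(model)
    raise ValueError(f"Unsupported qmode: {qmode}")

With the import at module scope instead, every importer of the file would pay the dependency cost, which is exactly the export-script failure mode the summary describes.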