
Commit 8a4dde9

nathanaelsee authored and facebook-github-bot committed
move import of VK 4-bit source quantizer into function
Summary: Since this import is at the global scope of the module, it only works when the llama/source_transformation module is used in libraries where the Vulkan dependency is already included. In some cases this dep can fail to be linked, causing export-script failures. Moving the import to where it's actually needed.

Differential Revision: D70268708
1 parent 5b32a80 · commit 8a4dde9

1 file changed (+2, -2 lines)


examples/models/llama/source_transformation/quantize.py

Lines changed: 2 additions & 2 deletions
@@ -14,8 +14,6 @@
 import torch.nn as nn
 import torch.nn.functional as F

-from executorch.backends.vulkan._passes import VkInt4WeightOnlyQuantizer
-
 from executorch.extension.llm.export.builder import DType

 from sentencepiece import SentencePieceProcessor
@@ -180,6 +178,8 @@ def quantize( # noqa C901
         model = gptq_quantizer.quantize(model, inputs)
         return model
     elif qmode == "vulkan_4w":
+        from executorch.backends.vulkan._passes import VkInt4WeightOnlyQuantizer
+
         q_group_size = 256 if group_size is None else group_size
         model = VkInt4WeightOnlyQuantizer(groupsize=q_group_size).quantize(model)
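For context, the fix is the standard deferred-import pattern: an optional dependency is imported inside the only branch that uses it, so the module itself can be imported even when that dependency is not linked. A minimal sketch of the pattern under assumed names (`heavy_backend`, `Int4Quantizer`, and `quantize_model` below are hypothetical illustrations, not ExecuTorch APIs):

def quantize_model(model, qmode):
    """Quantize `model` according to `qmode`.

    The optional backend import is deferred into the branch that
    needs it, so importing this module never requires the backend
    to be present.
    """
    if qmode == "heavy_4w":
        # Hypothetical optional dependency: resolved only when this
        # branch actually runs, so an ImportError surfaces only for
        # callers that request "heavy_4w", not for every importer
        # of this module.
        from heavy_backend import Int4Quantizer

        return Int4Quantizer(groupsize=256).quantize(model)
    raise ValueError(f"Unsupported qmode: {qmode}")

With the import at module scope instead, every importer of the file would pay the dependency cost, which is exactly the export-script failure mode the summary describes.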