
Commit a76485f

nathanaelsee authored and facebook-github-bot committed
move import of VK 4-bit source quantizer into function (#8744)
Summary: Since this import is at the global scope of the module, it only works when the llama/source_transformation module is used in libraries where the Vulkan dependency is already included. In some cases this dependency can fail to be linked, causing export script failures. Move the import to where it's actually needed.

Reviewed By: derekxu, SS-JIA

Differential Revision: D70268708
1 parent 1caa0a1

File tree

1 file changed: +2 −2 lines changed


examples/models/llama/source_transformation/quantize.py

Lines changed: 2 additions & 2 deletions
@@ -14,8 +14,6 @@
 import torch.nn as nn
 import torch.nn.functional as F
 
-from executorch.backends.vulkan._passes import VkInt4WeightOnlyQuantizer
-
 from executorch.extension.llm.export.builder import DType
 
 from sentencepiece import SentencePieceProcessor
@@ -180,6 +178,8 @@ def quantize(  # noqa C901
         model = gptq_quantizer.quantize(model, inputs)
         return model
     elif qmode == "vulkan_4w":
+        from executorch.backends.vulkan._passes import VkInt4WeightOnlyQuantizer
+
         q_group_size = 256 if group_size is None else group_size
         model = VkInt4WeightOnlyQuantizer(groupsize=q_group_size).quantize(model)
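For context, here is a minimal sketch of the deferred-import pattern this commit applies. The module name heavy_backend and the class HeavyQuantizer are hypothetical stand-ins for the real executorch.backends.vulkan._passes dependency:

# Hypothetical names: heavy_backend and HeavyQuantizer stand in for
# the real Vulkan quantizer dependency in this commit.

# Module-scope import (before): every consumer of this module must be
# able to resolve heavy_backend at import time, even if it never uses
# the corresponding qmode.
#
#     from heavy_backend import HeavyQuantizer

def quantize(model, qmode, group_size=None):
    if qmode == "heavy_4w":
        # Function-scope import (after): the dependency is resolved only
        # when this branch actually runs, so callers that never request
        # this mode never need the dependency linked.
        from heavy_backend import HeavyQuantizer

        q_group_size = 256 if group_size is None else group_size
        return HeavyQuantizer(groupsize=q_group_size).quantize(model)
    return model

The trade-off is that a missing dependency now surfaces as an ImportError at call time rather than at module import, which is the intended behavior here since the Vulkan backend is optional for most users of this module.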

0 commit comments
