Commit 9172821

Author: ssjia

Update on "[ET-VK] Miscellaneous fixes"
Collecting fixes for various models/ops in this diff/PR. They have all been squashed into this single change to make it easier to cherry-pick.

# Fixes

## Wav2Letter

Type: Output correctness failure

This is caused by a bug in SwiftShader and is not reproducible on any other platform. Specifically, the issue is in the softmax shader; the exact cause is unknown, but it is related to the use of shared memory within shaders. The workaround is to use separate shared memory arrays for the shared max and the shared sum.

## ConvNeXT

Type: Exception during runtime

This is caused by an incompatible memory layout being used for mean2d. More technically, the packed dimension of the tensor cannot be one of the dims being reduced (see the sketch after this commit message). The current operator registry system did not have a way to select valid tensor representations based on the actual arguments of an op. To fix this, we introduce a mechanism for ops to specify valid representations once a node's arguments are known. Once the model is exported with a supported memory layout, the model test passes.

## Inception_V3/ViT

Type: Exception during runtime

The root cause was an interaction between the fuse batch norm pass and how `vulkan_preprocess.py` was applying passes. Essentially, the fuse batch norm pass creates a new param node for the fused weight, but after the pass is applied, `_copy_module` is used to copy the transformed graph back into the ExportedProgram. However, it seems that `_copy_module` lowercases the node names without updating the exported program's graph signature, so subsequent passes could not recognize the weight tensor of convolution nodes as a constant/parameter node. The solution was to migrate `vulkan_preprocess.py` to the `_transform()` API instead of `_copy_module`.

## DenseNet 161 (w/ dynamic shapes)

Type: Output mismatch

Cause: the native_batch_norm op does not support dynamic shapes, but the backend test runner does not set the compile option that filters out ops without dynamic shape support.

Differential Revision: [D83703496](https://our.internmc.facebook.com/intern/diff/D83703496/)

[ghstack-poisoned]
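To make the ConvNeXT constraint above concrete, here is a minimal sketch in plain Python. All names (`CHANNELS_PACKED`, `valid_packed_dims_for_reduction`, etc.) are hypothetical illustrations and are not the actual ET-VK operator registry API; the sketch only demonstrates the rule that the packed dimension may not be one of the reduced dims.

```python
# Hypothetical sketch of the layout rule described above; these names are
# illustrative and do not correspond to the real ET-VK registry API.

# Candidate packed dimensions for a 4-D NCHW tensor (dim index that gets packed).
CHANNELS_PACKED = 1
HEIGHT_PACKED = 2
WIDTH_PACKED = 3
ALL_PACKED_DIMS = {CHANNELS_PACKED, HEIGHT_PACKED, WIDTH_PACKED}


def valid_packed_dims_for_reduction(reduce_dims):
    """Return the packed dims a reduction op (e.g. mean2d) could accept.

    Rule being sketched: the packed dimension of the input tensor must not
    be one of the dimensions being reduced.
    """
    reduced = set(reduce_dims)
    return {packed for packed in ALL_PACKED_DIMS if packed not in reduced}


# mean2d over H and W (dims 2 and 3): only a channels-packed layout survives,
# so the export has to pick that representation for the model test to pass.
assert valid_packed_dims_for_reduction([2, 3]) == {CHANNELS_PACKED}
```

In the actual fix this kind of check is what the new mechanism lets an op express once a node's arguments are known; the sketch only captures the constraint, not the registry plumbing.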
2 parents: 8278657 + 4378507

File tree

1 file changed (+2, -61 lines)


backends/vulkan/test/test_vulkan_passes.py

Lines changed: 2 additions & 61 deletions
@@ -94,66 +94,6 @@ def op_node_count(graph_module: torch.fx.GraphModule, canonical_op_name: str) ->
 
 
 class TestVulkanPasses(unittest.TestCase):
-    def test_fuse_int8pack_mm(self):
-        K = 256
-        N = 256
-        model = SingleLinearModule(K, N)
-        sample_inputs = model.get_sample_inputs()
-
-        quantizer = VulkanQuantizer()
-        quantizer.set_global(
-            get_symmetric_quantization_config(is_dynamic=False, weight_bits=8)
-        )
-
-        edge_manager = quantize_and_lower_module(
-            model,
-            sample_inputs,
-            quantizer,
-        )
-
-        ep = edge_manager._edge_programs["forward"]
-        edge_manager.transform(
-            [
-                AddmmToLinearTransform(),
-                FuseQuantizedOpsTransform(ep),
-            ]
-        )
-
-        gm = ep.graph_module
-
-        self.assertEqual(op_node_count(gm, "_weight_int8pack_mm.default"), 1)
-        self.assertEqual(op_node_count(gm, "dequantize_per_channel.default"), 0)
-
-    def test_fuse_linear_qcs4w(self):
-        K = 256
-        N = 256
-        model = SingleLinearModule(K, N)
-        sample_inputs = model.get_sample_inputs()
-
-        quantizer = VulkanQuantizer()
-        quantizer.set_global(
-            get_symmetric_quantization_config(is_dynamic=False, weight_bits=4)
-        )
-
-        edge_manager = quantize_and_lower_module(
-            model,
-            sample_inputs,
-            quantizer,
-        )
-
-        ep = edge_manager._edge_programs["forward"]
-        edge_manager.transform(
-            [
-                AddmmToLinearTransform(),
-                FuseQuantizedOpsTransform(ep),
-            ]
-        )
-
-        gm = ep.graph_module
-
-        self.assertEqual(op_node_count(gm, "linear_qcs4w.default"), 1)
-        self.assertEqual(op_node_count(gm, "dequantize_per_channel.default"), 0)
-
     def test_fuse_rotary_emb(self):
         """Test conversion of rotary embedding pattern to et_vk.apply_rotary_emb custom op."""
 
@@ -238,7 +178,8 @@ def _reshape_for_broadcast(self, freqs_cis: torch.Tensor, x: torch.Tensor):
 
         # Apply the rotary embedding pass
         ep = edge_manager._edge_programs["forward"]
-        rotary_pass = FusePatternsPass(ep)
+        rotary_pass = FusePatternsPass()
+        rotary_pass._exported_program = ep
         result = rotary_pass.call(ep.graph_module)
 
         # Verify that the pass was successful
