
Commit 1d6e869

Author: Github Executorch
Title: Fix quantized_linear output shape in meta function

Summary: Fix incorrect output shape computation in @register_fake for cortex_m::quantized_linear. The meta function was using weights.shape[0] (in_features) instead of weights.shape[1] (out_features) to compute the output tensor shape. This is incorrect because weights are stored in transposed format [in_features, out_features] for CMSIS-NN compatibility.

For MobileNetV2 with Linear(1280, 1000), this caused:
- Output shape: [1, 1280] instead of [1, 1000]
- Post-processing failure: "Output size doesn't match labels' size"

The C++ runtime (op_quantized_linear.cpp) correctly uses weights.size(1) for out_features, but the Python meta function was inconsistent. This fix ensures the AOT-compiled .pte file has correctly shaped output tensors for any model using quantized_linear (MV2, ResNet, MV3, etc.).
1 parent 75fa61a commit 1d6e869

File tree

1 file changed (+1, -1)


backends/cortex_m/ops/operators.py

Lines changed: 1 addition & 1 deletion
@@ -352,7 +352,7 @@ def quantized_linear_meta(
     activation_min,
 ) -> torch.Tensor:
 
-    shape = (*input.shape[:-1], weights.shape[0])
+    shape = (*input.shape[:-1], weights.shape[1])
     return torch.empty(shape, dtype=input.dtype, device=input.device)
 
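
For reference, a minimal self-contained sketch of the corrected meta (fake) kernel registration. The decorator usage and signature are assumptions reconstructed from the commit message, not the verbatim upstream code; the real operators.py defines the op schema and full argument list elsewhere:

    import torch
    from torch.library import register_fake

    # Hypothetical registration for cortex_m::quantized_linear; assumes
    # the op itself is already defined via torch.library.
    @register_fake("cortex_m::quantized_linear")
    def quantized_linear_meta(input, weights, *args):
        # Weights are stored transposed as [in_features, out_features]
        # for CMSIS-NN, so out_features is weights.shape[1], not [0].
        shape = (*input.shape[:-1], weights.shape[1])
        return torch.empty(shape, dtype=input.dtype, device=input.device)

With Linear(1280, 1000), a [1, 1280] input now produces a [1, 1000] fake output at export time, matching what the C++ runtime (op_quantized_linear.cpp) computes via weights.size(1).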
