
Commit 1deeb4e

Cortex_m backend: Simplify add + linear fusion passes
Reuses the FoldAndAnnotateQParamsPass from the Arm backend to greatly simplify the logic for fusing the ops.

Additionally updates the linear kernel to be numerically correct and computes the kernel_sum ahead of time in the quantized_linear_fusion pass. Since this replaces the bias node, it typically causes no extra memory usage.

Updates the linear tests to mirror this, including removing the various matmul tests; since linear is now handled as a separate op rather than as a particular kind of matmul, those tests no longer apply.

Removes unnecessary stub definitions in operators.py, operators.yaml and op_quantized_linear.cpp.

Leaves a few TODOs since the patch is already large.

Signed-off-by: Adrian Lundell <[email protected]>
Change-Id: I194228ee3ae4b64a92f3f818afb2e045cc3acf91
1 parent ad0bb51 commit 1deeb4e
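
For context, a minimal sketch of the numerics behind the ahead-of-time kernel_sum folding described in the commit message. The names, shapes, and the fold_kernel_sum_into_bias helper are illustrative assumptions, not the actual quantized_linear_fusion pass or op_quantized_linear kernel: with symmetric int8 weights, the per-output-channel weight sum depends only on the weights, so the input-zero-point correction can be folded into the bias at export time.

#include <cstdint>
#include <vector>

// Illustrative sketch only, not the real pass or kernel.
// acc[j] = sum_k (x_q[k] - x_zp) * w_q[j][k] + bias[j]
//        = sum_k x_q[k] * w_q[j][k] + (bias[j] - x_zp * kernel_sum[j])
// The fused bias has the same shape as the original bias, which is why
// replacing the bias node typically adds no extra memory.
std::vector<int32_t> fold_kernel_sum_into_bias(
    const std::vector<int8_t>& weights, // [out_features * in_features], symmetric (zero point 0)
    const std::vector<int32_t>& bias,   // [out_features]
    int32_t input_zero_point,
    int32_t out_features,
    int32_t in_features) {
  std::vector<int32_t> fused_bias(out_features);
  for (int32_t j = 0; j < out_features; ++j) {
    int32_t kernel_sum = 0;
    for (int32_t k = 0; k < in_features; ++k) {
      kernel_sum += weights[j * in_features + k];
    }
    fused_bias[j] = bias[j] - input_zero_point * kernel_sum;
  }
  return fused_bias;
}

At runtime the kernel then only needs sum_k x_q[k] * w_q[j][k] + fused_bias[j] per output channel, followed by the usual requantization to the output scale and zero point.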

File tree

9 files changed: +353, -1430 lines


backends/cortex_m/ops/cmsis_scratch_buffer_context.h

Lines changed: 0 additions & 187 deletions
This file was deleted.

backends/cortex_m/ops/cortex_m_ops_common.h

Lines changed: 1 addition & 0 deletions
@@ -22,6 +22,7 @@ using Tensor = torch::executor::Tensor;
 using ScalarType = executorch::aten::ScalarType;
 using Scalar = torch::executor::Scalar;
 using Error = executorch::runtime::Error;
+using IntArrayRef = executorch::aten::ArrayRef<int64_t>;

 // From arm_nn_math_types.h
 #define ARM_NN_Q31_MAX ((int32_t)(0x7FFFFFFFL))
