Commit 9bfe6ef

ssjia committed:

Update on "[ET-VK][AOT] Enable exporting Q8 Quantized Linear + Convolution"
As title. Introduce fusion patterns to enable fusing quantized convolution and linear graph patterns into a custom op.

## Changes

Introduce the concept of using custom pattern detection functions to detect graph patterns rather than solely relying on SubgraphMatcher. The issue with SubgraphMatcher is that a large number of graph patterns may need to be exported to obtain variants for different combinations of decompositions/quantization workflows. Having a custom detection function improves maintainability.

Implement detection + replacement functions for quantized linear and quantized conv2d.

Differential Revision: [D81323425](https://our.internmc.facebook.com/intern/diff/D81323425/)

[ghstack-poisoned]
2 parents 5b8fdb9 + 9caf470 commit 9bfe6ef
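
For readers unfamiliar with the approach the commit message describes, the sketch below shows the general shape of a hand-written detection + replacement pass over a torch.fx graph. It is not the code from this commit: the matched pattern (dequantize -> linear -> quantize), the helper names, the `quantized_decomposed` op prefix, and the caller-supplied `fused_op` are illustrative assumptions.

```python
# Minimal sketch (not this commit's implementation) of a custom pattern
# detection + replacement pass over a torch.fx graph.
import torch
from torch.fx import GraphModule, Node


def _is_op(node, prefix: str) -> bool:
    # Match call_function nodes whose target string starts with `prefix`,
    # e.g. "quantized_decomposed.quantize_per_tensor" or "aten.linear".
    return (
        isinstance(node, Node)
        and node.op == "call_function"
        and str(node.target).startswith(prefix)
    )


def detect_q8_linear(node: Node):
    """If `node` ends a dequantize -> linear -> quantize chain, return the
    three nodes of that chain; otherwise return None."""
    if not _is_op(node, "quantized_decomposed.quantize_per_tensor"):
        return None
    linear = node.args[0]
    if not _is_op(linear, "aten.linear"):
        return None
    dq = linear.args[0]
    if not _is_op(dq, "quantized_decomposed.dequantize_per_tensor"):
        return None
    return dq, linear, node


def fuse_q8_linear(gm: GraphModule, fused_op) -> GraphModule:
    """Replace each detected chain with a single call to `fused_op`,
    a registered custom op supplied by the caller (hypothetical here)."""
    for node in list(gm.graph.nodes):
        match = detect_q8_linear(node)
        if match is None:
            continue
        dq, linear, q = match
        with gm.graph.inserting_after(q):
            # A real pass would also forward quantization scales/zero-points.
            fused = gm.graph.call_function(
                fused_op, args=(dq.args[0], *linear.args[1:])
            )
        q.replace_all_uses_with(fused)
    gm.graph.eliminate_dead_code()
    gm.recompile()
    return gm
```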

File tree

113 files changed: +2198 additions, -1656 deletions


.github/workflows/apple-perf.yml

Lines changed: 1 addition & 1 deletion
@@ -416,7 +416,7 @@ jobs:
       - set-parameters
     secrets: inherit
     with:
-      runner: macos-latest-xlarge
+      runner: macos-14-xlarge
       python-version: '3.11'
       submodules: 'recursive'
       ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}

.github/workflows/apple.yml

Lines changed: 3 additions & 3 deletions
@@ -49,7 +49,7 @@ jobs:
     uses: pytorch/test-infra/.github/workflows/macos_job.yml@main
     secrets: inherit
     with:
-      runner: macos-latest-xlarge
+      runner: macos-14-xlarge
       python-version: '3.11'
       submodules: 'recursive'
       ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
@@ -136,7 +136,7 @@ jobs:
     needs: set-version
     uses: pytorch/test-infra/.github/workflows/macos_job.yml@main
     with:
-      runner: macos-latest-xlarge
+      runner: macos-14-xlarge
       python-version: '3.11'
       submodules: 'recursive'
       ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
@@ -276,7 +276,7 @@ jobs:
     uses: pytorch/test-infra/.github/workflows/macos_job.yml@main
     secrets: inherit
     with:
-      runner: macos-latest-xlarge
+      runner: macos-14-xlarge
       python-version: '3.11'
       submodules: 'recursive'
       ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}

.github/workflows/build-presets.yml

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ jobs:
     with:
       job-name: build
       ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
-      runner: macos-latest-xlarge
+      runner: macos-14-xlarge
       python-version: 3.12
       submodules: recursive
       timeout: 90

.github/workflows/build-wheels-macos.yml

Lines changed: 1 addition & 1 deletion
@@ -63,7 +63,7 @@ jobs:
       post-script: ${{ matrix.post-script }}
       package-name: ${{ matrix.package-name }}
       # Meta's macOS runners do not have Xcode, so use GitHub's runners.
-      runner-type: macos-latest-xlarge
+      runner-type: macos-14-xlarge
       setup-miniconda: true
       smoke-test-script: ${{ matrix.smoke-test-script }}
       trigger-event: ${{ github.event_name }}

.github/workflows/pull.yml

Lines changed: 2 additions & 1 deletion
@@ -855,7 +855,8 @@ jobs:
     .ci/scripts/setup-linux.sh --build-tool "cmake"

     # Install test requirements
-    pip install -r backends/nxp/requirements-tests.txt
+    pip install -r backends/nxp/requirements-tests-pypi.txt
+    pip install -r backends/nxp/requirements-tests-eiq.txt

     # Run pytest
     PYTHON_EXECUTABLE=python bash backends/nxp/run_unittests.sh

.github/workflows/trunk.yml

Lines changed: 1 addition & 1 deletion
@@ -435,7 +435,7 @@ jobs:
     name: test-coreml-delegate
     uses: pytorch/test-infra/.github/workflows/macos_job.yml@main
     with:
-      runner: macos-latest-xlarge
+      runner: macos-14-xlarge
       python-version: '3.11'
       submodules: 'recursive'
       ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}

backends/apple/mps/serialization/mps_graph_serialize.py

Lines changed: 6 additions & 2 deletions
@@ -1,14 +1,16 @@
 # Copyright (c) Meta Platforms, Inc. and affiliates.
 # All rights reserved.
+# Copyright 2025 Arm Limited and/or its affiliates.
 #
 # This source code is licensed under the BSD-style license found in the
 # LICENSE file in the root directory of this source tree.

+import importlib.resources as _resources
 import json
 import os
 import tempfile

-import pkg_resources
+import executorch.backends.apple.mps.serialization as serialization_package
 from executorch.backends.apple.mps.serialization.mps_graph_schema import MPSGraph
 from executorch.exir._serialize._dataclass import _DataclassEncoder
 from executorch.exir._serialize._flatbuffer import _flatc_compile
@@ -19,7 +21,9 @@ def convert_to_flatbuffer(mps_graph: MPSGraph) -> bytes:
     with tempfile.TemporaryDirectory() as d:
         schema_path = os.path.join(d, "schema.fbs")
         with open(schema_path, "wb") as schema_file:
-            schema_file.write(pkg_resources.resource_string(__name__, "schema.fbs"))
+            schema_file.write(
+                _resources.read_binary(serialization_package, "schema.fbs")
+            )
         json_path = os.path.join(d, "schema.json")
         with open(json_path, "wb") as json_file:
             json_file.write(mps_graph_json.encode("ascii"))
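
As an aside on the change above: this file swaps the deprecated setuptools `pkg_resources` API for the stdlib `importlib.resources`. A standalone sketch of that migration pattern, using the package and resource names from the diff; the `files()` variant at the end is an optional forward-compatible alternative, not part of this commit:

```python
# Sketch of the pkg_resources -> importlib.resources migration shown above.
import importlib.resources as _resources

import executorch.backends.apple.mps.serialization as serialization_package

# Before (setuptools, deprecated):
#   data = pkg_resources.resource_string(__name__, "schema.fbs")

# After (stdlib, as in the diff):
data = _resources.read_binary(serialization_package, "schema.fbs")

# Equivalent files() form on Python 3.9+ (not used by the diff):
data = (_resources.files(serialization_package) / "schema.fbs").read_bytes()
```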

backends/arm/CMakeLists.txt

Lines changed: 3 additions & 2 deletions
@@ -73,9 +73,10 @@ if(EXECUTORCH_BUILD_VGF)
   # vgf backend
   list(TRANSFORM _vgf_backend_sources PREPEND "${EXECUTORCH_ROOT}/")
   add_library(vgf_backend ${_vgf_backend_sources})
+  install(TARGETS vgf_backend EXPORT ExecuTorchTargets)
   target_include_directories(
-    vgf_backend PUBLIC ${_common_include_directories} ${VULKAN_HEADERS_PATH}
-    ${VOLK_HEADERS_PATH}
+    vgf_backend PRIVATE ${_common_include_directories} ${VULKAN_HEADERS_PATH}
+    ${VOLK_HEADERS_PATH}
   )
   target_compile_options(
     vgf_backend PRIVATE -DUSE_VULKAN_WRAPPER -DUSE_VULKAN_VOLK

backends/arm/README.md

Lines changed: 15 additions & 1 deletion
@@ -88,6 +88,19 @@ You can test to run some models with the full fvp test flow
 backends/arm/test/test_arm_baremetal.sh test_full_ethosu_fvp
 ```

+To run the unit test suite with VKML use the following. Note Vulkan SDK need to be installed.
+Have a look at install_vulkan_sdk() in .ci/scripts/setup-vulkan-linux-deps.sh on how to install Vulkan SDK.
+
+```
+backends/arm/test/test_arm_baremetal.sh test_pytest_vkml
+```
+
+You can test to run some models with the full VKML flow
+
+```
+backends/arm/test/test_arm_baremetal.sh test_full_vkml
+```
+
 ## Unit tests

 This is the structure of the test directory
@@ -102,6 +115,7 @@ test # Root test folder
 ├── tosautil # Utility functions for TOSA artifacts
 ├ common.py # Common functions and definitions used by many tests
 ├ setup_testing.sh # Script to prepare testing for using the Corstone 3x0 FVP
+├ setup_testing_vkml.sh # Script to prepare testing for using the VKML
 ├ test_arm_baremetal.sh # Help script to trigger testing
 ```

@@ -123,7 +137,7 @@ first you need to build and prepare some used target libs

 ```
 examples/arm/run.sh --model_name=add --build_only
-backends/arm/test/setup_testing.sh
+backends/arm/test/setup_testing.sh and/or backends/arm/test/setup_testing_vkml.sh
 ```

 The you can run the tests with

backends/arm/scripts/build_executor_runner.sh

Lines changed: 39 additions & 27 deletions
@@ -33,24 +33,26 @@ build_with_etdump_flags=" -DEXECUTORCH_ENABLE_EVENT_TRACER=OFF "
 help() {
     echo "Usage: $(basename $0) [options]"
     echo "Options:"
-    echo " --pte=<PTE_FILE>|semihosting pte file (generated by the aot_arm_compier from the model to include in the elf), or semihosting to supply pte at runtime."
-    echo " --target=<TARGET> Target to build and run for Default: ${target}"
-    echo " --build_type=<TYPE> Build with Release, Debug or RelWithDebInfo, default is ${build_type}"
-    echo " --bundleio Support both pte and Bundle IO bpte using Devtools BundelIO with Input/RefOutput included"
-    echo " --system_config=<CONFIG> System configuration to select from the Vela configuration file (see vela.ini). Default: Ethos_U55_High_End_Embedded for EthosU55 targets, Ethos_U85_SYS_DRAM_Mid for EthosU85 targets."
-    echo " NOTE: If given, this option must match the given target. This option along with the memory_mode sets timing adapter values customized for specific hardware, see ./executor_runner/CMakeLists.txt."
-    echo " --memory_mode=<CONFIG> Vela memory mode, used for setting the Timing Adapter parameters of the Corstone platforms."
-    echo " Valid values are Shared_Sram(for Ethos-U55, Ethos-U65, Ethos-85), Sram_Only(for Ethos-U55, Ethos-U65, Ethos-U85) or Dedicated_Sram(for Ethos-U65, Ethos-U85)."
-    echo " Default: Shared_Sram for the Ethos-U55 and Sram_Only for the Ethos-U85"
-    echo " --etdump Adds Devtools etdump support to track timing, etdump area will be base64 encoded in the log"
-    echo " --extra_build_flags=<FLAGS> Extra flags to pass to cmake like -DET_ARM_BAREMETAL_METHOD_ALLOCATOR_POOL_SIZE=60000 Default: none "
-    echo " --output=<FOLDER> Output folder Default: <MODEL>/<MODEL>_<TARGET INFO>.pte"
-    echo " --et_build_root=<FOLDER> Build output root folder to use, defaults to ${et_build_root}"
-    echo " --ethosu_tools_dir=<FOLDER> Path to your Ethos-U tools dir if you not using default: ${ethosu_tools_dir}"
-    echo " --toolchain=<TOOLCHAIN> Toolchain can be specified (e.g. bare metal as arm-none-eabi-gcc or zephyr as arm-zephyr-eabi-gcc Default: ${toolchain}"
-    echo " --select_ops_list=<OPS> Comma separated list of portable (non delagated) kernels to include Default: ${select_ops_list}"
-    echo " NOTE: This is used when select_ops_model is not possible to use, e.g. for semihosting or bundleio."
-    echo " See https://docs.pytorch.org/executorch/stable/kernel-library-selective-build.html for more information."
+    echo " --pte=<PTE_FILE>|<ADDR>|semihosting Set to a pte file (generated by the aot_arm_compier) to include the model in the elf."
+    echo " Or a hex address in the format of 0x00000000 if placed in memory you need to place it on this ADDR on your target, with your flash tool or other means."
+    echo " Or specify the word 'semihosting' to supply pte at runtime."
+    echo " --target=<TARGET> Target to build and run for Default: ${target}"
+    echo " --build_type=<TYPE> Build with Release, Debug or RelWithDebInfo, default is ${build_type}"
+    echo " --bundleio Support both pte and Bundle IO bpte using Devtools BundelIO with Input/RefOutput included"
+    echo " --system_config=<CONFIG> System configuration to select from the Vela configuration file (see vela.ini). Default: Ethos_U55_High_End_Embedded for EthosU55 targets, Ethos_U85_SYS_DRAM_Mid for EthosU85 targets."
+    echo " NOTE: If given, this option must match the given target. This option along with the memory_mode sets timing adapter values customized for specific hardware, see ./executor_runner/CMakeLists.txt."
+    echo " --memory_mode=<CONFIG> Vela memory mode, used for setting the Timing Adapter parameters of the Corstone platforms."
+    echo " Valid values are Shared_Sram(for Ethos-U55, Ethos-U65, Ethos-85), Sram_Only(for Ethos-U55, Ethos-U65, Ethos-U85) or Dedicated_Sram(for Ethos-U65, Ethos-U85)."
+    echo " Default: Shared_Sram for the Ethos-U55 and Sram_Only for the Ethos-U85"
+    echo " --etdump Adds Devtools etdump support to track timing, etdump area will be base64 encoded in the log"
+    echo " --extra_build_flags=<FLAGS> Extra flags to pass to cmake like -DET_ARM_BAREMETAL_METHOD_ALLOCATOR_POOL_SIZE=60000 Default: none "
+    echo " --output=<FOLDER> Output folder Default: <MODEL>/<MODEL>_<TARGET INFO>.pte"
+    echo " --et_build_root=<FOLDER> Build output root folder to use, defaults to ${et_build_root}"
+    echo " --ethosu_tools_dir=<FOLDER> Path to your Ethos-U tools dir if you not using default: ${ethosu_tools_dir}"
+    echo " --toolchain=<TOOLCHAIN> Toolchain can be specified (e.g. bare metal as arm-none-eabi-gcc or zephyr as arm-zephyr-eabi-gcc Default: ${toolchain}"
+    echo " --select_ops_list=<OPS> Comma separated list of portable (non delagated) kernels to include Default: ${select_ops_list}"
+    echo " NOTE: This is used when select_ops_model is not possible to use, e.g. for semihosting or bundleio."
+    echo " See https://docs.pytorch.org/executorch/stable/kernel-library-selective-build.html for more information."
     exit 0
 }

@@ -94,10 +96,24 @@ toolchain_cmake=$(realpath ${toolchain_cmake})
 source ${setup_path_script}

 if [[ ${pte_file} == "semihosting" ]]; then
-    extra_build_flags="${extra_build_flags} -DSEMIHOSTING=ON"
+    pte_data="-DSEMIHOSTING=ON"
 else
-    pte_file=$(realpath ${pte_file})
-    extra_build_flags="${extra_build_flags} -DET_PTE_FILE_PATH:PATH='${pte_file}'"
+    if [[ "$pte_file" =~ ^0x[0-9a-fA-F]{1,16}$ ]]; then
+        echo "PTE in memory at ${pte_file}, make sure to put it there on your target before starting."
+        pte_data="-DET_MODEL_PTE_ADDR=${pte_file}"
+        if [ "$output_folder_set" = false ] ; then
+            # Not locked down to a PTE use
+            output_folder=${et_build_root}/${target}_${pte_file}/cmake-out
+        fi
+    else
+        echo "PTE included in elf from file ${pte_file}"
+        pte_file=$(realpath ${pte_file})
+        pte_data="-DET_PTE_FILE_PATH:PATH=${pte_file}"
+        if [ "$output_folder_set" = false ] ; then
+            # remove file ending
+            output_folder=${pte_file%.*}/cmake-out
+        fi
+    fi
 fi
 ethosu_tools_dir=$(realpath ${ethosu_tools_dir})
 ethos_u_root_dir="$ethosu_tools_dir/ethos-u"
@@ -108,11 +124,6 @@ et_build_dir=${et_build_root}/cmake-out
 mkdir -p ${et_build_dir}
 et_build_dir=$(realpath ${et_build_dir})

-if [ "$output_folder_set" = false ] ; then
-    # remove file ending
-    output_folder=${pte_file%.*}/cmake-out
-fi
-
 if [[ ${system_config} == "" ]]
 then
     system_config="Ethos_U55_High_End_Embedded"
@@ -140,7 +151,7 @@ else
     target_cpu=cortex-m85
 fi
 echo "--------------------------------------------------------------------------------"
-echo "Build Arm ${toolchain/-gcc/} executor_runner for ${target} with ${pte_file} using ${system_config} ${memory_mode} ${extra_build_flags} to '${output_folder}'"
+echo "Build Arm ${toolchain/-gcc/} executor_runner for ${target} PTE: ${pte_file} using ${system_config} ${memory_mode} ${extra_build_flags} to '${output_folder}'"
 echo "--------------------------------------------------------------------------------"

 cd ${et_root_dir}/examples/arm/executor_runner
@@ -162,6 +173,7 @@ cmake \
     -DET_BUILD_DIR_PATH:PATH=${et_build_dir} \
     -DETHOS_SDK_PATH:PATH=${ethos_u_root_dir} \
     -DETHOSU_TARGET_NPU_CONFIG=${target} \
+    ${pte_data} \
     ${build_bundleio_flags} \
     ${build_with_etdump_flags} \
     -DPYTHON_EXECUTABLE=$(which python3) \
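
The reworked --pte handling above accepts three forms of input: the literal `semihosting`, a hex load address matching `^0x[0-9a-fA-F]{1,16}$`, or a path to a .pte file to embed in the elf. For illustration only, the same classification expressed in Python (the real logic is the bash shown above; `classify_pte` is a hypothetical helper, not part of the commit):

```python
# Illustrative Python mirror of the --pte classification in the bash diff above.
import re
from pathlib import Path

HEX_ADDR = re.compile(r"^0x[0-9a-fA-F]{1,16}$")


def classify_pte(value: str) -> str:
    """Return the CMake define the build script would pass for a --pte value."""
    if value == "semihosting":
        return "-DSEMIHOSTING=ON"
    if HEX_ADDR.match(value):
        # The .pte is expected to already sit at this address on the target.
        return f"-DET_MODEL_PTE_ADDR={value}"
    # Otherwise treat it as a file to embed into the elf.
    return f"-DET_PTE_FILE_PATH:PATH={Path(value).resolve()}"


assert classify_pte("semihosting") == "-DSEMIHOSTING=ON"
assert classify_pte("0x70000000") == "-DET_MODEL_PTE_ADDR=0x70000000"
```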
