Commit 2706db0

Update
[ghstack-poisoned]
2 parents 70b980c + e38c077 commit 2706db0

105 files changed: +3109 -1528 lines changed

.ci/scripts/build-qnn-sdk.sh

Lines changed: 4 additions & 2 deletions
@@ -11,8 +11,10 @@ set -o xtrace
 
 build_qnn_backend() {
   echo "Start building qnn backend."
-  export ANDROID_NDK_ROOT=${ANDROID_NDK_ROOT:-/opt/ndk}
-  export QNN_SDK_ROOT=${QNN_SDK_ROOT:-/tmp/qnn/2.28.0.241029}
+  # Source QNN configuration
+  source "$(dirname "${BASH_SOURCE[0]}")/../../backends/qualcomm/scripts/install_qnn_sdk.sh"
+  setup_android_ndk
+  install_qnn
   export EXECUTORCH_ROOT="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")/../.." && pwd)"
 
   parallelism=$(( $(nproc) - 1 ))

.ci/scripts/setup-qnn-deps.sh

Lines changed: 1 addition & 0 deletions
@@ -10,4 +10,5 @@ set -ex
 source "$(dirname "${BASH_SOURCE[0]}")/../../backends/qualcomm/scripts/install_qnn_sdk.sh"
 
 setup_libcpp 12
+setup_android_ndk
 install_qnn

.ci/scripts/test_llama.sh

Lines changed: 5 additions & 1 deletion
@@ -119,8 +119,12 @@ echo "COREML option ${COREML}"
 
 if [[ "${MODE}" =~ .*qnn.* ]]; then
   QNN=ON
+
+  # Download QNN_SDK. If already downloaded, export environment path
+  source "$(dirname "${BASH_SOURCE[0]}")/../../backends/qualcomm/scripts/install_qnn_sdk.sh"
+  install_qnn
+
   export EXECUTORCH_ROOT="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")/.." && pwd)"
-  export QNN_SDK_ROOT=/tmp/qnn/2.28.0.241029
   export LD_LIBRARY_PATH="${QNN_SDK_ROOT}/lib/x86_64-linux-clang"
   export PYTHONPATH=".."
   cp schema/program.fbs exir/_serialize/program.fbs

.ci/scripts/test_qnn_static_llama.sh

Lines changed: 6 additions & 1 deletion
@@ -9,8 +9,13 @@ set -euxo pipefail
 
 source "$(dirname "${BASH_SOURCE[0]}")/utils.sh"
 
+# Source QNN configuration
+source "$(dirname "${BASH_SOURCE[0]}")/../../backends/qualcomm/scripts/qnn_config.sh"
+# Download QNN_SDK. If already downloaded, export environment path
+source "$(dirname "${BASH_SOURCE[0]}")/../../backends/qualcomm/scripts/install_qnn_sdk.sh"
+install_qnn
+
 export EXECUTORCH_ROOT="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")/.." && pwd)"
-export QNN_SDK_ROOT=/tmp/qnn/2.28.0.241029
 export LD_LIBRARY_PATH="${QNN_SDK_ROOT}/lib/x86_64-linux-clang"
 export PYTHONPATH=".."
 cp schema/program.fbs exir/_serialize/program.fbs

.github/workflows/android-perf.yml

Lines changed: 2 additions & 2 deletions
@@ -292,7 +292,7 @@ jobs:
   export.output_name="${OUT_ET_MODEL_NAME}.pte"
   ls -lh "${OUT_ET_MODEL_NAME}.pte"
 elif [[ ${{ matrix.config }} == "llama3_qnn_htp" ]]; then
-  export QNN_SDK_ROOT=/tmp/qnn/2.28.0.241029
+  export QNN_SDK_ROOT=/tmp/qnn/2.37.0.25072
   export LD_LIBRARY_PATH=$QNN_SDK_ROOT/lib/x86_64-linux-clang/
   export PYTHONPATH=$(pwd)/..
 
@@ -432,7 +432,7 @@ jobs:
 PYTHON_EXECUTABLE=python bash .ci/scripts/build-qnn-sdk.sh
 
 mkdir -p aar-out
-PYTHON_EXECUTABLE=python ANDROID_ABIS="arm64-v8a" BUILD_AAR_DIR=aar-out EXECUTORCH_BUILD_QNN=ON QNN_SDK_ROOT=/tmp/qnn/2.28.0.241029 EXECUTORCH_ANDROID_PROFILING=ON bash scripts/build_android_library.sh
+PYTHON_EXECUTABLE=python ANDROID_ABIS="arm64-v8a" BUILD_AAR_DIR=aar-out EXECUTORCH_BUILD_QNN=ON QNN_SDK_ROOT=/tmp/qnn/2.37.0.25072 EXECUTORCH_ANDROID_PROFILING=ON bash scripts/build_android_library.sh
 mkdir -p extension/benchmark/android/benchmark/app/libs
 cp aar-out/executorch.aar extension/benchmark/android/benchmark/app/libs
 pushd extension/benchmark/android/benchmark

.github/workflows/android-release-artifacts.yml

Lines changed: 1 addition & 1 deletion
@@ -104,7 +104,7 @@ jobs:
   source backends/qualcomm/scripts/qnn_config.sh
   export QNN_SDK_ROOT="/tmp/qnn/${QNN_VERSION}"
   export ANDROID_ABIS=arm64-v8a
-  GRADLE_ARGS+=" -DqnnVersion=2.28.0"
+  GRADLE_ARGS+=" -DqnnVersion=2.37.0"
 fi
 
 # Build AAR Package

.github/workflows/apple-perf.yml

Lines changed: 2 additions & 2 deletions
@@ -230,7 +230,7 @@ jobs:
 model.use_sdpa_with_kv_cache=true \
 backend.xnnpack.enabled=true \
 backend.xnnpack.extended_ops=true \
-base.preq_mode="8da4w_output_8da8w" \
+base.preq_mode="preq_8da4w_out_8da8w" \
 base.preq_group_size=32 \
 export.max_seq_length=2048 \
 export.max_context_length=2048 \
@@ -256,7 +256,7 @@ jobs:
 base.params="${DOWNLOADED_PATH}/params.json" \
 quantization.use_qat=true \
 base.use_lora=16 \
-base.preq_mode="8da4w_output_8da8w" \
+base.preq_mode="preq_8da4w_out_8da8w" \
 base.preq_group_size=32 \
 base.preq_embedding_quantize=\'8,0\' \
 model.use_sdpa_with_kv_cache=true \

backends/apple/mps/serialization/mps_graph_serialize.py

Lines changed: 6 additions & 2 deletions
@@ -1,14 +1,16 @@
 # Copyright (c) Meta Platforms, Inc. and affiliates.
 # All rights reserved.
+# Copyright 2025 Arm Limited and/or its affiliates.
 #
 # This source code is licensed under the BSD-style license found in the
 # LICENSE file in the root directory of this source tree.
 
+import importlib.resources as _resources
 import json
 import os
 import tempfile
 
-import pkg_resources
+import executorch.backends.apple.mps.serialization as serialization_package
 from executorch.backends.apple.mps.serialization.mps_graph_schema import MPSGraph
 from executorch.exir._serialize._dataclass import _DataclassEncoder
 from executorch.exir._serialize._flatbuffer import _flatc_compile
@@ -19,7 +21,9 @@ def convert_to_flatbuffer(mps_graph: MPSGraph) -> bytes:
     with tempfile.TemporaryDirectory() as d:
         schema_path = os.path.join(d, "schema.fbs")
         with open(schema_path, "wb") as schema_file:
-            schema_file.write(pkg_resources.resource_string(__name__, "schema.fbs"))
+            schema_file.write(
+                _resources.read_binary(serialization_package, "schema.fbs")
+            )
         json_path = os.path.join(d, "schema.json")
         with open(json_path, "wb") as json_file:
             json_file.write(mps_graph_json.encode("ascii"))
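
Note on the change above: it swaps `pkg_resources.resource_string` for `importlib.resources.read_binary`. As a minimal sketch of the same idea, the snippet below reads a bundled resource through `importlib.resources.files()` (available from Python 3.9), which the Python documentation recommended over `read_binary` when the latter was marked deprecated in 3.11. The `read_schema` helper and the throwaway `demo_serialization` package are hypothetical and exist only so the example runs without ExecuTorch installed; they are not part of this commit.

```python
import importlib
import importlib.resources as resources
import sys
import tempfile
from pathlib import Path


def read_schema(package, resource_name: str = "schema.fbs") -> bytes:
    """Return the raw bytes of a resource bundled inside a package.

    `package` may be an imported module object or a dotted package name.
    """
    return resources.files(package).joinpath(resource_name).read_bytes()


# Build a throwaway package on disk (hypothetical stand-in for
# executorch.backends.apple.mps.serialization).
_tmp = tempfile.TemporaryDirectory()
_pkg_dir = Path(_tmp.name) / "demo_serialization"
_pkg_dir.mkdir()
(_pkg_dir / "__init__.py").write_text("")
(_pkg_dir / "schema.fbs").write_bytes(b"namespace demo;\n")

# Make the temporary package importable, then import it.
sys.path.insert(0, _tmp.name)
importlib.invalidate_caches()
demo_serialization = importlib.import_module("demo_serialization")

# Read the resource through the package, not through a file path.
schema_bytes = read_schema(demo_serialization)
```

Passing the package object instead of `__name__` mirrors what the diff does with `serialization_package`: the resource is resolved relative to the package that ships `schema.fbs`, independent of the calling module.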

backends/arm/CMakeLists.txt

Lines changed: 3 additions & 2 deletions
@@ -73,9 +73,10 @@ if(EXECUTORCH_BUILD_VGF)
   # vgf backend
   list(TRANSFORM _vgf_backend_sources PREPEND "${EXECUTORCH_ROOT}/")
   add_library(vgf_backend ${_vgf_backend_sources})
+  install(TARGETS vgf_backend EXPORT ExecuTorchTargets)
   target_include_directories(
-    vgf_backend PUBLIC ${_common_include_directories} ${VULKAN_HEADERS_PATH}
-                       ${VOLK_HEADERS_PATH}
+    vgf_backend PRIVATE ${_common_include_directories} ${VULKAN_HEADERS_PATH}
+                        ${VOLK_HEADERS_PATH}
   )
   target_compile_options(
     vgf_backend PRIVATE -DUSE_VULKAN_WRAPPER -DUSE_VULKAN_VOLK

backends/arm/README.md

Lines changed: 48 additions & 2 deletions
@@ -88,6 +88,19 @@ You can test to run some models with the full fvp test flow
 backends/arm/test/test_arm_baremetal.sh test_full_ethosu_fvp
 ```
 
+To run the unit test suite with VKML use the following. Note Vulkan SDK need to be installed.
+Have a look at install_vulkan_sdk() in .ci/scripts/setup-vulkan-linux-deps.sh on how to install Vulkan SDK.
+
+```
+backends/arm/test/test_arm_baremetal.sh test_pytest_vkml
+```
+
+You can test to run some models with the full VKML flow
+
+```
+backends/arm/test/test_arm_baremetal.sh test_full_vkml
+```
+
 ## Unit tests
 
 This is the structure of the test directory
@@ -102,6 +115,7 @@ test # Root test folder
 ├── tosautil # Utility functions for TOSA artifacts
 ├ common.py # Common functions and definitions used by many tests
 ├ setup_testing.sh # Script to prepare testing for using the Corstone 3x0 FVP
+├ setup_testing_vkml.sh # Script to prepare testing for using the VKML
 ├ test_arm_baremetal.sh # Help script to trigger testing
 ```
 
@@ -123,7 +137,7 @@ first you need to build and prepare some used target libs
 
 ```
 examples/arm/run.sh --model_name=add --build_only
-backends/arm/test/setup_testing.sh
+backends/arm/test/setup_testing.sh and/or backends/arm/test/setup_testing_vkml.sh
 ```
 
 The you can run the tests with
@@ -195,6 +209,38 @@ List of model specific and optional passes:
 - InsertCastForOpsWithInt64InputPass
   - Functionality:
     - For LLMs such as LLama, some opeartors like aten.embedding have int64 input. In order to lower these operators to TOSA, this pass will insert a casting node that converts the input from int64 to int32.
-  - Example usage: backends/arm/test/models/test_llama.py
   - Supported Ops:
     - aten.embedding.default, aten.slice_copy.Tensor
+  - Example usage:
+    - backends/arm/test/models/test_llama.py
+
+- ConvertInt64ConstOpsToInt32Pass
+  - Functionalities:
+    - Rewrites constant-producing ops that output int64 to instead output int32, when values are within int32 bounds.
+  - Supported Ops:
+    - `torch.full`, `torch.arange`, `torch.eye`, `torch.linspace`, `torch.tensor`
+  - Example usage:
+    - backends/arm/test/models/stable_diffusion/test_CLIPTextModelWithProjection.py
+    - backends/arm/test/models/stable_diffusion/test_T5EncoderModel.py
+
+- ConvertInt64OutputOpsToInt32Pass
+  - Overview:
+    - Rewrites or removes operations that produce int64 outputs, converting them to int32 where possible.
+    - Overflow checks are applied selectively; for ops without such checks, users need to ensure values fit within the int32 range.
+  - Functionalities:
+    1. Handling casting to int64:
+      - (1) int32 -> int64:
+        - Removes the cast and redirect uses of int64 to int32
+      - (2) other types -> int64:
+        - Rewrites the cast to other types -> int32
+      - Supported Ops:
+        - torch.ops.aten.to.\[dtype|dtype_layout\]
+        - exir_ops.edge.dim_order_ops._to_dim_order_copy.default
+    2. Post-process argmax outputs:
+      - Inserts an int64->int32 cast after the argmax operations that produce int64 outputs:
+      - Supported Ops:
+        - torch.ops.aten.argmax.default
+        - exir_ops.edge.aten.argmax.default
+  - Example usage:
+    - (Functionality 1) backends/arm/test/models/stable_diffusion/test_T5EncoderModel.py
+    - (Functionality 2) backends/arm/test/models/stable_diffusion/test_CLIPTextModelWithProjection.py
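
Note on the README text above: both new passes depend on the condition that values are "within int32 bounds". The sketch below illustrates only that bounds check in plain Python. It is not the implementation of either pass, and `fits_in_int32` and `choose_const_dtype` are hypothetical names introduced for illustration.

```python
# Inclusive range of a signed 32-bit integer.
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def fits_in_int32(values) -> bool:
    """Return True when every value can be stored in int32 without overflow."""
    return all(INT32_MIN <= int(v) <= INT32_MAX for v in values)


def choose_const_dtype(values) -> str:
    """Pick int32 for a constant when it is safe, otherwise keep int64.

    Hypothetical helper: a rewrite pass would apply a decision like this
    per constant-producing op before changing its output dtype.
    """
    return "int32" if fits_in_int32(values) else "int64"


small = choose_const_dtype(range(10))  # e.g. the values of an arange
large = choose_const_dtype([2**31])    # one past INT32_MAX
```

For ops where no such check is applied, the README leaves it to the user to make sure values stay inside this range.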
