Update backends-xnnpack.md #10011

metascroy · 2025-04-09T20:52:41Z

Add import torch to sample code

* Add missing inplace operators to quantization annotator. * Add atol, rtol & qtol to test_pipeline classes. Co-authored-by: Oscar Andersson <[email protected]> Signed-off-by: Tom Allsop <[email protected]>

This causes commands to be echoed as they are executed, making it easier to track progress in CI.

Differential Revision: D71952366 Pull Request resolved: #9704

## Context

There was a bug in the SPIR-V cache: once a shader variant is processed, its source GLSL is saved to the cache. If another variant that uses the same source GLSL then checks the cache, it concludes that the source file is unchanged and therefore does not recompile. To fix this, defer saving the source GLSL files until all shaders have been compiled.

Differential Revision: [D71973524](https://our.internmc.facebook.com/intern/diff/D71973524/)
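The caching fix described above can be illustrated with a small model. This is only a sketch under assumptions, not the actual Vulkan shader build script: `build`, `digest`, the cache layout, and the shader names are all hypothetical. It shows why the digest of a source file should be written to the cache only after every variant that uses that file has been compiled.

```python
import hashlib


def digest(text: str) -> str:
    """Content hash used to decide whether a source file changed."""
    return hashlib.sha256(text.encode()).hexdigest()


def build(sources, variants, cache, compile_fn):
    """Compile every variant whose source changed since the last build.

    sources:  dict of source name -> GLSL text (hypothetical layout)
    variants: list of (variant name, source name) pairs
    cache:    dict of source name -> digest from the last completed build
    """
    compiled = []
    pending = {}
    for variant, src in variants:
        d = digest(sources[src])
        if cache.get(src) == d:
            continue  # unchanged since the last completed build
        compile_fn(variant, sources[src])
        compiled.append(variant)
        pending[src] = d  # remember the digest, but do not cache it yet
    # Deferred write: only now does the cache learn about the new sources,
    # so a second variant of the same file is never skipped mid-build.
    cache.update(pending)
    return compiled
```

If `cache[src]` were updated inside the loop instead, the second variant of a shared source file would hit the `continue` branch and be skipped, which is the behaviour the commit message describes.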

…tion. (#9721) This diff adds support for the repeat operation in the `add_copy_packed_dim_offset_node` function in the Vulkan backend for Executorch. The function now takes an additional boolean parameter, "repeat", which indicates whether the copy should wrap around the tensor dimension. The `copy_packed_dim_offset` shader now has two functions, `repeat_copy` and `no_repeat_copy`, which are chosen based on a specialization constant parameter. The `no_repeat_copy` function has the legacy copy code. The `repeat_copy` function reads the input tensor's dim based on the output pos and wraps it according to WHCB repetitions. When calling the repeat function, the push constants `src_offset` and `dst_offset` contain the source's and destination's WHCB dimensions (and not copy offsets) respectively. Differential Revision: [D71477552](https://our.internmc.facebook.com/intern/diff/D71477552/)
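The wrap-around read that the commit message attributes to `repeat_copy` can be sketched as modular indexing. This is an illustration only: it ignores texel packing and the real GLSL shader, and the Python `repeat_copy` below, along with its dictionary-based tensor representation, is a hypothetical stand-in.

```python
from itertools import product


def repeat_copy(src, src_sizes, dst_sizes):
    """Fill every output position by wrapping it into the source extent.

    src:       dict mapping position tuples to values (hypothetical layout)
    src_sizes: extent of the source along each dimension
    dst_sizes: extent of the destination along each dimension
    """
    dst = {}
    for pos in product(*(range(n) for n in dst_sizes)):
        # Wrap each output coordinate by the source extent of that dimension.
        src_pos = tuple(p % s for p, s in zip(pos, src_sizes))
        dst[pos] = src[src_pos]
    return dst
```

Note that the function needs the full source and destination sizes rather than copy offsets, which matches the commit's remark that the push constants carry dimensions when the repeat path is used.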

going from .rst to md for the main docs page and cleaned up the content for easier use --------- Co-authored-by: Svetlana Karslioglu <[email protected]>

Differential Revision: D71949902 Pull Request resolved: #9703

- Handle lshift.Tensor and rshift.Tensor - Convert *.Scalar to *.Tensor - Test Scalar and Tensor cases with multiple dtypes - Move cast logic from node visitor to pass Signed-off-by: Erik Lundell <[email protected]>

…, in buck (#9511)" (#9727) This reverts commit b89328a. We will need to reapply this with internal build fixes.

### Summary Fixes #9023 Prevents the partitioner from handling ops with mixed dtypes. ### Test plan Unable to directly test due to auto-casting of dtypes and existing dtype checks in verifier.py.

Differential Revision: D71956172 Pull Request resolved: #9707

### Summary So that we can still build third-party projects. ### Test plan CI

For android demo, use links in pytorch-labs/executorch-examples Remove references to old demo app Add video

### Summary This change makes the re-download of ethos-u dependencies optional during cmake configuration when building the executorch_runner. If the user opts out, it is assumed that the dependencies are already downloaded.

Differential Revision: D71988950 Pull Request resolved: #9724

Implement eq.Scalar by converting to eq.Tensor using replace_scalar_with_tensor_pass and match_arg_ranks_pass. * Convert eq.Scalar to eq.Tensor * Expand test_eq to test both eq.Tensor and eq.Scalar Signed-off-by: Fang-Ching <[email protected]>

Prepending rather than appending to CMAKE_CXX_FLAGS_RELEASE makes it possible to specify another optimization level earlier in the build process and still have it take precedence over the -O2.

@yury-gorbachev

…Us, NPUs (#8573)

### Summary

This PR introduces support for the OpenVINO backend in Executorch, enabling accelerated inference on Intel hardware, including CPU, GPU, and NPU devices. OpenVINO optimizes deep learning model performance by leveraging hardware-specific enhancements. The PR also introduces the OpenVINO quantizer with NNCF (Neural Network Compression Framework) for model optimization. The functionality has been tested on several torchvision and timm models, with plans to test and enable support for additional model types in the future.

Below is a description of the features:

- OpenVINO Backend Integration: The backends/openvino directory includes build scripts, AOT components (partitioner, preprocesser), OpenVINO Quantizer, and runtime backend files that register the OpenVINO backend, manage OpenVINO’s inference engine interactions, including model execution, device-specific optimizations, and backend initialization. It also contains tests for layers and models. See backends/openvino/README.md for usage.
- OpenVINO Examples: The examples/openvino directory provides scripts for AOT optimization, quantization, and C++ executor examples. It includes instructions for optimizing the models, quantizing them, and exporting Executorch programs with OpenVINO optimizations. Refer to examples/openvino/README.md for details.
- E2E Tutorial: Added an end-to-end tutorial in docs/source/build-run-openvino.md.

### Test plan

This PR is tested with OpenVINO backend on Intel Core Ultra 7 processors for CPU, GPU, and NPU devices. To run the layer tests and model tests, please refer to backends/openvino/tests/README.md

cc: @yury-gorbachev @alexsu52 @cavusmustafa @daniil-lyakhov @suryasidd @AlexKoff88 @MaximProshin @AlexanderDokuchaev

---------

Co-authored-by: Cavus Mustafa <[email protected]>
Co-authored-by: Aleksandr Suslov <[email protected]>
Co-authored-by: dlyakhov <[email protected]>
Co-authored-by: Kimish Patel <[email protected]>
Co-authored-by: suryasidd <[email protected]>

@larryliu0820

…9751) We have a lot of `ET_CHECK_OR_RETURN_FALSE` that log a condition, but not the values of the variables in that condition. This is an attempt to improve debuggability of these errors. cc @larryliu0820 @manuelcandales

### Summary

There is a bug when there is a constant_pad between two convolutions. In order to minimize permutes associated with memory format changes, we sometimes compute ops in NHWC. This is the case for ConstantPad when it is between two convs:

```
a = conv(a)
a = constant_pad(a, paddings=[1, 2, 3, 4])
a = conv(a)
```

In this case we need to make sure the paddings given to constant_pad are also permuted to NHWC.

### Test plan

python install_executorch.py --editable

python -m unittest backends.xnnpack.test.ops.test_static_constant_pad.TestStaticConstantPad.test_fp32_static_constant_pad_nhwc
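The padding permutation described in the summary can be worked through by hand. The sketch below is not the XNNPACK delegate's implementation; `pads_per_dim`, `permute_pads`, and `NCHW_TO_NHWC` are hypothetical helpers. It relies only on the PyTorch convention that a padding list is given last dimension first, so `[1, 2, 3, 4]` pads W by (1, 2) and H by (3, 4) on an NCHW tensor.

```python
def pads_per_dim(paddings, rank):
    """Expand a last-dimension-first padding list into (before, after)
    pairs, one pair per tensor dimension."""
    pairs = [(0, 0)] * rank
    for i in range(len(paddings) // 2):
        pairs[rank - 1 - i] = (paddings[2 * i], paddings[2 * i + 1])
    return pairs


# Position k of an NHWC tensor holds NCHW dimension NCHW_TO_NHWC[k].
NCHW_TO_NHWC = (0, 2, 3, 1)


def permute_pads(pairs, perm):
    """Reorder per-dimension padding pairs to follow a new layout."""
    return [pairs[d] for d in perm]


nchw_pads = pads_per_dim([1, 2, 3, 4], rank=4)
nhwc_pads = permute_pads(nchw_pads, NCHW_TO_NHWC)
print(nchw_pads)  # [(0, 0), (0, 0), (3, 4), (1, 2)]  -> N, C, H, W
print(nhwc_pads)  # [(0, 0), (3, 4), (1, 2), (0, 0)]  -> N, H, W, C
```

Without the permutation, the (1, 2) padding intended for W would be applied to the last NHWC dimension, which is the channel dimension.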

Differential Revision: D71839099 Pull Request resolved: #9599

Differential Revision: D72007597 Pull Request resolved: #9728

Differential Revision: D72091091 Pull Request resolved: #9755

Differential Revision: D70529392 Pull Request resolved: #8909

…8724) Summary: - To improve readability, replace is_bert_ with is_bert()

This fixes a problem where the code continued running after a failed allocation, and syncs the example to better match examples/devtools/example_runner/example_runner.cpp. Signed-off-by: Zingo Andersen <[email protected]>

- Rename bitwise and logical tests with full aten op name - Refactor the tests with test_pipeline and new Xfail decorator - Fix the naming error in test_any Signed-off-by: Yufeng Shi <[email protected]>

Summary: Now that we have better dim order support, we can pass inputs into models in non-default dim orders, such as channels last. This works in the runtime, but the Python layer currently asserts that input tensors are in default dim order. This PR relaxes this restriction to allow dense input tensors with alternative dim orders. Differential Revision: D71716100
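To make "dense input tensors with alternative dim orders" concrete, the following sketch derives a dim order from strides and checks density. It is an illustration under assumptions, not the check used in the Python bindings; `dim_order_from_strides` and `is_dense` are hypothetical names, and ties between equal strides (for example with size-1 dimensions) are broken by dimension index.

```python
def dim_order_from_strides(strides):
    """Dimensions sorted from outermost (largest stride) to innermost."""
    return tuple(sorted(range(len(strides)), key=lambda d: -strides[d]))


def is_dense(sizes, strides):
    """True if the strides describe a gap-free layout in some dim order."""
    expected = 1
    for d in reversed(dim_order_from_strides(strides)):
        if strides[d] != expected:
            return False
        expected *= sizes[d]
    return True


sizes = (1, 3, 4, 5)             # N, C, H, W
contiguous = (60, 20, 5, 1)      # default dim order
channels_last = (60, 1, 15, 3)   # same shape, channels-last memory layout

print(dim_order_from_strides(contiguous))     # (0, 1, 2, 3)
print(dim_order_from_strides(channels_last))  # (0, 2, 3, 1)
```

Both layouts above are dense, so under the relaxed restriction either could be passed as an input; a strided view with gaps between elements would fail `is_dense`.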

Update PTD serialization to account for blobs from the NamedDataStoreOutput. Something we can do in the future is to consolidate tensors (that go through the emitter) and blobs (that come from the NamedDataStore). Differential Revision: [D70939807](https://our.internmc.facebook.com/intern/diff/D70939807/)

Differential Revision: D72679075 Pull Request resolved: #9985


…le used for Edge export (#9938)

Summary:

## Context

Addresses this [release blocker](https://github.com/orgs/pytorch/projects/99/views/1?pane=issue&itemId=104088363&issue=pytorch%7Cpytorch%7C150207) issue. Some models cannot export because they use `linalg_vector_norm` which is not currently an ATen operator. I initially tried adding the op to the core decomp table, but the decomp is not passing pytorch correctness tests. Please see pytorch/pytorch#150241 for more details.

## Changes

Since we currently cannot include the op in PyTorch's decomp table, instead we can insert the op into the edge decomp table directly. This PR is a simple change to add `linalg_vector_norm` directly to the edge decomp table.

Test Plan: Tested exporting and running a model with the `linalg_vector_norm` op via the following script.

```
import torch
from executorch.exir import to_edge_transform_and_lower, EdgeCompileConfig
from torch.export import Dim, export
from executorch.extension.pybindings.portable_lib import (  # @Manual
    _load_for_executorch_from_buffer,
)


class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        return torch.linalg.vector_norm(x, 2)


model = Model()
inputs = (torch.randn(1,1,16,16),)
dynamic_shapes = {
    "x": {
        2: Dim("h", min=16, max=1024),
        3: Dim("w", min=16, max=1024),
    }
}

exported_program = export(model, inputs, dynamic_shapes=dynamic_shapes)
executorch_program = to_edge_transform_and_lower(
    exported_program,
    compile_config=EdgeCompileConfig(_check_ir_validity=False),
).to_executorch()
executorch_module = _load_for_executorch_from_buffer(
    executorch_program.buffer
)
model_output = executorch_module.run_method(
    "forward", tuple(inputs)
)
print(model_output)
```

`extension.llm.tokenizer.tokenizer` -> `pytorch_tokenizers.tools.llama2c.convert`

As titled.

pytorch-bot · 2025-04-09T20:52:45Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10011

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2025-04-09T20:53:32Z

This PR needs a `release notes:` label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

tom-arm and others added 30 commits March 27, 2025 16:19

Arm backend: Add MobileNet v3 testcase (#9223)

8ee280b

* Add missing inplace operators to quantization annotator. * Add atol, rtol & qtol to test_pipeline classes. Co-authored-by: Oscar Andersson <[email protected]> Signed-off-by: Tom Allsop <[email protected]>

set -x in build_apple_frameworks.sh (#9326)

01f5599

This causes commands to be echoed as they are executed, making it easier to track progress in CI.

Move ExecutorchRuntimeBridge into fb dir.

a4925e4

Differential Revision: D71952366 Pull Request resolved: #9704

convert to .md and clean up content on main docs page (#9663)

879b94f

going from .rst to md for the main docs page and cleaned up the content for easier use --------- Co-authored-by: Svetlana Karslioglu <[email protected]>

Support alpha in scalar add/sub cases

895efcb

Differential Revision: D71949902 Pull Request resolved: #9703

Arm backend: Clean up shift support (#9573)

b0c0fda

- Handle lshift.Tensor and rshift.Tensor - Convert *.Scalar to *.Tensor - Test Scalar and Tensor cases with multiple dtypes - Move cast logic from node visitor to pass Signed-off-by: Erik Lundell <[email protected]>

Revert "Depend on extension/threadpool, not thread_parallel_interface…

5531a0e

…, in buck (#9511)" (#9727) This reverts commit b89328a. We will need to reapply this with internal build fixes.

Add mixed dtype check for XNNPACK partitioner (#9612)

ec1cd04

### Summary Fixes #9023 Prevents the partitioner from handling ops with mixed dtypes. ### Test plan Unable to directly test due to auto-casting of dtypes and existing dtype checks in verifier.py.

Add a convenient constructor

8b948e8

Differential Revision: D71956172 Pull Request resolved: #9707

Pin cmake version < 4.0.0 (#9732)

399a255

### Summary So that we can still build third-party projects. ### Test plan CI

Improve android related docs

2d01dfc

For android demo, use links in pytorch-labs/executorch-examples Remove references to old demo app Add video

Fix CoreML pybinding module

f174d55

Differential Revision: D71988950 Pull Request resolved: #9724

Deprioritize top level -O2 in CMAKE_CXX_FLAGS_RELEASE (#9394)

65ebabb

Prepending rather than appending to CMAKE_CXX_FLAGS_RELEASE makes it possible to specify another optimization level earlier in the build process and still have it take precedence over the -O2.

[Release 0.6] update version.txt (#9744)

6bc3e34

Add bf16 support to unary_ufunc_realh

eee2bf1

Differential Revision: D71839099 Pull Request resolved: #9599

Remove old tokenizer/ directory in ExecuTorch

eda319e

Differential Revision: D72007597 Pull Request resolved: #9728

Update mimi export test

2aa7748

Differential Revision: D72091091 Pull Request resolved: #9755

Introduce missing APIs to lower ExportedProgram objects directly

5da974a

Differential Revision: D70529392 Pull Request resolved: #8909

Qualcomm AI Engine Direct - Replace private variable with function (#…

eef0010

…8724) Summary: - To improve readability, replace is_bert_ with is_bert()

Arm backend: Check memory allocation on target (#9735)

69cc7fa

This fixes a problem where the code continued running after a failed allocation, and syncs the example to better match examples/devtools/example_runner/example_runner.cpp. Signed-off-by: Zingo Andersen <[email protected]>

Arm backend: Refactor any, bitwise, logical tests (#9499)

bad2fa9

- Rename bitwise and logical tests with full aten op name - Refactor the tests with test_pipeline and new Xfail decorator - Fix the naming error in test_any Signed-off-by: Yufeng Shi <[email protected]>

kirklandsign and others added 6 commits April 8, 2025 19:20

Just build AAR in place

f28b5db

Differential Revision: D72679075 Pull Request resolved: #9985

Add maven version in main in Getting Started page (#9980)

9b7a878

[doc] Fix tokenizer related documentation (#10000)

3a940da

`extension.llm.tokenizer.tokenizer` -> `pytorch_tokenizers.tools.llama2c.convert`

Fix tokenizer convert in xnnpack_README.md (#10003)

20c7047

As titled.

Update backends-xnnpack.md

3f04e80

metascroy requested a review from mergennachin as a code owner April 9, 2025 20:52

metascroy requested a review from GregoryComer April 9, 2025 20:52

facebook-github-bot added the CLA Signed label Apr 9, 2025

Update backends-xnnpack.md

0a3416f

metascroy changed the base branch from main to release/0.6 April 9, 2025 21:07

metascroy requested review from Gasoonjia, JacobSzwejbka, SS-JIA, cccclai, digantdesai, iseeyuan, jackzhxng, kimishpatel, kirklandsign, larryliu0820, lucylq, manuelcandales, mcr229, shoumikhin, swolchok and tarun292 as code owners April 9, 2025 21:07

metascroy closed this Apr 9, 2025


20 participants
