Update backends-xnnpack.md #10011
* Add missing inplace operators to quantization annotator.
* Add atol, rtol & qtol to test_pipeline classes.

Co-authored-by: Oscar Andersson <[email protected]>
Signed-off-by: Tom Allsop <[email protected]>
This causes commands to be echoed as they are executed, making it easier to track progress in CI.
Differential Revision: D71952366 Pull Request resolved: #9704
## Context

There was a bug in the SPIR-V cache: once a shader variant is processed, its source GLSL is saved to the cache. If another variant that uses the same source GLSL then checks the cache, it concludes that the source file is unchanged and therefore does not recompile. To fix this, defer saving the source GLSL files until all shaders have been compiled.

Differential Revision: [D71973524](https://our.internmc.facebook.com/intern/diff/D71973524/)
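The deferred-save idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `compile_variants`, `source_digest`, the `save_eagerly` switch, and the variant tuples are hypothetical names, not the actual shader build script. The sketch shows why recording the source in the cache right after the first variant causes a second variant of the same source to be skipped.

```python
import hashlib


def source_digest(glsl_source: str) -> str:
    """Fingerprint of a GLSL source file's contents."""
    return hashlib.sha256(glsl_source.encode("utf-8")).hexdigest()


def compile_variants(variants, source_cache, save_eagerly=False):
    """Return the names of the variants that get (re)compiled.

    variants: list of (variant_name, source_path, glsl_source) tuples.
    source_cache: dict mapping source_path -> digest; updated in place.
    save_eagerly=True reproduces the bug; False defers the cache update.
    All names here are hypothetical and used only for illustration.
    """
    compiled = []
    pending = {}
    for name, path, source in variants:
        digest = source_digest(source)
        if source_cache.get(path) == digest:
            continue  # cache claims the source is unchanged, so skip
        compiled.append(name)  # stand-in for invoking the shader compiler
        if save_eagerly:
            source_cache[path] = digest  # buggy: hides later variants
        else:
            pending[path] = digest  # fixed: remember, but do not save yet
    source_cache.update(pending)  # save only after every variant was visited
    return compiled


# Two variants generated from the same source file.
variants = [
    ("add_float", "binary_op.glsl", "void main() {}"),
    ("add_half", "binary_op.glsl", "void main() {}"),
]
buggy = compile_variants(variants, {}, save_eagerly=True)
fixed_cache = {}
fixed = compile_variants(variants, fixed_cache)
second_run = compile_variants(variants, fixed_cache)
```

With eager saving only the first variant is compiled; with the deferred update both are compiled, and a second run against the populated cache compiles nothing.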
…tion. (#9721)

This diff adds support for the repeat operation in the `add_copy_packed_dim_offset_node` function in the Vulkan backend for ExecuTorch. The function now takes an additional boolean parameter, "repeat", which indicates whether the copy should wrap around the tensor dimension.

The `copy_packed_dim_offset` shader now has two functions, `repeat_copy` and `no_repeat_copy`, which are selected by a specialization constant. `no_repeat_copy` contains the legacy copy code. `repeat_copy` reads the input tensor's dim based on the output pos and wraps it according to the WHCB repetitions.

When the repeat function is called, the push constants `src_offset` and `dst_offset` contain the source and destination WHCB dimensions respectively, not copy offsets.

Differential Revision: [D71477552](https://our.internmc.facebook.com/intern/diff/D71477552/)
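As a rough CPU-side model of the wrap-around read described for `repeat_copy`, the sketch below maps each output position back to a source position by taking it modulo the source size along every dimension. It is not the GLSL shader; the Python names (`wrapped_source_pos`, `repeat_copy_reference`) and the dictionary-based tensor are assumptions made for illustration.

```python
from itertools import product


def wrapped_source_pos(dst_pos, src_sizes):
    """Wrap an output position into the source extent, one dim at a time."""
    return tuple(p % s for p, s in zip(dst_pos, src_sizes))


def repeat_copy_reference(src, src_sizes, dst_sizes):
    """Fill every output position by reading the wrapped source position.

    src maps index tuples to values. This is a hypothetical reference
    model of the behaviour, not the Vulkan implementation.
    """
    dst = {}
    for dst_pos in product(*(range(n) for n in dst_sizes)):
        dst[dst_pos] = src[wrapped_source_pos(dst_pos, src_sizes)]
    return dst


# A 2x1 source repeated twice along each dimension gives a 4x2 output.
src_sizes = (2, 1)
src = {(0, 0): "a", (1, 0): "b"}
dst = repeat_copy_reference(src, src_sizes, (4, 2))
```

Passing the source and destination sizes, rather than offsets, is what lets each output position be wrapped independently.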
going from .rst to md for the main docs page and cleaned up the content for easier use --------- Co-authored-by: Svetlana Karslioglu <[email protected]>
Differential Revision: D71949902 Pull Request resolved: #9703
- Handle lshift.Tensor and rshift.Tensor
- Convert *.Scalar to *.Tensor
- Test Scalar and Tensor cases with multiple dtypes
- Move cast logic from node visitor to pass

Signed-off-by: Erik Lundell <[email protected]>
### Summary
Fixes #9023
Prevents the partitioner from handling ops with mixed dtypes.
### Test plan
Unable to directly test due to auto-casting of dtypes and existing dtype checks in verifier.py.
Differential Revision: D71956172 Pull Request resolved: #9707
### Summary
So that we can still build third-party projects.
### Test plan
CI
For android demo, use links in pytorch-labs/executorch-examples
Remove references to old demo app
Add video
### Summary
This change makes the re-download of ethos-u dependencies optional during cmake configuration when building the executorch_runner. If the user opts out, it is assumed that the dependencies are already downloaded.
Differential Revision: D71988950 Pull Request resolved: #9724
Implement eq.Scalar by converting to eq.Tensor using replace_scalar_with_tensor_pass and match_arg_ranks_pass.
* Convert eq.Tensor to eq.Scalar
* Expand test_eq to test both eq.Tensor and eq.Scalar

Signed-off-by: Fang-Ching <[email protected]>
Prepending to CMAKE_CXX_FLAGS_RELEASE, rather than appending, allows another optimization level specified earlier in the build process to still take precedence over the -O2.
…Us, NPUs (#8573)

### Summary
This PR introduces support for the OpenVINO backend in ExecuTorch, enabling accelerated inference on Intel hardware, including CPU, GPU, and NPU devices. OpenVINO optimizes deep learning model performance by leveraging hardware-specific enhancements. The PR also introduces the OpenVINO quantizer with NNCF (Neural Network Compression Framework) for model optimization. The functionality has been tested on several torchvision and timm models, with plans to test and enable support for additional model types in the future.

Below is a description of the features:
- OpenVINO Backend Integration: The backends/openvino directory includes build scripts, AOT components (partitioner, preprocessor), the OpenVINO Quantizer, and runtime backend files that register the OpenVINO backend and manage OpenVINO’s inference engine interactions, including model execution, device-specific optimizations, and backend initialization. It also contains tests for layers and models. See backends/openvino/README.md for usage.
- OpenVINO Examples: The examples/openvino directory provides scripts for AOT optimization, quantization, and C++ executor examples. It includes instructions for optimizing the models, quantizing them, and exporting ExecuTorch programs with OpenVINO optimizations. Refer to examples/openvino/README.md for details.
- E2E Tutorial: Added an end-to-end tutorial in docs/source/build-run-openvino.md.

### Test plan
This PR is tested with the OpenVINO backend on Intel Core Ultra 7 processors for CPU, GPU, and NPU devices. To run the layer tests and model tests, please refer to backends/openvino/tests/README.md

cc: @yury-gorbachev @alexsu52 @cavusmustafa @daniil-lyakhov @suryasidd @AlexKoff88 @MaximProshin @AlexanderDokuchaev

---------

Co-authored-by: Cavus Mustafa <[email protected]>
Co-authored-by: Aleksandr Suslov <[email protected]>
Co-authored-by: dlyakhov <[email protected]>
Co-authored-by: Kimish Patel <[email protected]>
Co-authored-by: suryasidd <[email protected]>
…9751) We have a lot of `ET_CHECK_OR_RETURN_FALSE` that log a condition, but not the values of the variables in that condition. This is an attempt to improve debuggability of these errors. cc @larryliu0820 @manuelcandales
### Summary
There is a bug when there is a constant_pad between two convolutions. In
order to minimize permutes associated with memory format changes, we
sometimes compute ops in NHWC. This is the case for ConstantPad when it
is between two convs:
```
a = conv(a)
a = constant_pad(a, paddings=[1, 2, 3, 4])
a = conv(a)
```
In this case, we need to make sure the paddings given to constant_pad are
also permuted to NHWC.
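A small sketch of what permuting the paddings means, assuming the flat list follows the `torch.nn.functional.pad` convention (pairs of before/after amounts, starting from the last dimension). The helper names and the `NCHW_TO_NHWC` constant are hypothetical; this is not the XNNPACK delegate code.

```python
# Position i of the NHWC layout holds NCHW dimension NCHW_TO_NHWC[i].
NCHW_TO_NHWC = (0, 2, 3, 1)


def per_dim_padding(paddings, rank):
    """Expand a flat, last-dim-first padding list into per-dim pairs.

    Returns one (before, after) pair per dimension, in dimension order.
    Assumes the torch.nn.functional.pad ordering of the flat list.
    """
    if len(paddings) % 2 != 0 or len(paddings) > 2 * rank:
        raise ValueError("paddings must hold at most `rank` (before, after) pairs")
    pairs = [(0, 0)] * rank
    for i in range(len(paddings) // 2):
        dim = rank - 1 - i  # the first pair applies to the last dimension
        pairs[dim] = (paddings[2 * i], paddings[2 * i + 1])
    return pairs


def permute_padding_to_nhwc(paddings):
    """Per-dim padding pairs reordered to match an NHWC tensor."""
    nchw_pairs = per_dim_padding(paddings, rank=4)
    return [nchw_pairs[dim] for dim in NCHW_TO_NHWC]


nchw_pairs = per_dim_padding([1, 2, 3, 4], rank=4)
nhwc_pairs = permute_padding_to_nhwc([1, 2, 3, 4])
```

For `paddings=[1, 2, 3, 4]`, W is padded by (1, 2) and H by (3, 4) in NCHW terms. After the permutation, those pairs sit at the W and H positions of the NHWC layout, and the channel dimension stays unpadded.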
### Test plan
python install_executorch.py --editable
python -m unittest backends.xnnpack.test.ops.test_static_constant_pad.TestStaticConstantPad.test_fp32_static_constant_pad_nhwc
Differential Revision: D71839099 Pull Request resolved: #9599
Differential Revision: D72007597 Pull Request resolved: #9728
Differential Revision: D72091091 Pull Request resolved: #9755
Differential Revision: D70529392 Pull Request resolved: #8909
…8724) Summary: - To improve readability, replace is_bert_ with is_bert()
This fixes a problem where the code continued running after a failed allocation, and syncs the example to better match examples/devtools/example_runner/example_runner.cpp.

Signed-off-by: Zingo Andersen <[email protected]>
- Rename bitwise and logical tests with full aten op name
- Refactor the tests with test_pipeline and new Xfail decorator
- Fix the naming error in test_any

Signed-off-by: Yufeng Shi <[email protected]>
Summary: Now that we have better dim order support, we can pass inputs into models in non-default dim orders, such as channels last. This works in the runtime, but the Python layer currently asserts that input tensors are in the default dim order. This PR relaxes this restriction to allow dense input tensors with alternative dim orders.

Differential Revision: D71716100
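To make "dense with an alternative dim order" concrete, here is a pure-Python sketch that derives a dim order from strides and checks that the layout has no gaps. It is a hypothetical model of the idea, not the check implemented in the Python bindings. The function names are assumptions, and ties between equal strides (for example size-1 dimensions) are resolved by original position.

```python
def dim_order_from_strides(strides):
    """Dimensions sorted from outermost (largest stride) to innermost."""
    return tuple(sorted(range(len(strides)), key=lambda d: -strides[d]))


def is_dense(sizes, strides):
    """True if the strides describe a gap-free layout in some dim order."""
    expected = 1
    # Walk from the innermost dimension outwards, tracking the stride a
    # tightly packed tensor would have at each step.
    for dim in reversed(dim_order_from_strides(strides)):
        if strides[dim] != expected:
            return False
        expected *= sizes[dim]
    return True


sizes = (1, 3, 4, 5)            # N, C, H, W
contiguous = (60, 20, 5, 1)     # default (NCHW) strides
channels_last = (60, 1, 15, 3)  # NHWC strides for the same sizes

default_order = dim_order_from_strides(contiguous)
nhwc_order = dim_order_from_strides(channels_last)
padded_is_dense = is_dense((2, 3), (6, 1))  # rows padded to 6 elements
```

Both the contiguous and the channels-last layouts pass the density check under this model, while the padded layout does not.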
Update PTD serialization to account for blobs from the NamedDataStoreOutput. Something we can do in the future is to consolidate tensors (that go through the emitter) and blobs (that come from the NamedDataStore). Differential Revision: [D70939807](https://our.internmc.facebook.com/intern/diff/D70939807/)
Differential Revision: D72679075 Pull Request resolved: #9985
…le used for Edge export (#9938)

Summary:

## Context

Addresses this [release blocker](https://github.com/orgs/pytorch/projects/99/views/1?pane=issue&itemId=104088363&issue=pytorch%7Cpytorch%7C150207) issue. Some models cannot export because they use `linalg_vector_norm` which is not currently an ATen operator. I initially tried adding the op to the core decomp table, but the decomp is not passing pytorch correctness tests. Please see pytorch/pytorch#150241 for more details.

## Changes

Since we currently cannot include the op in PyTorch's decomp table, instead we can insert the op into the edge decomp table directly. This PR is a simple change to add `linalg_vector_norm` directly to the edge decomp table.

Test Plan: Tested exporting and running a model with the `linalg_vector_norm` op via the following script.

```
import torch

from executorch.exir import to_edge_transform_and_lower, EdgeCompileConfig
from torch.export import Dim, export

from executorch.extension.pybindings.portable_lib import (  # @Manual
    _load_for_executorch_from_buffer,
)

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        return torch.linalg.vector_norm(x, 2)

model = Model()
inputs = (torch.randn(1,1,16,16),)
dynamic_shapes = {
    "x": {
        2: Dim("h", min=16, max=1024),
        3: Dim("w", min=16, max=1024),
    }
}

exported_program = export(model, inputs, dynamic_shapes=dynamic_shapes)
executorch_program = to_edge_transform_and_lower(
    exported_program,
    compile_config=EdgeCompileConfig(_check_ir_validity=False),
).to_executorch()
executorch_module = _load_for_executorch_from_buffer(
    executorch_program.buffer
)
model_output = executorch_module.run_method(
    "forward",
    tuple(inputs)
)
print(model_output)
```
`extension.llm.tokenizer.tokenizer` -> `pytorch_tokenizers.tools.llama2c.convert`
Add import torch to sample code