
Conversation

@agrima1304
Collaborator

@agrima1304 agrima1304 commented Jul 30, 2025

Decomposes elu into other operators (MI case) or a lookup table (BI case).
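For context, ELU is easy to express with primitives the backend already lowers; a minimal Python sketch of the floating-point (MI) decomposition, not the literal pass in this PR, could look like:

```python
import torch

def decomposed_elu(x: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    # ELU(x) = x                     for x > 0
    #        = alpha * (exp(x) - 1)  for x <= 0
    # Written with where/exp/mul/sub, which already have lowerings.
    return torch.where(x > 0, x, alpha * (torch.exp(x) - 1.0))
```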

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218

@agrima1304 agrima1304 requested a review from digantdesai as a code owner July 30, 2025 15:18
@pytorch-bot

pytorch-bot bot commented Jul 30, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12996

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 30, 2025
@agrima1304
Collaborator Author

@pytorchbot label "partner: arm"

@pytorch-bot pytorch-bot bot added the partner: arm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm label Jul 30, 2025
@agrima1304
Collaborator Author

@pytorchbot label "release notes: arm"

@pytorch-bot pytorch-bot bot added the release notes: arm Changes to the ARM backend delegate label Jul 30, 2025
@agrima1304
Collaborator Author

@pytorchbot label "ciflow/trunk"

@pytorch-bot

pytorch-bot bot commented Jul 30, 2025

To add these label(s) (ciflow/trunk) to the PR, please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

Collaborator

@Sebastian-Larsson Sebastian-Larsson left a comment


The CI failure for Cortex-M is unrelated to this patch. Approved.

Comment on lines +17 to +18
It has been set to 2 as the outputs seem to stay the same regardless of what
the value of input_scale is, as long as that value is not 1.
Contributor


I don't understand this.

Collaborator Author


The default value of input_scale is 1.0; however, using the default value resulted in a type error. When passing 1 as an int, it was overridden by the default value (since both values are 1 and are therefore treated as equivalent). So input_scale had to be changed to an int that is not 1.
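A purely hypothetical sketch of the behaviour described above (none of these names come from the actual codebase): an override that compares equal to the default is dropped, so an int 1 never survives, while 2 does.

```python
DEFAULTS = {"input_scale": 1.0}

def build_op_args(**overrides):
    # Hypothetical illustration only: values equal to the default are
    # discarded, so input_scale=1 (int) collapses back to 1.0 (float).
    args = dict(DEFAULTS)
    args.update({k: v for k, v in overrides.items() if v != DEFAULTS.get(k)})
    return args

print(build_op_args(input_scale=1))  # {'input_scale': 1.0} -- the int is lost
print(build_op_args(input_scale=2))  # {'input_scale': 2}   -- the int survives
```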

Contributor

@digantdesai digantdesai left a comment


Thanks!

@zingo
Collaborator

zingo commented Jul 31, 2025

This will unfortunately need a rebase after the TOSA 0.80.1 removal :(

lucylq and others added 16 commits August 26, 2025 14:48
Differential Revision: D80067087

Pull Request resolved: pytorch#13320
Differential Revision: D79828275

Pull Request resolved: pytorch#13202
Add tests for the LSTM module. This is done in the context of
pytorch#12898.
Add tests for avgpooling operators. This is done in the context of
pytorch#12898.
Add tests for maxpooling operators. This is done in the context of
pytorch#12898.
Add tests for adaptive avgpooling operators. This is done in the context
of pytorch#12898.
Add tests for adaptive maxpooling operators. This is done in the context
of pytorch#12898.
Add a tester class implementation for Qualcomm and register the test
flow with the backend tester.

Note that QNN pybindings are planned but not yet functional.
Add some initial CSV report generation, detailing results and parameters
for each individual test. Delegation statistics and such will come next.
I've also added a basic test for the report generation, which I will
expand upon in this stack.

Here's some sample output from running add tests for XNNPACK:
```
Test ID,Test Case,Backend,Flow,Result,Dtype
test_add_dtype_float32_xnnpack,test_add_dtype,xnnpack,xnnpack,Success (Delegated),torch.float32
test_add_dtype_float32_xnnpack_static_int8,test_add_dtype,xnnpack,xnnpack_static_int8,Success (Delegated),torch.float32
test_add_f32_alpha_xnnpack,test_add_f32_alpha,xnnpack,xnnpack,Fail (Quantize),
test_add_f32_alpha_xnnpack_static_int8,test_add_f32_alpha,xnnpack,xnnpack_static_int8,Fail (Quantize),
test_add_f32_bcast_first_xnnpack,test_add_f32_bcast_first,xnnpack,xnnpack,Success (Delegated),
test_add_f32_bcast_first_xnnpack_static_int8,test_add_f32_bcast_first,xnnpack,xnnpack_static_int8,Success (Delegated),
test_add_f32_bcast_second_xnnpack,test_add_f32_bcast_second,xnnpack,xnnpack,Success (Delegated),
test_add_f32_bcast_second_xnnpack_static_int8,test_add_f32_bcast_second,xnnpack,xnnpack_static_int8,Success (Delegated),
test_add_f32_bcast_unary_xnnpack,test_add_f32_bcast_unary,xnnpack,xnnpack,Success (Delegated),
test_add_f32_bcast_unary_xnnpack_static_int8,test_add_f32_bcast_unary,xnnpack,xnnpack_static_int8,Success (Delegated),
```
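A minimal sketch of how such a CSV could be emitted with the standard library; the actual report generator and its column set may differ.

```python
import csv

FIELDS = ["Test ID", "Test Case", "Backend", "Flow", "Result", "Dtype"]

def write_report(path, rows):
    # `rows` is assumed to be a list of dicts keyed by FIELDS, one per test case.
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)

write_report("report.csv", [{
    "Test ID": "test_add_dtype_float32_xnnpack",
    "Test Case": "test_add_dtype",
    "Backend": "xnnpack",
    "Flow": "xnnpack",
    "Result": "Success (Delegated)",
    "Dtype": "torch.float32",
}])
```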
### Summary
Overhaul the "Building from Source" doc page. The primary intent of
these changes is to document CMake presets and the various build options
that we expose. However, I also did a pass over the existing contents of
the file to improve formatting and clarity.

I've re-organized the page to clearly delineate environment setup,
python install, and native build. It should flow better and be easier to
read.

### Test plan
I have built the docs locally to inspect the contents for formatting and
correctness.

Preview page:
https://docs-preview.pytorch.org/pytorch/executorch/13210/using-executorch-building-from-source.html
Live page (for comparison):
https://docs.pytorch.org/executorch/0.7/using-executorch-building-from-source.html
Differential Revision: D79268134

Pull Request resolved: pytorch#13004
Report various error statistics for the test outputs, including SQNR,
mean absolute error (MAE), and L2 norm. These are saved in the detail
report per test case.

As an example, here is the output from Core ML running MobileNet V2
(roughly formatted from csv -> sheets -> markdown):
```
Output 0 Error Max | Output 0 Error MAE | Output 0 Error MSD | Output 0 Error L2 | Output 0 SQNR
0.0005887411535    | 0.0001199183663    | 2.32E-06           | 0.004750485188    | 41.28595734
```
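The metrics above can be computed from the backend output and the eager reference roughly as follows; this is a sketch, not the code used in the tester.

```python
import torch

def error_stats(ref: torch.Tensor, out: torch.Tensor) -> dict:
    err = (out - ref).float()
    signal_power = ref.float().pow(2).sum()
    return {
        "max": err.abs().max().item(),                   # worst-case absolute error
        "mae": err.abs().mean().item(),                  # mean absolute error
        "l2": torch.linalg.vector_norm(err).item(),      # L2 norm of the error
        "sqnr_db": (10 * torch.log10(signal_power / err.pow(2).sum())).item(),
    }
```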
Track and report the time taken to quantize and lower in the backend
test flow. Include this information in the generated report for each
test case.

Example output (from testing add operator):
Test ID | Test Case | Backend | Flow | Result | Quantize Time (s) | Lowering Time (s)
-- | -- | -- | -- | -- | -- | --
test_add_dtype_float32_coreml | test_add_dtype | coreml | coreml | Success (Delegated) | | 0.69
test_add_dtype_float32_coreml_static_int8 | test_add_dtype | coreml | coreml_static_int8 | Success (Delegated) | 8.73 | 0.88
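A minimal sketch of timing the two stages with the standard library; `flow.quantize` and `flow.lower` are assumed names, not the tester's real API.

```python
import time

def timed(fn, *args, **kwargs):
    # Returns the callable's result along with its wall-clock duration in seconds.
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# quantized, quantize_time_s = timed(flow.quantize, exported_program)
# lowered, lowering_time_s = timed(flow.lower, quantized)
```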
### Summary
Turning on the `EXECUTORCH_ENABLE_EVENT_TRACER` option will enable event
tracing in the Wasm module API. The results can be obtained with the
`etdump()` method.

### Test plan
Added two tests depending on whether `EXECUTORCH_ENABLE_EVENT_TRACER` is
turned on or not. Added the `--enable-etdump` option to
`scripts/build_wasm_tests.sh` which turns on the above option.

Added configurations to the `unittest-wasm-bindings` CI test to run with
and without `--enable-etdump`.
Report the total number of delegated and undelegated nodes and a breakdown by
operator count.

Example from CoreML add:
Test ID | Test Case | Backend | Delegated Nodes | Undelegated Nodes | Delegated Ops | Undelegated Ops
-- | -- | -- | -- | -- | -- | --
test_add_dtype_float32_coreml | test_add_dtype | coreml | 1 | 0 | {'aten::add.Tensor': 1} | {}
test_add_dtype_float32_coreml_static_int8 | test_add_dtype | coreml | 7 | 0 | {'aten::add.Tensor': 1, 'quantized_decomposed::dequantize_per_tensor': 3, 'quantized_decomposed::quantize_per_tensor': 3} | {}
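A rough sketch of how such counts could be gathered from a lowered graph; the delegate call target name is an assumption and the real reporting code may work differently.

```python
from collections import Counter

def count_delegation(graph_module):
    delegated_calls, undelegated_ops = Counter(), Counter()
    for node in graph_module.graph.nodes:
        if node.op != "call_function":
            continue
        if "call_delegate" in str(node.target):   # assumed delegate call name
            delegated_calls[str(node.target)] += 1
        else:
            undelegated_ops[str(node.target)] += 1
    return delegated_calls, undelegated_ops
```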
GregoryComer and others added 13 commits August 26, 2025 14:48
Job is failing on trunk. Temporarily disabling while I resolve it.
…#13574)

### Summary
This PR replaces an IR optimization that removes dead code from the
model with an equivalent executorch call (a rough sketch of the idea follows below).

### Test plan
Unit test provided in `backends/nxp/tests/test_removing_dead_code.py`.


cc @digantdesai @JakeStevens @robert-kalmar
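Not the actual executorch call mentioned above, but a minimal torch.fx illustration of the same dead-code-elimination idea:

```python
import torch
from torch import fx

class M(torch.nn.Module):
    def forward(self, x):
        unused = x * 2      # dead: the result is never used
        return x + 1

gm = fx.symbolic_trace(M())
gm.graph.eliminate_dead_code()  # removes the unused multiply
gm.recompile()
print(gm.code)                  # only the add remains
```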
Differential Revision: D80772740

Pull Request resolved: pytorch#13592
Differential Revision: D80957382

Pull Request resolved: pytorch#13659
…ng to the released memory

Differential Revision: D80754181

Pull Request resolved: pytorch#13590
Differential Revision: D80906791

Pull Request resolved: pytorch#13632
Differential Revision: D80914321

Pull Request resolved: pytorch#13633
Differential Revision: D80881025

Pull Request resolved: pytorch#13623
…ch#13630)

Building the example application using cmake is now straightforward
enough to not need any helper scripts.

Additionally, simplify the example + path setup in the executor
runner cmake script and make the default path match the example.

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218

Signed-off-by: Adrian Lundell <[email protected]>
### Summary
Add the MobileNetV2 model as an example and for integration testing.

### Test plan
Support for testing full conversion on this model is included in
`run_aot_example.sh`.

---------

Co-authored-by: Lukas Sztefek <[email protected]>
Signed-off-by: Agrima Khare <[email protected]>

Change-Id: I032414e7454d5e2cada05b788e9eed0f7b2dc97c
