
Commit 793efb5

Update
[ghstack-poisoned]
2 parents: ba4293b + 39fbd68

File tree: 168 files changed (+2524, -869 lines)


CONTRIBUTING.md

Lines changed: 6 additions & 10 deletions
@@ -1,7 +1,6 @@
 Thank you for your interest in contributing to ExecuTorch! We want to make
 it easy to contribute to this project.
 
-
 
 ## Dev Install
 
@@ -91,7 +90,7 @@ executorch
 │ └── <a href="runtime/platform">platform</a> - Layer between architecture specific code and portable C++.
 ├── <a href="schema">schema</a> - ExecuTorch PTE file format flatbuffer schemas.
 ├── <a href="scripts">scripts</a> - Utility scripts for building libs, size management, dependency management, etc.
-├── <a href="shim">shim</a> - Compatibility layer between OSS and Internal builds.
+├── <a href="shim_et">shim_et</a> - Compatibility layer between OSS and Internal builds.
 ├── <a href="test">test</a> - Broad scoped end-to-end tests.
 ├── <a href="third-party">third-party</a> - Third-party dependencies.
 ├── <a href="tools">tools</a> - Tools for building ExecuTorch from source, for different built tools (CMake, Buck).
@@ -192,9 +191,6 @@ in the Github repo.
 
 ## Coding Style
 
-Goal: Encourage standards that make it easier to read, edit, maintain, and debug
-the ExecuTorch code.
-
 ### lintrunner
 
 We use [`lintrunner`](https://pypi.org/project/lintrunner/) to help make sure the
@@ -259,7 +255,7 @@ toolchains, and having access to relatively modern C++ features.
 
 #### C/C++ standard library usage
 
-**Restricted usage of the C++ standard library.**
+**Restricted usage of the C++ standard library**
 
 Rationale: ExecuTorch is intended to be portable to bare-metal systems that lack
 certain features, like dynamic memory, threading, and locking, required by parts
@@ -280,7 +276,7 @@ careful to also manually destroy objects initialized in this way.
 
 #### C++ language features
 
-**Exceptions: Do not use.**
+**Exceptions: Do not use**
 - Rationale: Exceptions are not widely supported on some classes of
   microcontrollers and DSPs, and they can significantly increase binary size.
 
@@ -289,12 +285,12 @@ must work with threading**
 - Rationale: The core runtime must work on systems that do not have threading
   support.
 
-**RTTI, dynamic_cast, and `<typeid>`: Do not use.**
+**RTTI, dynamic_cast, and `<typeid>`: Do not use**
 - Rationale: RTTI adds extra data to every virtual class. ExecuTorch doesn't
   have a strong need for `dynamic_cast` and friends, so it's better to reduce
   the binary size.
 
-**Templates and template metaprogramming: Be careful and avoid if possible.**
+**Templates and template metaprogramming: Be careful and avoid if possible**
 - Rationale: Most templating results in code generation, and is one of the most
   common sources of binary bloat. Some use of templates is fine (e.g. an
   `ArrayRef<T>`, or code that handles multiple `ScalarType` types), but for the
@@ -359,7 +355,7 @@ docs](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/
 for basics.
 
 1. Push your branch to your fork of `pytorch/executorch`. Most people do not
-   have permission to push a branch directoy to the upstream repo.
+   have permission to push a branch directory to the upstream repo.
 1. Create your PR
    - Use the `main` branch as the base.
    - Give the PR a clear and descriptive title. It will become the title of the

README.md

Lines changed: 2 additions & 2 deletions
@@ -49,9 +49,9 @@ Key value propositions of ExecuTorch are:
 ## Getting Started
 To get started you can:
 
-- Visit the [Step by Step Tutorial](https://pytorch.org/executorch/main/index.html) on getting things running locally and deploy a model to a device
+- Visit the [Step by Step Tutorial](https://pytorch.org/executorch/main/index.html) to get things running locally and deploy a model to a device
 - Use this [Colab Notebook](https://pytorch.org/executorch/stable/getting-started-setup.html#quick-setup-colab-jupyter-notebook-prototype) to start playing around right away
-- Jump straight into LLMs use cases by following specific instructions for [Llama](./examples/models/llama/README.md) and [Llava](./examples/models/llava/README.md)
+- Jump straight into LLM use cases by following specific instructions for [Llama](./examples/models/llama/README.md) and [Llava](./examples/models/llava/README.md)
 
 ## Feedback and Engagement
 

backends/arm/test/models/test_llama.py

Lines changed: 27 additions & 1 deletion
@@ -102,7 +102,7 @@ def prepare_model(self):
     def test_llama_tosa_MI(self):
         llama_model, llama_inputs, llama_meta = self.prepare_model()
 
-        if llama_model is None and llama_inputs is None and llama_meta is None:
+        if llama_model is None or llama_inputs is None:
             pytest.skip("Missing model and/or input files")
 
         with torch.no_grad():
@@ -123,3 +123,29 @@ def test_llama_tosa_MI(self):
                 rtol=1.1,  # TODO: MLETORCH-825 decrease tolerance
             )
         )
+
+    @pytest.mark.xfail(reason="KeyError: scalar_tensor_1 (MLETORCH-907)")
+    def test_llama_tosa_BI(self):
+        llama_model, llama_inputs, llama_meta = self.prepare_model()
+
+        if llama_model is None or llama_inputs is None:
+            pytest.skip("Missing model and/or input files")
+
+        with torch.no_grad():
+            (
+                ArmTester(
+                    llama_model,
+                    example_inputs=llama_inputs,
+                    compile_spec=common.get_tosa_compile_spec("TOSA-0.80+BI"),
+                    constant_methods=llama_meta,
+                )
+                .quantize()
+                .export()
+                .to_edge_transform_and_lower()
+                .to_executorch()
+                .run_method_and_compare_outputs(
+                    inputs=llama_inputs,
+                    atol=4.3,
+                    rtol=1.1,  # TODO: Tolerance needs to be updated after MLETORCH-907
+                )
+            )

backends/arm/test/models/test_mobilenet_v3_arm.py

Lines changed: 3 additions & 3 deletions
@@ -46,7 +46,7 @@ def test_mv3_tosa_BI():
         aten_op=[],
         exir_op=[],
         use_to_edge_transform_and_lower=True,
-        atol=0.3,
+        atol=0.5,
         qtol=1,
     )
     pipeline.run()
@@ -63,7 +63,7 @@ def test_mv3_u55_BI():
         exir_ops=[],
         run_on_fvp=True,
         use_to_edge_transform_and_lower=True,
-        atol=0.3,
+        atol=0.5,
         qtol=1,
     )
     pipeline.run()
@@ -80,7 +80,7 @@ def test_mv3_u85_BI():
         exir_ops=[],
         run_on_fvp=True,
         use_to_edge_transform_and_lower=True,
-        atol=0.3,
+        atol=0.5,
         qtol=1,
     )
     pipeline.run()

backends/arm/test/models/test_torch_functions.py

Lines changed: 2 additions & 0 deletions
@@ -101,6 +101,7 @@ def forward(self, *args):
         "Requires dynamic output shape.",
         "topk": "NotImplementedError: No registered serialization name for <class 'torch.return_types.topk'> found",
         "sort": "NotImplementedError: No registered serialization name for <class 'torch.return_types.sort'> found",
+        "norm": "An error occurred when running the 'KeepDimsFalseToSqueezePass' pass after the following passes:",
     },
 )
 def test_torch_fns_MI(test_data):
@@ -129,6 +130,7 @@ def test_torch_fns_MI(test_data):
         "topk": "NotImplementedError: No registered serialization name for <class 'torch.return_types.topk'> found",
         "sort": "NotImplementedError: No registered serialization name for <class 'torch.return_types.sort'> found",
         "t": "MLETORCH-855: Issue with Quantization folding.",
+        "norm": "An error occurred when running the 'KeepDimsFalseToSqueezePass' pass after the following passes:",
     },
     strict=False,
 )

backends/arm/test/ops/test_sigmoid_16bit.py

Lines changed: 6 additions & 4 deletions
@@ -81,7 +81,7 @@ def forward(self, x):
 
 
 @common.parametrize("test_data", test_data_suite)
-@pytest.mark.flaky(reruns=5)
+@pytest.mark.flaky(reruns=32)  # Flaky due to Vela bug: MLBEDSW-10642
 def test_sigmoid_tosa_BI(test_data):
     pipeline = TosaPipelineBI(
         Sigmoid(), (test_data(),), Sigmoid.aten_op, Sigmoid.exir_op
@@ -97,7 +97,7 @@ def test_sigmoid_tosa_BI(test_data):
         "ramp": "AssertionError: Output 0 does not match reference output. MLETORCH-787"
     },
 )
-@pytest.mark.flaky(reruns=5)
+@pytest.mark.flaky(reruns=32)  # Flaky due to Vela bug: MLBEDSW-10642
 def test_sigmoid_add_sigmoid_tosa_BI(test_data):
     pipeline = TosaPipelineBI(
         SigmoidAddSigmoid(), (test_data(),), Sigmoid.aten_op, Sigmoid.exir_op
@@ -110,6 +110,7 @@ def test_sigmoid_add_sigmoid_tosa_BI(test_data):
     "test_data",
     test_data_suite,
 )
+@pytest.mark.flaky(reruns=32)  # Flaky due to Vela bug: MLBEDSW-10642
 def test_sigmoid_tosa_u55(test_data):
     pipeline = OpNotSupportedPipeline(
         Sigmoid(), (test_data(),), "TOSA-0.80+BI+u55", {Sigmoid.exir_op: 1}
@@ -122,6 +123,7 @@ def test_sigmoid_tosa_u55(test_data):
     "test_data",
     test_data_suite,
 )
+@pytest.mark.flaky(reruns=32)  # Flaky due to Vela bug: MLBEDSW-10642
 def test_sigmoid_add_sigmoid_tosa_u55(test_data):
     pipeline = OpNotSupportedPipeline(
         SigmoidAddSigmoid(),
@@ -135,7 +137,7 @@ def test_sigmoid_add_sigmoid_tosa_u55(test_data):
 
 
 @common.parametrize("test_data", test_data_suite)
-@pytest.mark.flaky(reruns=5)
+@pytest.mark.flaky(reruns=32)  # Flaky due to Vela bug: MLBEDSW-10642
 @common.XfailIfNoCorstone320
 def test_sigmoid_tosa_u85(test_data):
     pipeline = EthosU85PipelineBI(
@@ -152,7 +154,7 @@ def test_sigmoid_tosa_u85(test_data):
         "ramp": "AssertionError: Output 0 does not match reference output.",
     },
 )
-@pytest.mark.flaky(reruns=5)
+@pytest.mark.flaky(reruns=32)  # Flaky due to Vela bug: MLBEDSW-10642
 @common.XfailIfNoCorstone320
 def test_sigmoid_add_sigmoid_tosa_u85(test_data):
     pipeline = EthosU85PipelineBI(

backends/arm/test/ops/test_sigmoid_32bit.py

Lines changed: 6 additions & 4 deletions
@@ -97,7 +97,7 @@ def forward(self, x):
 
 
 @common.parametrize("test_data", test_data_suite)
-@pytest.mark.flaky(reruns=5)
+@pytest.mark.flaky(reruns=32)  # Flaky due to Vela bug: MLBEDSW-10642
 def test_sigmoid_tosa_BI(test_data):
     pipeline = TosaPipelineBI(
         Sigmoid(),
@@ -110,7 +110,7 @@ def test_sigmoid_tosa_BI(test_data):
 
 
 @common.parametrize("test_data", test_data_suite)
-@pytest.mark.flaky(reruns=5)
+@pytest.mark.flaky(reruns=32)  # Flaky due to Vela bug: MLBEDSW-10642
 def test_sigmoid_add_sigmoid_tosa_BI(test_data):
     pipeline = TosaPipelineBI(
         SigmoidAddSigmoid(),
@@ -123,6 +123,7 @@ def test_sigmoid_add_sigmoid_tosa_BI(test_data):
 
 
 @common.parametrize("test_data", test_data_suite)
+@pytest.mark.flaky(reruns=32)  # Flaky due to Vela bug: MLBEDSW-10642
 def test_sigmoid_tosa_u55(test_data):
     pipeline = OpNotSupportedPipeline(
         Sigmoid(), (test_data(),), "TOSA-0.80+BI+u55", {Sigmoid.exir_op: 1}
@@ -132,6 +133,7 @@ def test_sigmoid_tosa_u55(test_data):
 
 
 @common.parametrize("test_data", test_data_suite)
+@pytest.mark.flaky(reruns=32)  # Flaky due to Vela bug: MLBEDSW-10642
 def test_sigmoid_add_sigmoid_tosa_u55(test_data):
     pipeline = OpNotSupportedPipeline(
         SigmoidAddSigmoid(),
@@ -145,7 +147,7 @@ def test_sigmoid_add_sigmoid_tosa_u55(test_data):
 
 
 @common.parametrize("test_data", test_data_suite)
-@pytest.mark.flaky(reruns=5)
+@pytest.mark.flaky(reruns=32)  # Flaky due to Vela bug: MLBEDSW-10642
 @common.XfailIfNoCorstone320
 def test_sigmoid_tosa_u85(test_data):
     pipeline = EthosU85PipelineBI(
@@ -162,7 +164,7 @@ def test_sigmoid_tosa_u85(test_data):
         "ramp": "AssertionError: Output 0 does not match reference output.",
     },
 )
-@pytest.mark.flaky(reruns=5)
+@pytest.mark.flaky(reruns=32)  # Flaky due to Vela bug: MLBEDSW-10642
 @common.XfailIfNoCorstone320
 def test_sigmoid_add_sigmoid_tosa_u85(test_data):
     pipeline = EthosU85PipelineBI(

backends/cadence/aot/memory_planning.py

Lines changed: 9 additions & 13 deletions
@@ -12,7 +12,7 @@
 import math
 import typing
 from functools import partial
-from typing import Iterable, List, Optional, Tuple
+from typing import Iterable, List, Optional, Set, Tuple
 
 import torch
 from executorch.backends.cadence.aot.memory_constraints import (
@@ -73,11 +73,11 @@ def collect_specs_from_graph_module(
 # the fastest memory available
 # flake8: noqa 'position_based_greedy_with_hierarchy' is too complex (13)
 def position_based_greedy_with_hierarchy(
-    graph_module: torch.fx.GraphModule,
     alignment: int,
+    specs: Set[TensorSpec],
+    graph_module: torch.fx.GraphModule,
     graph_signature: ExportGraphSignature,
-    alloc_graph_input: bool,
-    alloc_graph_output: bool,
+    extra_padding: int = 0,
     *,
     memory_config: MemoryConfig,
     mem_constraints: MemConstraints,
@@ -119,9 +119,7 @@ def memory_available(spec: TensorSpec) -> bool:
 
     # Iterate over all the specs in sorted order
     for spec in sorted(
-        collect_specs_from_graph_module(
-            graph_module, graph_signature, alloc_graph_input, alloc_graph_output
-        ),
+        specs,
         key=lambda spec: spec.allocated_memory,
         reverse=True,
     ):
@@ -167,11 +165,11 @@ def memory_available(spec: TensorSpec) -> bool:
 
 # Greedy tensor placement with the heuristics from arxiv.org/pdf/2001.03288.pdf
 def greedy_by_size_for_offset_calculation_with_hierarchy(
-    graph_module: torch.fx.GraphModule,
     alignment: int,
+    specs: Set[TensorSpec],
+    graph_module: torch.fx.GraphModule,
     graph_signature: ExportGraphSignature,
-    alloc_graph_input: bool,
-    alloc_graph_output: bool,
+    extra_padding: int = 0,
     *,
     memory_config: MemoryConfig,
     mem_constraints: MemConstraints,
@@ -199,9 +197,7 @@ def greedy_by_size_for_offset_calculation_with_hierarchy(
 
     # Iterate over all the specs in sorted order
     for spec in sorted(
-        collect_specs_from_graph_module(
-            graph_module, graph_signature, alloc_graph_input, alloc_graph_output
-        ),
+        specs,
         key=lambda spec: spec.allocated_memory,
         reverse=True,
     ):
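Note on the change above: both greedy planners now consume a pre-collected `specs: Set[TensorSpec]` (plus an `extra_padding` knob) instead of each planner calling `collect_specs_from_graph_module` itself. For orientation, below is a minimal, self-contained sketch of the greedy-by-size offset heuristic cited from arxiv.org/pdf/2001.03288.pdf: sort tensors largest-first, then place each at the lowest offset that does not collide with an already-placed tensor whose lifetime overlaps. The names (`SimpleSpec`, `lifetimes_overlap`, `greedy_by_size`) are hypothetical; this is not the Cadence implementation, which additionally handles memory hierarchies, alignment, and placement constraints.

from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SimpleSpec:
    size: int                     # bytes required by the tensor
    lifetime: Tuple[int, int]     # [first_use, last_use] node indices
    offset: Optional[int] = None  # assigned start offset in the arena


def lifetimes_overlap(a: SimpleSpec, b: SimpleSpec) -> bool:
    return a.lifetime[0] <= b.lifetime[1] and b.lifetime[0] <= a.lifetime[1]


def greedy_by_size(specs: List[SimpleSpec]) -> int:
    """Place specs largest-first at the lowest non-conflicting offset; return arena size."""
    placed: List[SimpleSpec] = []
    for spec in sorted(specs, key=lambda s: s.size, reverse=True):
        offset = 0
        # Scan already-placed tensors in offset order and bump the candidate past
        # any tensor that is alive at the same time and overlaps it spatially.
        for other in sorted(placed, key=lambda s: s.offset):
            spatial_overlap = not (
                offset + spec.size <= other.offset or other.offset + other.size <= offset
            )
            if lifetimes_overlap(spec, other) and spatial_overlap:
                offset = other.offset + other.size
        spec.offset = offset
        placed.append(spec)
    return max((s.offset + s.size for s in placed), default=0)


# Example: two size-8 tensors with disjoint lifetimes can share the same offset,
# so the arena only needs 8 + 4 = 12 bytes rather than 20.
arena = greedy_by_size(
    [
        SimpleSpec(size=8, lifetime=(0, 2)),
        SimpleSpec(size=8, lifetime=(3, 5)),
        SimpleSpec(size=4, lifetime=(0, 5)),
    ]
)
assert arena == 12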

backends/qualcomm/_passes/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -9,6 +9,7 @@
 from .annotate_unbind import AnnotateUnbind
 from .convert_bmm_to_matmul import ConvertBmmToMatmul
 from .convert_conv1d_to_conv2d import ConvertConv1dToConv2d
+from .convert_upsample_bicubic2d import ConvertUpsampleBicubicWithBilinear
 from .decompose_any import DecomposeAny
 from .decompose_einsum import DecomposeEinsum
 from .decompose_expm1 import DecomposeExpM1
@@ -40,6 +41,7 @@
     ConvertBmmToMatmul,
     ConvertConv1dToConv2d,
     DecomposeAny,
+    ConvertUpsampleBicubicWithBilinear,
     DecomposeEinsum,
     DecomposeExpM1,
     DecomposeLinalgVectorNorm,
backends/qualcomm/_passes/convert_upsample_bicubic2d.py

Lines changed: 27 additions & 0 deletions

@@ -0,0 +1,27 @@
+# Copyright (c) Qualcomm Innovation Center, Inc.
+# All rights reserved
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+from executorch.exir.dialects._ops import ops as exir_ops
+from executorch.exir.pass_base import ExportPass
+
+
+class ConvertUpsampleBicubicWithBilinear(ExportPass):
+    """
+    Qnn does not support bicubic interpolation, so we need to convert it to bilinear.
+    This pass will convert bicubic interpolation to bilinear interpolation.
+    """
+
+    bicubic_op_targets = {
+        exir_ops.edge.aten.upsample_bicubic2d.vec,
+    }
+    upsample_bilinear_op = exir_ops.edge.aten.upsample_bilinear2d.default
+
+    def __init__(self):
+        super(ConvertUpsampleBicubicWithBilinear, self).__init__()
+
+    def call_operator(self, op, args, kwargs, meta):
+        if op not in self.bicubic_op_targets:
+            return super().call_operator(op, args, kwargs, meta)
+        return super().call_operator(self.upsample_bilinear_op, args[:-1], kwargs, meta)
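Since `call_operator` rewrites each matched node as the graph is re-traced, the new pass drops into the usual ExportPass flow once it is registered in `backends/qualcomm/_passes/__init__.py` (see the `__init__.py` diff above). A rough usage sketch follows; it is an assumption-laden illustration, not code from this commit: the variable `edge_graph_module` is a placeholder, and the callable-pass/PassResult interface is the generic ExportPass convention rather than anything this diff adds.

from executorch.backends.qualcomm._passes import ConvertUpsampleBicubicWithBilinear


def convert_bicubic_to_bilinear(edge_graph_module):
    # Assumption: like other ExportPass subclasses, the pass instance can be
    # called on an edge-dialect torch.fx GraphModule and returns a PassResult
    # whose .graph_module holds the rewritten graph, with every
    # upsample_bicubic2d.vec call replaced by upsample_bilinear2d.default.
    result = ConvertUpsampleBicubicWithBilinear()(edge_graph_module)
    return result.graph_module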
