Commit ff15175

Merge branch 'main' into export-D73381542
2 parents a43b47a + 7c150d4 commit ff15175

24 files changed: +257 -406 lines changed

.github/workflows/doc-build.yml

Lines changed: 2 additions & 2 deletions
@@ -21,12 +21,12 @@ jobs:
       - name: Check URLs
         run: bash ./scripts/check_urls.sh

-  check-links:
+  check-xrefs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v3
       - name: Check Links
-        run: bash ./scripts/check_links.sh
+        run: bash ./scripts/check_xrefs.sh

   build:
     uses: pytorch/test-infra/.github/workflows/linux_job_v2.yml@main

backends/apple/mps/mps_preprocess.py

Lines changed: 15 additions & 1 deletion
@@ -6,6 +6,7 @@
 from typing import ClassVar, Dict, final, List, Tuple

 import torch
+from executorch import exir

 from executorch.backends.apple.mps.operators.node_visitor import (
     get_node_visitors,
@@ -35,6 +36,7 @@

 from executorch.exir.passes.memory_format_ops_pass import DimOrderOpsRevertPass
 from executorch.exir.program._program import _transform
+from executorch.exir.verification.verifier import EXIREdgeDialectVerifier
 from torch.export.exported_program import ExportedProgram

 FORMAT = "[%(levelname)s %(asctime)s %(filename)s:%(lineno)s] %(message)s"
@@ -87,7 +89,19 @@ def preprocess(
         # the `output_ids` array in the schema.

         # TODO: Remove this once we have a better support for the dim-order ops.
-        edge_program = _transform(edge_program, DimOrderOpsRevertPass())
+        # Need to override the verifier to skip the non dim-order ops from tripping the default verifier.
+        edge_program = _transform(
+            edge_program,
+            DimOrderOpsRevertPass(),
+            override_verifiers=[
+                EXIREdgeDialectVerifier(
+                    edge_compile_config=exir.EdgeCompileConfig(
+                        _check_ir_validity=False,  # Disable the edge dialect verifier, since we are in the mps backend.
+                    ),
+                    class_only=True,
+                )
+            ],
+        )

         mps_graph = MPSGraph(
             version="0",

backends/openvino/README.md

Lines changed: 17 additions & 3 deletions
@@ -40,7 +40,9 @@ executorch

 ### Prerequisites

-Before you begin, ensure you have openvino installed and configured on your system:
+Before you begin, ensure you have openvino installed and configured on your system.
+
+### Build OpenVINO from Source

 ```bash
 git clone https://github.com/openvinotoolkit/openvino.git
@@ -56,7 +58,19 @@ cmake --install build --prefix <your_preferred_install_location>
 cd <your_preferred_install_location>
 source setupvars.sh
 ```
-Note: The OpenVINO backend is not yet supported with the current OpenVINO release packages. It is recommended to build from source. The instructions for using OpenVINO release packages will be added soon.
+
+### Use OpenVINO from Release Packages
+
+1. Download the OpenVINO release package from [here](https://docs.openvino.ai/2025/get-started/install-openvino.html). Make sure to select your configuration and click on **OpenVINO Archives** under the distribution section to download the appropriate archive for your platform.
+
+2. Extract the release package from the archive and set the environment variables.
+
+```bash
+tar -zxf openvino_toolkit_<your_release_configuration>.tgz
+cd openvino_toolkit_<your_release_configuration>
+source setupvars.sh
+```
+
 For more information about OpenVINO build, refer to the [OpenVINO Build Instructions](https://github.com/openvinotoolkit/openvino/blob/master/docs/dev/build_linux.md).

 ### Setup
@@ -78,7 +92,7 @@ Follow the steps below to setup your build environment:
 ```bash
 ./openvino_build.sh
 ```
-**Build OpenVINO Backend Python Package with Pybindings**: To build and install the OpenVINO backend Python package with Python bindings, run the `openvino_build.sh` script with the `--enable_python` argument. This will compile and install the ExecuTorch Python package with the OpenVINO backend into your Python environment. This option will also enable python bindings required to execute OpenVINO backend tests and `export_and_infer_openvino.py` script inside `executorch/examples/openvino` folder.
+**Build OpenVINO Backend Python Package with Pybindings**: To build and install the OpenVINO backend Python package with Python bindings, run the `openvino_build.sh` script with the `--enable_python` argument. This will compile and install the ExecuTorch Python package with the OpenVINO backend into your Python environment. This option will also enable python bindings required to execute OpenVINO backend tests and `aot_optimize_and_infer.py` script inside `executorch/examples/openvino` folder.

 ```bash
 ./openvino_build.sh --enable_python

docs/source/build-run-openvino.md

Lines changed: 2 additions & 2 deletions
@@ -11,8 +11,8 @@ In this tutorial we will walk you through the process of setting up the prerequi
 :::{grid-item-card} Tutorials we recommend you complete before this:
 :class-card: card-prerequisites
 * [Introduction to ExecuTorch](intro-how-it-works.md)
-* [Setting up ExecuTorch](getting-started-setup.md)
-* [Building ExecuTorch with CMake](runtime-build-and-cross-compilation.md)
+* [Setting up ExecuTorch](getting-started.md)
+* [Building ExecuTorch with CMake](using-executorch-building-from-source.md)
 :::
 ::::


examples/demo-apps/android/LlamaDemo/README.md

Lines changed: 2 additions & 2 deletions
@@ -135,8 +135,8 @@ Ensure you have the following functions in your callback class that you provided
   }

   @Override
-  public void onStats(float tps) {
-    //...tps (tokens per second) stats is provided by framework
+  public void onStats(String stats) {
+    //... will be a json. See extension/llm/stats.h for the field definitions
   }

 ```

examples/models/llama/export_llama_lib.py

Lines changed: 13 additions & 1 deletion
@@ -1227,10 +1227,22 @@ def _get_source_transforms( # noqa
     if args.expand_rope_table:
         transforms.append(materialze_broadcast_of_rope_freq_cis)

+    use_attention_mask_for_custom_sdpa = False
+    if isinstance(args, argparse.Namespace):
+        if getattr(args, "use_custom_sdpa_with_attention_mask", None):
+            use_attention_mask_for_custom_sdpa = True
+
     if args.use_sdpa_with_kv_cache:
         transforms.append(replace_kv_cache_with_custom_kv_cache)
         # todo: do this optionally
-        transforms.append(replace_sdpa_with_custom_op)
+        # if use attention mask instead of causal attention
+        # then create partial function that sets use_attention_mask=True
+        if use_attention_mask_for_custom_sdpa:
+            transforms.append(
+                partial(replace_sdpa_with_custom_op, use_attention_mask=True)
+            )
+        else:
+            transforms.append(replace_sdpa_with_custom_op)

     if args.quantize_kv_cache:
         assert args.use_kv_cache, "quantize_kv_cache requires use_kv_cache=True"
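
The hunk above binds `use_attention_mask=True` with `functools.partial` so that the entry in the transform list can still be called with a single module argument. A minimal sketch of that pattern follows; `replace_sdpa_stub`, `build_transforms`, and `apply_transforms` are hypothetical stand-ins written for illustration and are not part of ExecuTorch, and a plain `dict` plays the role of the module.

```python
from functools import partial


def replace_sdpa_stub(module: dict, use_attention_mask: bool = False) -> dict:
    """Hypothetical stand-in for replace_sdpa_with_custom_op: records which variant was chosen."""
    updated = dict(module)
    updated["sdpa"] = "custom_mask" if use_attention_mask else "custom_causal"
    return updated


def build_transforms(use_attention_mask: bool) -> list:
    """Return single-argument callables, mirroring the if/else branch in the hunk."""
    if use_attention_mask:
        # partial() pre-binds the keyword, so callers still pass only the module.
        return [partial(replace_sdpa_stub, use_attention_mask=True)]
    return [replace_sdpa_stub]


def apply_transforms(module: dict, transforms: list) -> dict:
    """Apply each transform in order, feeding the result into the next one."""
    for transform in transforms:
        module = transform(module)
    return module


masked = apply_transforms({"sdpa": "default"}, build_transforms(True))
causal = apply_transforms({"sdpa": "default"}, build_transforms(False))
```

Because every entry keeps the same one-argument calling convention, the code that walks the transform list does not need to know which options were bound.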

examples/models/llama/source_transformation/sdpa.py

Lines changed: 50 additions & 14 deletions
@@ -22,9 +22,15 @@ class SDPACustom(torch.nn.Module):
     def __init__(
         self,
         dim: int,
+        max_context_len,
+        enable_dynamic_shape,
+        use_attention_mask: bool = False,
     ):
         super().__init__()
         self.dim = dim
+        self.max_context_len = max_context_len
+        self.use_attention_mask = use_attention_mask
+        self.enable_dynamic_shape = enable_dynamic_shape

     def forward(
         self,
@@ -36,6 +42,16 @@ def forward(
         seqlen,
         mask,
     ):
+        if self.use_attention_mask:
+            if self.enable_dynamic_shape:
+                start_pos = input_pos[-1].item()
+                torch._check_is_size(start_pos)
+                torch._check(start_pos < self.max_context_len)
+                seq_length = q.size(2)
+                mask = mask.narrow(0, start_pos, seq_length)
+            else:
+                mask = mask[input_pos]
+
         q = q.transpose(1, 2)  # (bs, seqlen, n_local_heads, head_dim)
         k = k.transpose(1, 2)
         v = v.transpose(1, 2)
@@ -47,34 +63,54 @@ def forward(
         k = k.to(dtype=torch.float)
         v = v.to(dtype=torch.float)

-        output = torch.ops.llama.custom_sdpa(
-            q,
-            k,
-            v,
-            input_pos[0].item(),
-            None,  # Attention mask
-            0,  # dropout probability. Ignored by the code
-            True,  # is_causal
-        )
+        if self.use_attention_mask:
+            output = torch.ops.llama.custom_sdpa(
+                q,
+                k,
+                v,
+                input_pos[0].item(),
+                mask,  # Attention mask
+                0,  # dropout probability. Ignored by the code
+                False,  # is_causal
+            )
+        else:
+            output = torch.ops.llama.custom_sdpa(
+                q,
+                k,
+                v,
+                input_pos[0].item(),
+                None,  # Attention mask
+                0,  # dropout probability. Ignored by the code
+                True,  # is_causal
+            )
         return output.view(bsz, seqlen, self.dim).to(dtype=input_dtype)


-def _replace_sdpa_with_custom_op(module: torch.nn.Module):
+def _replace_sdpa_with_custom_op(
+    module: torch.nn.Module, use_attention_mask: bool = False
+):
     for name, child in module.named_children():
         if isinstance(child, SDPA):
             setattr(
                 module,
                 name,
-                SDPACustom(child.dim),
+                SDPACustom(
+                    child.dim,
+                    child.max_context_len,
+                    child.enable_dynamic_shape,
+                    use_attention_mask=use_attention_mask,
+                ),
             )
         else:
-            _replace_sdpa_with_custom_op(child)
+            _replace_sdpa_with_custom_op(child, use_attention_mask=use_attention_mask)


-def replace_sdpa_with_custom_op(module: torch.nn.Module) -> torch.nn.Module:
+def replace_sdpa_with_custom_op(
+    module: torch.nn.Module, use_attention_mask: bool = False
+) -> torch.nn.Module:
     from executorch.extension.llm.custom_ops import custom_ops  # noqa

-    _replace_sdpa_with_custom_op(module, use_attention_mask=use_attention_mask)
     return module


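
In the dynamic-shape branch of `forward`, `mask.narrow(0, start_pos, seq_length)` keeps only the mask rows for the positions currently being processed. The sketch below imitates that row selection with nested Python lists so it runs without PyTorch; `causal_mask` and `narrow_rows` are hypothetical helpers, and the lower-triangular boolean mask is an assumption made for illustration, not the mask layout the model actually builds.

```python
def causal_mask(max_context_len: int) -> list[list[bool]]:
    """Row i is True at columns 0..i: position i may attend to itself and earlier positions."""
    return [
        [col <= row for col in range(max_context_len)]
        for row in range(max_context_len)
    ]


def narrow_rows(mask: list[list[bool]], start_pos: int, seq_length: int) -> list[list[bool]]:
    """Plain-Python analogue of mask.narrow(0, start_pos, seq_length)."""
    if start_pos < 0 or start_pos + seq_length > len(mask):
        raise ValueError("requested rows fall outside the mask")
    return mask[start_pos : start_pos + seq_length]


full = causal_mask(6)
# Keep the rows for three query positions starting at position 2.
window = narrow_rows(full, start_pos=2, seq_length=3)
```

The result has `seq_length` rows and still spans the full context width, so each query row carries its own visibility pattern over all cached positions.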

examples/models/llama/source_transformation/test_sdpa_with_quantized_kv_cache.py

Lines changed: 2 additions & 2 deletions
@@ -71,8 +71,8 @@ def test_simple(self, is_dynamic_shape=False):
         self.seq_len = 3
         self._init_cache()
         q, k_val, v_val = self._init_kv()
-        self.float_sdpa = SDPACustom(self.dim)
-        self.quantized_sdpa = SDPACustom(self.dim)
+        self.float_sdpa = SDPACustom(self.dim, self.max_context_len, True)
+        self.quantized_sdpa = SDPACustom(self.dim, self.max_context_len, True)
         k, v = self.custom_kv_cache.update(input_pos, k_val, v_val)
         float_out = self.float_sdpa(input_pos, q, k, v, 1, self.seq_len, None)
         k, v = self.quantized_kv_cache.update(input_pos, k_val, v_val)

examples/qualcomm/scripts/mobilebert_fine_tune.py

Lines changed: 1 addition & 3 deletions
@@ -102,9 +102,7 @@ def get_fine_tuned_mobilebert(artifacts_dir, pretrained_weight, batch_size):
     from transformers import get_linear_schedule_with_warmup

     # grab dataset
-    url = (
-        "https://raw.githubusercontent.com/susanli2016/NLP-with-Python/master/data/title_conference.csv"
-    )
+    url = "https://raw.githubusercontent.com/susanli2016/NLP-with-Python/master/data/title_conference.csv"
     content = requests.get(url, allow_redirects=True).content
     data = pd.read_csv(BytesIO(content))


exir/program/_program.py

Lines changed: 27 additions & 2 deletions
@@ -212,7 +212,30 @@ def _get_updated_graph_signature(
     return new_signature


-def _transform(self, *passes: PassType) -> "ExportedProgram":
+def _transform(
+    self,
+    *passes: PassType,
+    override_verifiers: None | list[Type[Verifier]] = None,
+) -> "ExportedProgram":
+    """
+    Transforms the program according to the provided passes.
+
+    Args:
+        self: The ExportedProgram instance to transform
+        *passes: A sequence of passes to apply to the program
+        override_verifiers: Optional list of verifier classes to use instead of the default verifiers.
+            This is needed if the transforms yields illegal graph that the default verifier cannot handle.
+
+    Returns:
+        ExportedProgram: A new ExportedProgram with the transformations applied, or self if no changes were made
+    """
+    # A user friendly check to avoid vararg surprises, PEP 3102
+    assert not any(
+        isinstance(p, (list, Verifier)) for p in passes
+    ), f"Expected all passes to be of PassType, not list or Verifier. Use override_verifiers kwarg instead. Got: {list(passes)}"
+
+    for p in list(passes):
+        print(type(p))
     pm = PassManager(list(passes))
     res = pm(self.graph_module)
     transformed_gm = res.graph_module if res is not None else self.graph_module
@@ -221,7 +244,9 @@ def _transform(self, *passes: PassType) -> "ExportedProgram":
     if transformed_gm is self.graph_module and not res.modified:
         return self

-    return _update_exported_program_graph_module(self, transformed_gm)
+    return _update_exported_program_graph_module(
+        self, transformed_gm, override_verifiers
+    )


 def _update_exported_program_graph_module(
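
The new `_transform` signature places `override_verifiers` after `*passes`, which makes it keyword-only (PEP 3102), and the added assertion catches callers who pass a list or verifier positionally by mistake. The sketch below reproduces that guard in isolation; `Verifier`, `transform`, and `upper_pass` here are hypothetical placeholders written for illustration, with a string standing in for the program, and none of them are the ExecuTorch types.

```python
class Verifier:
    """Hypothetical placeholder for a verifier type."""


def transform(program: str, *passes, override_verifiers=None):
    """Apply each pass in order; override_verifiers can only be given by keyword."""
    # Same idea as the guard in the hunk: anything that looks like a verifier
    # (or a list of them) inside *passes was almost certainly meant for the keyword.
    assert not any(isinstance(p, (list, Verifier)) for p in passes), (
        "Expected only passes. Use the override_verifiers keyword instead. "
        f"Got: {list(passes)}"
    )
    for p in passes:
        program = p(program)
    return program, override_verifiers


def upper_pass(program: str) -> str:
    """Toy pass that upper-cases the program text."""
    return program.upper()


# Correct call: the verifier list is passed by keyword.
result, verifiers = transform("graph", upper_pass, override_verifiers=[Verifier()])

# Mistaken call: the list lands in *passes and the guard rejects it.
try:
    transform("graph", upper_pass, [Verifier()])
    misuse_rejected = False
except AssertionError:
    misuse_rejected = True
```

Without the guard, the stray list would be collected into `passes` and only fail later, when something tried to run it as a pass. Note that `assert` statements are skipped under `python -O`, so a check like this is a development aid rather than a hard validation.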
