5 changes: 5 additions & 0 deletions docs/source/extension-module.md
@@ -43,6 +43,11 @@ Creating a `Module` object is a fast operation that does not involve significant
Module module("/path/to/model.pte");
```

For a model with data separated into a PTD file, load them together:
```cpp
Module module("/path/to/model.pte", "/path/to/model.ptd");
```
### Force-Loading a Method
To force-load the `Module` (and thus the underlying ExecuTorch `Program`) at any time, use the `load()` function:
37 changes: 32 additions & 5 deletions docs/source/using-executorch-export.md
@@ -1,6 +1,6 @@
# Model Export and Lowering

The section describes the process of taking a PyTorch model and converting to the runtime format used by ExecuTorch. This process is commonly known as "exporting", as it uses the PyTorch export functionality to convert a PyTorch model into a format suitable for on-device execution. This process yields a .pte file which is optimized for on-device execution using a particular backend.
This section describes the process of taking a PyTorch model and converting it to the runtime format used by ExecuTorch. This process is commonly known as "exporting", as it uses the PyTorch export functionality to convert the model into a format suitable for on-device execution. It yields a .pte file which is optimized for on-device execution using a particular backend. If program-data separation is used, it also yields a corresponding .ptd file containing only the model's weights and constants.

## Prerequisites

@@ -30,7 +30,7 @@ As part of the .pte file creation process, ExecuTorch identifies portions of the

### Available Backends

Commonly used hardware backends are listed below. For mobile, consider using XNNPACK for Android and XNNPACK or Core ML for iOS. To create a .pte file for a specific backend, pass the appropriate partitioner class to `to_edge_transform_and_lower`. See the appropriate backend documentation and the [Export and Lowering](#export-and-lowering) section below for more information.

- [XNNPACK (Mobile CPU)](backends-xnnpack.md)
- [Core ML (iOS)](backends-coreml.md)
@@ -61,7 +61,7 @@ class Model(torch.nn.Module):
torch.nn.AdaptiveAvgPool2d((1,1))
)
self.linear = torch.nn.Linear(16, 10)

def forward(self, x):
y = self.seq(x)
y = torch.flatten(y, 1)
@@ -97,7 +97,7 @@ class Model(torch.nn.Module):
torch.nn.AdaptiveAvgPool2d((1,1))
)
self.linear = torch.nn.Linear(16, 10)

def forward(self, x):
y = self.seq(x)
y = torch.flatten(y, 1)
@@ -125,6 +125,31 @@ with open("model.pte", "wb") as file:

This yields a `model.pte` file which can be run on mobile devices.

To generate a `model.pte`, `model.ptd` pair with the weights inside `model.ptd`, apply the following transform pass to tag the model's constants as external:

```python
from functools import partial

from executorch.exir.passes.external_constants_pass import (
    delegate_external_constants_pass,
)

# gen_tag_fn determines the file name the weights are saved to;
# here, the weights will be saved as "model.ptd".
partial_function = partial(
    delegate_external_constants_pass,
    ep=exported_program,
    gen_tag_fn=lambda x: "model",
)

executorch_program = to_edge_transform_and_lower(
    exported_program,
    transform_passes=[partial_function],
    partitioner=[XnnpackPartitioner()],
).to_executorch()
```

To save the PTD file:
```python
executorch_program.write_tensor_data_to_file(output_directory)
```
The weights will be saved as `model.ptd`, with the file name taken from `gen_tag_fn` in the transform pass.
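
Putting the pieces together, the following is a minimal sketch of the full save step. The `./out` output directory is an arbitrary choice for illustration, and `executorch_program` is assumed to come from the lowering step above:

```python
import os

output_directory = "./out"
os.makedirs(output_directory, exist_ok=True)

# Write the program itself.
with open(os.path.join(output_directory, "model.pte"), "wb") as file:
    executorch_program.write_to_file(file)

# Write the external weights; this produces "model.ptd",
# named by gen_tag_fn in the transform pass.
executorch_program.write_tensor_data_to_file(output_directory)
```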

### Supporting Varying Input Sizes (Dynamic Shapes)

The PyTorch export process uses the example inputs provided to trace through the model and reason about the size and type of tensors at each step. Unless told otherwise, export will assume a fixed input size equal to the example inputs and will use this information to optimize the model.
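
As a sketch of how to opt out of the fixed-size assumption, the example below marks the batch dimension as dynamic with `torch.export.Dim`. It assumes `model` is an instance of the `Model` class defined above; the input name `x` and the bounds are assumptions for illustration:

```python
import torch
from torch.export import Dim, export

# Allow the batch dimension to vary between 1 and 32; all other
# dimensions stay fixed to the sizes of the example inputs.
batch = Dim("batch", min=1, max=32)
example_inputs = (torch.randn(2, 3, 32, 32),)

exported_program = export(
    model,
    example_inputs,
    dynamic_shapes={"x": {0: batch}},
)
```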
@@ -167,6 +192,8 @@ method = program.load_method("forward")
outputs = method.execute([input_tensor])
```

Pybindings currently do not support loading a program with separate data. To run a model with PTE and PTD components, please use the [Extension Module](extension-module.md). There is also an end-to-end demo in [executorch-examples](https://github.com/pytorch-labs/executorch-examples/tree/main/program-data-separation).

For more information, see [Runtime API Reference](executorch-runtime-api-reference.md).

## Advanced Topics
@@ -227,7 +254,7 @@ class EncodeWrapper(torch.nn.Module):
def __init__(self, model):
super().__init__()
self.model = model

def forward(self, *args, **kwargs):
return self.model.encode(*args, **kwargs)
