5 changes: 5 additions & 0 deletions docs/source/extension-module.md
@@ -43,6 +43,11 @@ Creating a `Module` object is a fast operation that does not involve significant
Module module("/path/to/model.pte");
```

For a model with data separated into a PTD file, load them together:
```cpp
Module module("/path/to/model.pte", "/path/to/model.ptd");
```
### Force-Loading a Method
To force-load the `Module` (and thus the underlying ExecuTorch `Program`) at any time, use the `load()` function:
37 changes: 32 additions & 5 deletions docs/source/using-executorch-export.md
@@ -1,6 +1,6 @@
# Model Export and Lowering

The section describes the process of taking a PyTorch model and converting to the runtime format used by ExecuTorch. This process is commonly known as "exporting", as it uses the PyTorch export functionality to convert a PyTorch model into a format suitable for on-device execution. This process yields a .pte file which is optimized for on-device execution using a particular backend.
This section describes the process of taking a PyTorch model and converting it to the runtime format used by ExecuTorch. This process is commonly known as "exporting", as it uses the PyTorch export functionality to convert the model into a format suitable for on-device execution. It yields a .pte file which is optimized for on-device execution using a particular backend. If program-data separation is used, it also yields a corresponding .ptd file containing only the model's weights and constants.

## Prerequisites

@@ -30,7 +30,7 @@ As part of the .pte file creation process, ExecuTorch identifies portions of the

### Available Backends

Commonly used hardware backends are listed below. For mobile, consider using XNNPACK for Android and XNNPACK or Core ML for iOS. To create a .pte file for a specific backend, pass the appropriate partitioner class to `to_edge_transform_and_lower`. See the appropriate backend documentation and the [Export and Lowering](#export-and-lowering) section below for more information.

- [XNNPACK (Mobile CPU)](backends-xnnpack.md)
- [Core ML (iOS)](backends-coreml.md)
@@ -61,7 +61,7 @@ class Model(torch.nn.Module):
torch.nn.AdaptiveAvgPool2d((1,1))
)
self.linear = torch.nn.Linear(16, 10)

def forward(self, x):
y = self.seq(x)
y = torch.flatten(y, 1)
@@ -97,7 +97,7 @@ class Model(torch.nn.Module):
torch.nn.AdaptiveAvgPool2d((1,1))
)
self.linear = torch.nn.Linear(16, 10)

def forward(self, x):
y = self.seq(x)
y = torch.flatten(y, 1)
@@ -125,6 +125,31 @@ with open("model.pte", "wb") as file:

This yields a `model.pte` file which can be run on mobile devices.

To generate a `model.pte`, `model.ptd` pair with the weights inside `model.ptd`, apply the following transform pass to tag the model's constants as external:

```python
from functools import partial

from executorch.exir.passes.external_constants_pass import (
    delegate_external_constants_pass,
)

# gen_tag_fn determines the file name the weights are saved to;
# here, the weights will be saved as "model.ptd".
partial_function = partial(
    delegate_external_constants_pass,
    ep=exported_program,
    gen_tag_fn=lambda x: "model",
)

executorch_program = to_edge_transform_and_lower(
    exported_program,
    transform_passes=[partial_function],
    partitioner=[XnnpackPartitioner()],
).to_executorch()
```

To save the PTD file:
```python
executorch_program.write_tensor_data_to_file(output_directory)
```
The weights will be saved as `model.ptd`, with the file name taken from `gen_tag_fn` in the transform pass.
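
Putting the pieces together, the following is a minimal sketch of the full save step. The `./out` output directory is an arbitrary choice for illustration, and `executorch_program` is assumed to come from the lowering step above:

```python
import os

output_directory = "./out"
os.makedirs(output_directory, exist_ok=True)

# Write the program itself.
with open(os.path.join(output_directory, "model.pte"), "wb") as file:
    executorch_program.write_to_file(file)

# Write the external weights; this produces "model.ptd",
# named by gen_tag_fn in the transform pass.
executorch_program.write_tensor_data_to_file(output_directory)
```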

### Supporting Varying Input Sizes (Dynamic Shapes)

The PyTorch export process uses the example inputs provided to trace through the model and reason about the size and type of tensors at each step. Unless told otherwise, export will assume a fixed input size equal to the example inputs and will use this information to optimize the model.
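
As a sketch of how to opt out of the fixed-size assumption, the example below marks the batch dimension as dynamic with `torch.export.Dim`. It assumes `model` is an instance of the `Model` class defined above; the input name `x` and the bounds are assumptions for illustration:

```python
import torch
from torch.export import Dim, export

# Allow the batch dimension to vary between 1 and 32; all other
# dimensions stay fixed to the sizes of the example inputs.
batch = Dim("batch", min=1, max=32)
example_inputs = (torch.randn(2, 3, 32, 32),)

exported_program = export(
    model,
    example_inputs,
    dynamic_shapes={"x": {0: batch}},
)
```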
@@ -167,6 +192,8 @@ method = program.load_method("forward")
outputs = method.execute([input_tensor])
```

Pybindings currently do not support loading a program with separate data. To run a model with PTE and PTD components, please use the [Extension Module](extension-module.md). There is also an end-to-end demo in [executorch-examples](https://github.com/pytorch-labs/executorch-examples/tree/main/program-data-separation).

For more information, see [Runtime API Reference](executorch-runtime-api-reference.md).

## Advanced Topics
@@ -227,7 +254,7 @@ class EncodeWrapper(torch.nn.Module):
def __init__(self, model):
super().__init__()
self.model = model

def forward(self, *args, **kwargs):
return self.model.encode(*args, **kwargs)
