From 94e736fe007c6b398e6c473e1322f3c7e4200eba Mon Sep 17 00:00:00 2001
From: lucylq
Date: Fri, 11 Jul 2025 11:08:05 -0700
Subject: [PATCH] Documentation for the PTD file format

Differential Revision: D78019434

Pull Request resolved: https://github.com/pytorch/executorch/pull/12316
(cherry picked from commit d506312f8335855b2264bc5388b84595d0b0b225)
---
 docs/source/index.md            |   2 +
 docs/source/ptd-file-format.md  | 144 ++++++++++++++++++++++++++++++++
 extension/flat_tensor/README.md |  59 ++++++++++++-
 3 files changed, 202 insertions(+), 3 deletions(-)
 create mode 100644 docs/source/ptd-file-format.md

diff --git a/docs/source/index.md b/docs/source/index.md
index b9ce82b234c..5f114d547ac 100644
--- a/docs/source/index.md
+++ b/docs/source/index.md
@@ -75,6 +75,7 @@ ExecuTorch provides support for:
 - [Platform Abstraction Layer](runtime-platform-abstraction-layer)
 #### Portable C++ Programming
 - [PTE File Format](pte-file-format)
+- [PTD File Format](ptd-file-format)
 #### API Reference
 - [Export to Executorch API Reference](export-to-executorch-api-reference)
 - [Executorch Runtime API Reference](executorch-runtime-api-reference)
@@ -196,6 +197,7 @@ runtime-backend-delegate-implementation-and-linking
 runtime-platform-abstraction-layer
 portable-cpp-programming
 pte-file-format
+ptd-file-format
 ```
 
 ```{toctree}
diff --git a/docs/source/ptd-file-format.md b/docs/source/ptd-file-format.md
new file mode 100644
index 00000000000..6381e8a071c
--- /dev/null
+++ b/docs/source/ptd-file-format.md
@@ -0,0 +1,144 @@
+# `.ptd` file format
+
+ExecuTorch `.ptd` files are serialized as modified binary flatbuffer
+files with data segments appended. They provide a way to store named data using
+the FlatTensor format. Named data can be tensors or opaque blob data (usually for backends that do not expose data format).
+
+Code related to the PTD file format is in the `//executorch/extension/flat_tensor/` directory.
+
+```
+            ┌───────────────────────────────────┐
+            │Standard flatbuffer header         │
+            ├───────────────────────────────────┤
+            │ExecuTorch extended header         │
+            ├───────────────────────────────────┤
+            │Flatbuffer-serialized metadata     │
+            │(FlatTensor)                       │
+            │                                   │
+       ┌─   ├───────────────────────────────────┤
+       │    │Padding                            │
+       │    ├───────────────────────────────────┤
+       │    │Data segment                       │
+       │    │                                   │
+       │    │                                   │
+       │    ├───────────────────────────────────┤
+       │    │Padding                            │
+Blobs ─┤    ├───────────────────────────────────┤
+       │    │Data segment                       │
+       │    │                                   │
+       │    │                                   │
+       │    ├───────────────────────────────────┤
+       │    │Padding                            │
+       │    ├───────────────────────────────────┤
+       │    │...                                │
+       └─   └───────────────────────────────────┘
+```
+
+## Compatibility
+
+PTD files are designed for storing named data that can be loaded by ExecuTorch
+models.
+
+## Headers
+
+PTD files can be recognized by the magic string at byte offset 4, beginning with `FT`
+and followed by two ASCII decimal digits (file identifier from the FlatBuffers schema).
+
+PTD files have an extended header at byte offset 8, recognized by the magic string
+`FH01`. This header includes the size and offset information for both the
+flatbuffer-serialized metadata and the data segments that follow.
+
+Note that this header is ExecuTorch-specific, but even when present it does not
+upset most flatbuffer-parsing code (apart from the rarely-used
+`GetBufferStartFromRootPointer()`).
+
+All numbers are little-endian, regardless of the host system.
+
+Header layout:
+```
+[0..3]   uint32_t byte offset to the beginning of the flatbuffer root table.
+[4..7]   File magic bytes: "FT" followed by two ASCII decimal digits. The digits
+         correspond to the FlatBuffers file identifier.
+Extended header (always present):
+|  [8..11] Extended header magic bytes: "FH01" - FlatTensor Header version 01.
+| [12..15] uint32_t size of this extended header in bytes, including the magic
+|          header and this size field. Currently fixed at 40 bytes.
+| [16..23] uint64_t offset (from byte offset zero) to the start of the
+|          flatbuffer data.
+| [24..31] uint64_t size of the flatbuffer-encoded tensor metadata in bytes.
+| [32..39] uint64_t offset (from byte offset zero) to the start of the first
+|          data segment.
+| [40..47] uint64_t total size of all data segments in bytes.
+End of extended header.
+```
+
+Example:
+```
+       Offset to flatbuffer root (0x44)
+       |           File magic ("FT01")
+       |           |           Extended header magic ("FH01")
+       |           |           |           Extended header size (0x28)
+       vvvvvvvvvvv vvvvvvvvvvv vvvvvvvvvvv vvvvvvvvvvv
+0x0000 44 00 00 00 46 54 30 31 46 48 30 31 28 00 00 00
+0x0010 30 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00
+0x0020 30 01 00 00 00 00 00 00 20 00 00 00 00 00 00 00
+       ^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^
+       |                       |  Flatbuffer size (0x100)
+       |                       |  Segment data size (0x20)
+       Segment base offset (0x130)
+```
+Note: this example comes from inspecting the ModuleAddMulProgram.ptd file.
+```
+python -m test.models.export_program --modules "ModuleAddMul" --external-constants --outdir .
+
+xxd -l 64 ModuleAddMulProgram.ptd
+```
+
+## FlatTensor
+
+See `//executorch/extension/flat_tensor/serialize/flat_tensor.fbs` for the
+FlatTensor flatbuffer schema.
+
+The flatbuffer-encoded metadata follows the headers and contains:
+
+- **Schema version**: Version information for compatibility.
+- **Data segments**: List of segment descriptors with offset and size information.
+- **Named data**: List of named data entries, each containing:
+  - **Key**: String identifier for the data blob.
+  - **Segment index**: Reference to the data segment containing the blob.
+  - **Tensor layout**: Optional metadata including scalar type, sizes and dim order, if the data segment contains a tensor.
+
+### Tensor Layout
+
+If a data segment contains a canonical tensor, it may have associated layout information:
+- **Scalar type**: Data type (float32, int32, etc.) using ExecuTorch scalar types.
+- **Sizes**: Dimensions of the tensor.
+- **Dim order**: Memory layout order specifying how dimensions are arranged in memory.
+
+## Data segments
+
+The `FlatTensor.segments` list in the metadata contains offset and size
+information about each data segment. Offsets in this list are relative to
+the segment base offset specified in the extended header.
+
+Each segment contains:
+- **Offset**: Relative offset from the segment base offset.
+- **Size**: Size of the valid data in bytes (may be followed by padding).
+
+## Named data access
+
+Named data entries are accessed by string key through the `named_data` list. Each entry
+maps a string key to:
+1. A segment index pointing to the raw data.
+2. Optional tensor layout metadata, if the data segment contains a tensor.
+
+This design allows:
+- Multiple named data blobs to reference the same data segment.
+- Access to tensor layout data without loading the entire blob.
+
+## Usage
+
+PTD files are used to store data outside of the PTE file. Some use cases:
+- On-device training: checkpointing for model weights.
+- Deduplication: sharing model weights between multiple executable PTE files.
+- Flexible deployment: allowing asynchronous, independent updates of program and data.
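+
+The headers described above can be decoded by hand to inspect a `.ptd` file. Below is a
+minimal Python sketch for illustration only: it uses just the standard library, reads the
+offsets documented in the header layout, and assumes `ModuleAddMulProgram.ptd` was
+generated with the export command shown earlier.
+```
+import struct
+
+with open("ModuleAddMulProgram.ptd", "rb") as f:
+    header = f.read(48)
+
+# File magic at bytes [4..7], extended header magic at bytes [8..11].
+assert header[4:6] == b"FT", "not a FlatTensor (PTD) file"
+assert header[8:12] == b"FH01", "missing FlatTensor extended header"
+
+# All fields are little-endian.
+root_offset = struct.unpack_from("<I", header, 0)[0]
+ext_header_size = struct.unpack_from("<I", header, 12)[0]
+fb_offset, fb_size, seg_base, seg_size = struct.unpack_from("<4Q", header, 16)
+
+print("file identifier:     ", header[4:8].decode("ascii"))
+print("extended header size:", ext_header_size)
+print("flatbuffer root at:  ", hex(root_offset))
+print("flatbuffer data:     ", hex(fb_offset), "size", hex(fb_size))
+print("segment base offset: ", hex(seg_base), "total segment size", hex(seg_size))
+```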
diff --git a/extension/flat_tensor/README.md b/extension/flat_tensor/README.md
index 7ece0eb707a..b1d8ed8a8fc 100644
--- a/extension/flat_tensor/README.md
+++ b/extension/flat_tensor/README.md
@@ -1,6 +1,59 @@
 ## FlatTensor
 
-> [!IMPORTANT]
-> FlatTensor is still under development, and not ready to use.
+FlatTensor is a flatbuffer-based format for storing and loading data with string-based keys. The format provides efficient serialization and deserialization of data with metadata and supports C++ and Python APIs. FlatTensor files use the `.ptd` extension.
 
-FlatTensor is a flatbuffer-based format for storing and loading tensors. The format provides a way to store tensors keyed by string.
+The main use case is storing data outside of the PTE file for clean program-data separation. Stored data may be tensor data or opaque blob data (for backends that do not expose their data format).
+
+### Schema
+
+[flat_tensor.fbs](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/serialize/flat_tensor.fbs) contains the [Flatbuffers](https://google.github.io/flatbuffers/) schema used to serialize ExecuTorch data files.
+
+[flat_tensor_schema.py](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/serialize/flat_tensor_schema.py) contains the Python definition of the schema types.
+
+### C++ APIs
+
+[serialize.h](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/serialize/serialize.h) contains the APIs to serialize a PTD file.
+
+[flat_tensor_data_map.h](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/flat_tensor_data_map.h) contains the APIs to deserialize a PTD file and interact with it via the [named_data_map.h](https://github.com/pytorch/executorch/blob/main/runtime/core/named_data_map.h) interface.
+
+### Python APIs
+
+[serialize.py](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/serialize/serialize.py) contains the Python serialization and deserialization APIs.
+
+### Alignment Considerations
+
+**Segment alignment**: Data segments are aligned to this value, usually a power of 2. Specified in the [FlatTensorConfig](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/serialize/serialize.py#L96).
+
+**Tensor alignment**: Tensors are aligned to this value. Specified in the [FlatTensorConfig](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/serialize/serialize.py#L96).
+
+**Blob alignment**: Blobs (which may not be canonical tensors) are aligned to this value. Alignment is specified when blobs are added to the [_named_data_store.py](https://github.com/pytorch/executorch/blob/main/exir/_serialize/_named_data_store.py#L48) and passed to serialize.py.
+
+FlatTensor does not store alignment in the serialized file; the user must ensure that the serialized alignment and the alignment expected at runtime correspond. The final alignment may be a larger multiple of the specified alignment, as multiple `NamedData` entries can point to a single `DataSegment`. For example:
+```
+BackendA: {key = key1, data = 0x100, alignment = 4}
+BackendB: {key = key2, data = 0x100, alignment = 8}
+```
+BackendA and BackendB are serializing the same bytes, so the data is deduplicated and the final alignment is the least common multiple (LCM) of the two, in this case 8.
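+
+The deduplication rule can be made concrete with a short, illustrative Python sketch. This
+is not the serializer's actual code path; the keys, sizes, and alignments mirror the example
+above, and the point is only that the shared segment's effective alignment is the LCM of
+every alignment requested for it:
+```
+import math
+
+# Two backends register identical bytes under different keys, as in the
+# example above: BackendA asks for 4-byte alignment, BackendB for 8-byte.
+requests = [
+    ("key1", b"\x00" * 0x100, 4),  # BackendA
+    ("key2", b"\x00" * 0x100, 8),  # BackendB
+]
+
+# Deduplicate identical bytes into a single segment; the segment's alignment
+# grows to the LCM of all requested alignments.
+segment_alignment = {}
+for _key, data, alignment in requests:
+    segment_alignment[data] = math.lcm(segment_alignment.get(data, 1), alignment)
+
+for data, alignment in segment_alignment.items():
+    print(f"{len(data):#x} bytes stored once, aligned to {alignment}")  # aligned to 8
+```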
+
+### Usage
+
+**AoT**
+
+To export a model as a PTE and PTD pair, see [export_program.py](https://github.com/pytorch/executorch/blob/main/test/models/export_program.py). Use the `--external-constants` argument to move all constants to the separate PTD file.
+```
+python -m test.models.export_program --modules "ModuleAddMul" --external-constants --outdir .
+```
+
+To export a delegated model as a PTE and PTD pair, see [export_delegated_program.py](https://github.com/pytorch/executorch/blob/main/test/models/export_delegated_program.py). Use the `--external-constants` argument to move all constants to the separate PTD file. Note that ModuleLinear is used here because its linear op is consumed by the XNNPACK backend.
+```
+python -m test.models.export_delegated_program --modules ModuleLinear --backend_id XnnpackBackend --external_constants --outdir .
+```
+
+**Runtime**
+
+The `ProgramDataSeparationTest` in [method_test.cpp](https://github.com/pytorch/executorch/blob/main/runtime/executor/test/method_test.cpp) demonstrates how to consume the PTD file at runtime.
+
+For a backend example with XNNPACK, see [test_xnn_data_separation.cpp](https://github.com/pytorch/executorch/blob/main/backends/xnnpack/test/runtime/test_xnn_data_separation.cpp).
+
+### Rules to ensure forward/backward compatibility
+See [executorch/schema/README.md](https://github.com/pytorch/executorch/blob/main/schema/README.md).
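+
+As a quick pre-deployment sanity check on an exported PTE/PTD pair, the file magics can be
+verified before loading: per the PTE and PTD file-format docs, both formats carry a
+four-character magic at byte offset 4 ("ET" plus two version digits for PTE files, "FT" plus
+two version digits for PTD files). The Python sketch below is illustrative only and assumes
+the file names produced by the export_program command above:
+```
+def read_magic(path):
+    # The file magic lives at byte offset 4 in both PTE and PTD files.
+    with open(path, "rb") as f:
+        f.seek(4)
+        return f.read(4).decode("ascii")
+
+pte_magic = read_magic("ModuleAddMulProgram.pte")  # expected "ET" + two digits
+ptd_magic = read_magic("ModuleAddMulProgram.ptd")  # expected "FT" + two digits
+
+assert pte_magic.startswith("ET"), f"not a PTE file: {pte_magic}"
+assert ptd_magic.startswith("FT"), f"not a PTD file: {ptd_magic}"
+print(f"PTE format version: {pte_magic[2:]}, PTD format version: {ptd_magic[2:]}")
+```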