From 94e736fe007c6b398e6c473e1322f3c7e4200eba Mon Sep 17 00:00:00 2001
From: lucylq
Date: Fri, 11 Jul 2025 11:08:05 -0700
Subject: [PATCH] Documentation for the PTD file format

Differential Revision: D78019434

Pull Request resolved: https://github.com/pytorch/executorch/pull/12316
(cherry picked from commit d506312f8335855b2264bc5388b84595d0b0b225)
---
 docs/source/index.md            |   2 +
 docs/source/ptd-file-format.md  | 144 ++++++++++++++++++++++++++++++++
 extension/flat_tensor/README.md |  59 ++++++++++++-
 3 files changed, 202 insertions(+), 3 deletions(-)
 create mode 100644 docs/source/ptd-file-format.md

diff --git a/docs/source/index.md b/docs/source/index.md
index b9ce82b234c..5f114d547ac 100644
--- a/docs/source/index.md
+++ b/docs/source/index.md
@@ -75,6 +75,7 @@ ExecuTorch provides support for:
 - [Platform Abstraction Layer](runtime-platform-abstraction-layer)
 #### Portable C++ Programming
 - [PTE File Format](pte-file-format)
+- [PTD File Format](ptd-file-format)
 #### API Reference
 - [Export to Executorch API Reference](export-to-executorch-api-reference)
 - [Executorch Runtime API Reference](executorch-runtime-api-reference)
@@ -196,6 +197,7 @@ runtime-backend-delegate-implementation-and-linking
 runtime-platform-abstraction-layer
 portable-cpp-programming
 pte-file-format
+ptd-file-format
 ```
 
 ```{toctree}
diff --git a/docs/source/ptd-file-format.md b/docs/source/ptd-file-format.md
new file mode 100644
index 00000000000..6381e8a071c
--- /dev/null
+++ b/docs/source/ptd-file-format.md
@@ -0,0 +1,144 @@
+# `.ptd` file format
+
+ExecuTorch `.ptd` files are serialized as modified binary flatbuffer
+files with data segments appended. They provide a way to store named data using
+the FlatTensor format. Named data can be tensors or opaque blob data (usually for backends that do not expose data format).
+
+Code related to the PTD file format is in the `//executorch/extension/flat_tensor/` directory.
+
+```
+            ┌───────────────────────────────────┐
+            │Standard flatbuffer header         │
+            ├───────────────────────────────────┤
+            │ExecuTorch extended header         │
+            ├───────────────────────────────────┤
+            │Flatbuffer-serialized metadata     │
+            │(FlatTensor)                       │
+            │                                   │
+       ┌─   ├───────────────────────────────────┤
+       │    │Padding                            │
+       │    ├───────────────────────────────────┤
+       │    │Data segment                       │
+       │    │                                   │
+       │    │                                   │
+       │    ├───────────────────────────────────┤
+       │    │Padding                            │
+Blobs ─┤    ├───────────────────────────────────┤
+       │    │Data segment                       │
+       │    │                                   │
+       │    │                                   │
+       │    ├───────────────────────────────────┤
+       │    │Padding                            │
+       │    ├───────────────────────────────────┤
+       │    │...                                │
+       └─   └───────────────────────────────────┘
+```
+
+## Compatibility
+
+PTD files are designed for storing named data that can be loaded by ExecuTorch
+models.
+
+## Headers
+
+PTD files can be recognized by the magic string at byte offset 4, beginning with `FT`
+and followed by two ASCII decimal digits (file identifier from the FlatBuffers schema).
+
+PTD files have an extended header at byte offset 8, recognized by the magic string
+`FH01`. This header includes the size and offset information for both the
+flatbuffer-serialized metadata and the data segments that follow.
+
+Note that this header is ExecuTorch-specific, but even when present it does not
+upset most flatbuffer-parsing code (apart from the rarely-used
+`GetBufferStartFromRootPointer()`).
+
+All numbers are little-endian, regardless of the host system.
+
+Header layout:
+```
+[0..3]   uint32_t byte offset to the beginning of the flatbuffer root table.
+[4..7]   File magic bytes: "FT" followed by two ASCII decimal digits. The digits
+         correspond to the FlatBuffers file identifier.
+Extended header (always present):
+|  [8..11] Extended header magic bytes: "FH01" - FlatTensor Header version 01.
+| [12..15] uint32_t size of this extended header in bytes, including the magic
+|          header and this size field. Currently fixed at 40 bytes.
+| [16..23] uint64_t offset (from byte offset zero) to the start of the
+|          flatbuffer data.
+| [24..31] uint64_t size of the flatbuffer-encoded tensor metadata in bytes.
+| [32..39] uint64_t offset (from byte offset zero) to the start of the first
+|          data segment.
+| [40..47] uint64_t total size of all data segments in bytes.
+End of extended header.
+```
+
+Example:
+```
+       Offset to flatbuffer root (0x44)
+       |           File magic ("FT01")
+       |           |           Extended header magic ("FH01")
+       |           |           |           Extended header size (0x28)
+       vvvvvvvvvvv vvvvvvvvvvv vvvvvvvvvvv vvvvvvvvvvv
+0x0000 44 00 00 00 46 54 30 31 46 48 30 31 28 00 00 00
+0x0010 30 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00
+0x0020 30 01 00 00 00 00 00 00 20 00 00 00 00 00 00 00
+       ^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^
+       |                       |  Flatbuffer size (0x100)
+       |                       |  Segment data size (0x20)
+       Segment base offset (0x130)
+```
+Note: this example comes from inspecting the ModuleAddMulProgram.ptd file.
+```
+python -m test.models.export_program --modules "ModuleAddMul" --external-constants --outdir .
+
+xxd -l 64 ModuleAddMulProgram.ptd
+```
+
+## FlatTensor
+
+See `//executorch/extension/flat_tensor/serialize/flat_tensor.fbs` for the
+FlatTensor flatbuffer schema.
+
+The flatbuffer-encoded metadata follows the headers and contains:
+
+- **Schema version**: Version information for compatibility.
+- **Data segments**: List of segment descriptors with offset and size information.
+- **Named data**: List of named data entries, each containing:
+  - **Key**: String identifier for the data blob.
+  - **Segment index**: Reference to the data segment containing the blob.
+  - **Tensor layout**: Optional metadata including scalar type, sizes and dim order, if the data segment contains a tensor.
+
+### Tensor Layout
+
+If a data segment contains a canonical tensor, it may have associated layout information:
+- **Scalar type**: Data type (float32, int32, etc.) using ExecuTorch scalar types.
+- **Sizes**: Dimensions of the tensor.
+- **Dim order**: Memory layout order specifying how dimensions are arranged in memory.
+
+## Data segments
+
+The `FlatTensor.segments` list in the metadata contains offset and size
+information about each data segment. Offsets in this list are relative to
+the segment base offset specified in the extended header.
+
+Each segment contains:
+- **Offset**: Relative offset from the segment base offset.
+- **Size**: Size of the valid data in bytes (may be followed by padding).
+
+## Named data access
+
+Named data entries are accessed by string key through the `named_data` list. Each entry
+maps a string key to:
+1. A segment index pointing to the raw data.
+2. Optional tensor layout metadata, if the data segment contains a tensor.
+
+This design allows:
+- Multiple named data blobs to reference the same data segment.
+- Access to tensor layout data without loading the entire blob.
+
+## Usage
+
+PTD files are used to store data outside of the PTE file. Some use cases:
+- On-device training: checkpointing for model weights.
+- Deduplication: sharing model weights between multiple executable PTE files.
+- Flexible deployment: allowing asynchronous, independent updates of program and data.
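+
+The headers described above can be decoded by hand to inspect a `.ptd` file. Below is a
+minimal Python sketch for illustration only: it uses just the standard library, reads the
+offsets documented in the header layout, and assumes `ModuleAddMulProgram.ptd` was
+generated with the export command shown earlier.
+```
+import struct
+
+with open("ModuleAddMulProgram.ptd", "rb") as f:
+    header = f.read(48)
+
+# File magic at bytes [4..7], extended header magic at bytes [8..11].
+assert header[4:6] == b"FT", "not a FlatTensor (PTD) file"
+assert header[8:12] == b"FH01", "missing FlatTensor extended header"
+
+# All fields are little-endian.
+root_offset = struct.unpack_from("<I", header, 0)[0]
+ext_header_size = struct.unpack_from("<I", header, 12)[0]
+fb_offset, fb_size, seg_base, seg_size = struct.unpack_from("<4Q", header, 16)
+
+print("file identifier:     ", header[4:8].decode("ascii"))
+print("extended header size:", ext_header_size)
+print("flatbuffer root at:  ", hex(root_offset))
+print("flatbuffer data:     ", hex(fb_offset), "size", hex(fb_size))
+print("segment base offset: ", hex(seg_base), "total segment size", hex(seg_size))
+```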
diff --git a/extension/flat_tensor/README.md b/extension/flat_tensor/README.md
index 7ece0eb707a..b1d8ed8a8fc 100644
--- a/extension/flat_tensor/README.md
+++ b/extension/flat_tensor/README.md
@@ -1,6 +1,59 @@
 ## FlatTensor
 
-> [!IMPORTANT]
-> FlatTensor is still under development, and not ready to use.
+FlatTensor is a flatbuffer-based format for storing and loading data with string-based keys. The format provides efficient serialization and deserialization of data with metadata and supports C++ and Python APIs. FlatTensor files use the `.ptd` extension.
 
-FlatTensor is a flatbuffer-based format for storing and loading tensors. The format provides a way to store tensors keyed by string.
+The main use case is storing data outside of the PTE file for clean program-data separation. Stored data may be tensor data or opaque blob data (for backends that do not expose their data format).
+
+### Schema
+
+[flat_tensor.fbs](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/serialize/flat_tensor.fbs) contains the [Flatbuffers](https://google.github.io/flatbuffers/) schema used to serialize ExecuTorch data files.
+
+[flat_tensor_schema.py](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/serialize/flat_tensor_schema.py) contains the Python definition of the schema types.
+
+### C++ APIs
+
+[serialize.h](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/serialize/serialize.h) contains the APIs to serialize a PTD file.
+
+[flat_tensor_data_map.h](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/flat_tensor_data_map.h) contains the APIs to deserialize a PTD file and interact with it via the [named_data_map.h](https://github.com/pytorch/executorch/blob/main/runtime/core/named_data_map.h) interface.
+
+### Python APIs
+
+[serialize.py](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/serialize/serialize.py) contains the Python serialization and deserialization APIs.
+
+### Alignment Considerations
+
+**Segment alignment**: Data segments are aligned to this value, usually a power of 2. Specified in the [FlatTensorConfig](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/serialize/serialize.py#L96).
+
+**Tensor alignment**: Tensors are aligned to this value. Specified in the [FlatTensorConfig](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/serialize/serialize.py#L96).
+
+**Blob alignment**: Blobs (which may not be canonical tensors) are aligned to this value. Alignment is specified when blobs are added to the [_named_data_store.py](https://github.com/pytorch/executorch/blob/main/exir/_serialize/_named_data_store.py#L48) and passed to serialize.py.
+
+FlatTensor does not store alignment in the serialized file; the user must ensure that the serialized alignment and the alignment expected at runtime correspond. The final alignment may be a larger multiple of the specified alignment, as multiple `NamedData` entries can point to a single `DataSegment`. For example:
+```
+BackendA: {key = key1, data = 0x100, alignment = 4}
+BackendB: {key = key2, data = 0x100, alignment = 8}
+```
+BackendA and BackendB are serializing the same bytes, so the data is deduplicated and the final alignment is the least common multiple (LCM) of the two, in this case 8.
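+
+The deduplication rule can be made concrete with a short, illustrative Python sketch. This
+is not the serializer's actual code path; the keys, sizes, and alignments mirror the example
+above, and the point is only that the shared segment's effective alignment is the LCM of
+every alignment requested for it:
+```
+import math
+
+# Two backends register identical bytes under different keys, as in the
+# example above: BackendA asks for 4-byte alignment, BackendB for 8-byte.
+requests = [
+    ("key1", b"\x00" * 0x100, 4),  # BackendA
+    ("key2", b"\x00" * 0x100, 8),  # BackendB
+]
+
+# Deduplicate identical bytes into a single segment; the segment's alignment
+# grows to the LCM of all requested alignments.
+segment_alignment = {}
+for _key, data, alignment in requests:
+    segment_alignment[data] = math.lcm(segment_alignment.get(data, 1), alignment)
+
+for data, alignment in segment_alignment.items():
+    print(f"{len(data):#x} bytes stored once, aligned to {alignment}")  # aligned to 8
+```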
+
+### Usage
+
+**AoT**
+
+To export a model as a PTE and PTD pair, see [export_program.py](https://github.com/pytorch/executorch/blob/main/test/models/export_program.py). Use the `--external-constants` argument to move all constants to the separate PTD file.
+```
+python -m test.models.export_program --modules "ModuleAddMul" --external-constants --outdir .
+```
+
+To export a delegated model as a PTE and PTD pair, see [export_delegated_program.py](https://github.com/pytorch/executorch/blob/main/test/models/export_delegated_program.py). Use the `--external-constants` argument to move all constants to the separate PTD file. Note that ModuleLinear is used here because its linear op is consumed by the XNNPACK backend.
+```
+python -m test.models.export_delegated_program --modules ModuleLinear --backend_id XnnpackBackend --external_constants --outdir .
+```
+
+**Runtime**
+
+The `ProgramDataSeparationTest` in [method_test.cpp](https://github.com/pytorch/executorch/blob/main/runtime/executor/test/method_test.cpp) demonstrates how to consume the PTD file at runtime.
+
+For a backend example with XNNPACK, see [test_xnn_data_separation.cpp](https://github.com/pytorch/executorch/blob/main/backends/xnnpack/test/runtime/test_xnn_data_separation.cpp).
+
+### Rules to ensure forward/backward compatibility
+See [executorch/schema/README.md](https://github.com/pytorch/executorch/blob/main/schema/README.md).
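+
+As a quick pre-deployment sanity check on an exported PTE/PTD pair, the file magics can be
+verified before loading: per the PTE and PTD file-format docs, both formats carry a
+four-character magic at byte offset 4 ("ET" plus two version digits for PTE files, "FT" plus
+two version digits for PTD files). The Python sketch below is illustrative only and assumes
+the file names produced by the export_program command above:
+```
+def read_magic(path):
+    # The file magic lives at byte offset 4 in both PTE and PTD files.
+    with open(path, "rb") as f:
+        f.seek(4)
+        return f.read(4).decode("ascii")
+
+pte_magic = read_magic("ModuleAddMulProgram.pte")  # expected "ET" + two digits
+ptd_magic = read_magic("ModuleAddMulProgram.ptd")  # expected "FT" + two digits
+
+assert pte_magic.startswith("ET"), f"not a PTE file: {pte_magic}"
+assert ptd_magic.startswith("FT"), f"not a PTD file: {ptd_magic}"
+print(f"PTE format version: {pte_magic[2:]}, PTD format version: {ptd_magic[2:]}")
+```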