Commit ad9a299

guangy10 authored and mergennachin committed
getting-started-architecture.md (#900)
Summary:
Pull Request resolved: #900

- Fix links. Links are pretty much all wrong in this tutorial and some are failed links
- rephrase

Reviewed By: JacobSzwejbka

Differential Revision: D50272977

fbshipit-source-id: 5f4d68dca0a6fa6022cc3e01a6ab3c2b809841b1
1 parent 171ebce commit ad9a299

1 file changed: +10 -10 lines changed

docs/source/getting-started-architecture.md

Lines changed: 10 additions & 10 deletions
@@ -9,7 +9,7 @@ In order to target on-device AI with diverse hardware, critical power requiremen

 ## Overview

-There are three steps to deploy a PyTorch model to on-device: program preparation, runtime preparation, and program execution, as shown in the diagram below, with a number of user entry points. We’ll discuss each step separately in this documentation.
+There are three phases to deploy a PyTorch model to on-device: program preparation, runtime preparation, and program execution, as shown in the diagram below, with a number of user entry points. We’ll discuss each step separately in this documentation.


 ![](./executorch_stack.png)
@@ -20,7 +20,7 @@ There are three steps to deploy a PyTorch model to on-device: program preparatio

 ## Program Preparation

-ExecuTorch extends the flexibility and usability of PyTorch to edge devices. It leverages PyTorch 2.0 compiler and export functionality ([TorchDynamo](https://pytorch.org/docs/stable/dynamo/index.html), [AOTAutograd](https://pytorch.org/functorch/stable/notebooks/aot_autograd_optimizations.html), [Quantization](https://pytorch.org/docs/main/quantization.html), dynamic shapes, control flow, etc.) to prepare a PyTorch program for execution on devices.
+ExecuTorch extends the flexibility and usability of PyTorch to edge devices. It leverages PyTorch 2 compiler and export functionality ([TorchDynamo](https://pytorch.org/docs/stable/dynamo/index.html), [AOTAutograd](https://pytorch.org/functorch/stable/notebooks/aot_autograd_optimizations.html), [Quantization](https://pytorch.org/docs/main/quantization.html), dynamic shapes, control flow, etc.) to prepare a PyTorch program for execution on devices.

 Program preparation is often simply called AOT (ahead-of-time) because export, transformations and compilations to the program are performed before it is eventually run with the ExecuTorch runtime, written in C++. To have a lightweight runtime and small overhead in execution, we push work as much as possible to AOT.

@@ -37,13 +37,13 @@ Starting from the program source code, below are the steps you would go through

 ### Export

-To deploy the program to the device, engineers need to have a graph representation for compiling a model to run on various backends. With torch.export, an [EXIR](https://github.com/pytorch/executorch/blob/main/docs/website/docs/ir_spec/00_exir.md) (export intermediate representation) is generated with ATen dialect. All AOT compilations are based on this EXIR, but can have multiple dialects along the lowering path as detailed below.
+To deploy the program to the device, engineers need to have a graph representation for compiling a model to run on various backends. With [`torch.export()`](https://pytorch.org/docs/main/export.html), an [EXIR](./ir-exir.md) (export intermediate representation) is generated with ATen dialect. All AOT compilations are based on this EXIR, but can have multiple dialects along the lowering path as detailed below.



-* _[ATen Dialect](https://github.com/pytorch/executorch/blob/main/docs/website/docs/ir_spec/01_aten_dialect.md)_. PyTorch Edge is based on PyTorch’s Tensor library ATen, which has clear contracts for efficient execution. ATen Dialect is a graph represented by ATen nodes which are fully ATen compliant. Custom operators are allowed, but must be registered with the dispatcher. It’s flatten with no module hierarchy (submodules in a bigger module), but the source code and module hierarchy are preserved in the metadata. This representation is also autograd safe.
+* _[ATen Dialect](./ir-exir.md#aten-dialect)_. PyTorch Edge is based on PyTorch’s Tensor library ATen, which has clear contracts for efficient execution. ATen Dialect is a graph represented by ATen nodes which are fully ATen compliant. Custom operators are allowed, but must be registered with the dispatcher. It’s flatten with no module hierarchy (submodules in a bigger module), but the source code and module hierarchy are preserved in the metadata. This representation is also autograd safe.
 * Optionally, _quantization_, either QAT (quantization-aware training) or PTQ (post training quantization) can be applied to the whole ATen graph before converting to Core ATen. Quantization helps with reducing the model size, which is important for edge devices.
-* _[Core ATen Dialect](https://github.com/pytorch/executorch/blob/main/docs/website/docs/ir_spec/01_aten_dialect.md)_. ATen has thousands of operators. It’s not ideal for some fundamental transforms and kernel library implementation. The operators from the ATen Dialect graph are decomposed into fundamental operators so that the operator set (op set) is smaller and more fundamental transforms can be applied. The Core ATen dialect is also serializable and convertible to Edge Dialect as detailed below.
+* _[Core ATen Dialect](./ir-ops-set-definition.md)_. ATen has thousands of operators. It’s not ideal for some fundamental transforms and kernel library implementation. The operators from the ATen Dialect graph are decomposed into fundamental operators so that the operator set (op set) is smaller and more fundamental transforms can be applied. The Core ATen dialect is also serializable and convertible to Edge Dialect as detailed below.
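
Reviewer note: the decomposition idea in the Core ATen bullet can be pictured with a toy sketch. This is not the ExecuTorch or PyTorch API. The op names, the `CORE_OPS` set, and the `DECOMPOSITIONS` table are hypothetical and do not reflect the real Core ATen operator set. It only shows how rewriting ops through a table shrinks a program to a smaller op set.

```python
# Toy illustration only: op names and the decomposition table are hypothetical,
# not the real ATen or Core ATen operator sets.
CORE_OPS = {"add", "mul", "mm", "exp", "sum", "div"}

# Each non-core op maps to a sequence of simpler ops.
DECOMPOSITIONS = {
    "addmm": ["mm", "add"],
    "softmax": ["exp", "sum", "div"],
    "linear": ["addmm"],
}


def decompose(ops):
    """Rewrite a flat op sequence until only core ops remain."""
    result = []
    for op in ops:
        if op in CORE_OPS:
            result.append(op)
        elif op in DECOMPOSITIONS:
            # Decompose recursively, since a replacement may itself be non-core.
            result.extend(decompose(DECOMPOSITIONS[op]))
        else:
            raise ValueError(f"no decomposition registered for {op!r}")
    return result


program = ["linear", "softmax", "mul"]
lowered = decompose(program)
print(lowered)
```

After the rewrite, any later transform or kernel library only needs to handle the ops in `CORE_OPS`, which is the motivation the bullet gives for the smaller op set.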


 ### Edge Compilation
@@ -52,13 +52,13 @@ The Export process discussed above operates on a graph that is agnostic to the e



-* _[Edge Dialect](https://github.com/pytorch/executorch/blob/main/docs/website/docs/ir_spec/02_edge_dialect.md)_. All operators are either compliant with ATen operators with dtype plus memory layout information (represented as `dim_order`) or registered custom operators. Scalars are converted to Tensors. Those specifications allow following steps focusing on a smaller Edge domain. In addition, it enables the selective build which is based on specific dtypes and memory layouts.
+* _[Edge Dialect](./ir-exir.md#edge-dialect)_. All operators are either compliant with ATen operators with dtype plus memory layout information (represented as `dim_order`) or registered custom operators. Scalars are converted to Tensors. Those specifications allow following steps focusing on a smaller Edge domain. In addition, it enables the selective build which is based on specific dtypes and memory layouts.

-With the Edge dialect, there are two target-aware ways to further lower the graph to the _[Backend Dialect](https://github.com/pytorch/executorch/blob/main/docs/website/docs/ir_spec/03_backend_dialect.md)_. At this point, delegates for specific hardware can perform many operations. For example, CoreML on iOS, QNN on Qualcomm, or TOSA on Arm can rewrite the graph. The options at this level are:
+With the Edge dialect, there are two target-aware ways to further lower the graph to the _[Backend Dialect](./compiler-backend-dialect.md)_. At this point, delegates for specific hardware can perform many operations. For example, CoreML on iOS, QNN on Qualcomm, or TOSA on Arm can rewrite the graph. The options at this level are:



-* _[Backend Delegate](https://github.com/pytorch/executorch/blob/main/docs/website/docs/tutorials/backend_delegate.md)_. The entry point to compile the graph (either full or partial) to a specific backend. The compiled graph is swapped with the semantically equivalent graph during this transformation. The compiled graph will be offloaded to the backend (aka `delegated`) later during the runtime for improved performance.
+* _[Backend Delegate](./compiler-delegate-and-partitioner.md)_. The entry point to compile the graph (either full or partial) to a specific backend. The compiled graph is swapped with the semantically equivalent graph during this transformation. The compiled graph will be offloaded to the backend (aka `delegated`) later during the runtime for improved performance.
 * _User-defined passes_. Target-specific transforms can also be performed by the user. Good examples of this are kernel fusion, async behavior, memory layout conversion, and others.
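
Reviewer note: the "either full or partial" delegation described in the Backend Delegate bullet can be pictured with a toy sketch. It is not the ExecuTorch partitioner API. The op names and the `SUPPORTED_BY_BACKEND` set are hypothetical, and a real graph is not a flat list. The sketch only groups consecutive backend-supported ops into segments that would be handed to a delegate, leaving the rest for the regular kernels.

```python
from itertools import groupby

# Hypothetical capability list for an imaginary backend.
SUPPORTED_BY_BACKEND = {"conv", "relu", "add"}


def partition(ops):
    """Split a linear op sequence into ("delegate" | "portable", [ops]) segments."""
    segments = []
    # groupby yields maximal runs of consecutive ops with the same support status.
    for delegated, run in groupby(ops, key=lambda op: op in SUPPORTED_BY_BACKEND):
        segments.append(("delegate" if delegated else "portable", list(run)))
    return segments


graph = ["conv", "relu", "topk", "add", "add"]
segments = partition(graph)
for target, ops in segments:
    print(target, ops)
```

If every op were supported, the result would be a single `"delegate"` segment, which corresponds to the full-graph case.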
@@ -81,7 +81,7 @@ Finally, the emitted program can be serialized to [flatbuffer](https://github.co

 With the serialized program, and provided kernel libraries (for operator calls) or backend libraries (for delegate calls), model deployment engineers can now prepare the program for the runtime.

-ExecuTorch has the _[selective build](https://github.com/pytorch/executorch/blob/main/docs/website/docs/tutorials/selective_build.md)_ APIs, to build the runtime that links to only kernels used by the program, which can provide significant binary size savings in the resulting application.
+ExecuTorch has the _[selective build](./kernel-library-selective_build.md)_ APIs, to build the runtime that links to only kernels used by the program, which can provide significant binary size savings in the resulting application.
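
Reviewer note: the size argument behind selective build can be pictured with a toy sketch. It does not use the actual selective build APIs. The kernel names, the byte sizes in `KERNEL_SIZES`, and the `select_kernels` helper are all hypothetical. The sketch keeps only the kernels a program references and compares the total size against linking everything.

```python
# Hypothetical kernel registry: kernel name -> made-up compiled size in bytes.
KERNEL_SIZES = {
    "aten::add.out": 1200,
    "aten::mul.out": 1100,
    "aten::mm.out": 5400,
    "aten::softmax.out": 3100,
}


def select_kernels(program_ops, registry):
    """Return only the registry entries that the program actually calls."""
    used = set(program_ops)
    missing = used - set(registry)
    if missing:
        raise KeyError(f"no kernel available for: {sorted(missing)}")
    return {name: registry[name] for name in sorted(used)}


# Ops referenced by an imaginary serialized program (duplicates are expected).
program_ops = ["aten::mm.out", "aten::add.out", "aten::add.out"]

selected = select_kernels(program_ops, KERNEL_SIZES)
full_size = sum(KERNEL_SIZES.values())
selected_size = sum(selected.values())
print(f"full build: {full_size} bytes, selective build: {selected_size} bytes")
```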


 ## Program Execution
@@ -101,6 +101,6 @@ _Executor_ is the entry point to load the program and execute it. The execution

 ## SDK

-It should be efficient for users to go from research to production using the flow above. Productivity is essentially important, for users to author, optimize and deploy their models. We provide [ExecuTorch SDK](https://github.com/pytorch/executorch/tree/main/docs/website/docs/sdk) to improve productivity. The SDK is not in the diagram. Instead it's a tool set that covers the developer workflow in all three phases.
+It should be efficient for users to go from research to production using the flow above. Productivity is essentially important, for users to author, optimize and deploy their models. We provide [ExecuTorch SDK](./sdk-overview.md) to improve productivity. The SDK is not in the diagram. Instead it's a tool set that covers the developer workflow in all three phases.

 During the program preparation and execution, users can use the ExecuTorch SDK to profile, debug, or visualize the program. Since the end-to-end flow is within the PyTorch ecosystem, users can correlate and display performance data along with graph visualization as well as direct references to the program source code and model hierarchy. We consider this to be a critical component for quickly iterating and lowering PyTorch programs to edge devices and environments.
