Parts of a model may be delegated to run on an optimized backend. The partitioner splits the graph into the appropriate sub-networks and tags them for delegation.
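As a rough illustration of the partitioning idea described above, the sketch below groups consecutive supported operators into tagged sub-networks. It is a toy model only: the `partition` function, the flat list of operator names, and the `delegate_N` tag format are hypothetical and are not the ExecuTorch partitioner API, which operates on real graphs rather than lists.

```python
from itertools import groupby


def partition(ops, supported):
    """Group consecutive ops by backend support and tag the supported runs.

    `ops` is a flat list of operator names (a stand-in for a real graph);
    `supported` is the set of operators the hypothetical backend can run.
    """
    partitions = []
    tag_id = 0
    for is_supported, run in groupby(ops, key=lambda op: op in supported):
        run = list(run)
        if is_supported:
            # A run of supported ops becomes one sub-network tagged for delegation.
            partitions.append({"tag": f"delegate_{tag_id}", "ops": run})
            tag_id += 1
        else:
            # Unsupported ops stay untagged and would run on the default path.
            partitions.append({"tag": None, "ops": run})
    return partitions


ops = ["conv", "relu", "custom_op", "linear", "softmax"]
parts = partition(ops, supported={"conv", "relu", "linear"})
for part in parts:
    print(part["tag"], part["ops"])
```

Here `conv` and `relu` end up in one tagged sub-network, `linear` in another, and the unsupported operators remain untagged.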
@@ -263,7 +263,7 @@ Models may lose accuracy after quantization. QAT enables higher accuracy compare
Techniques for performing computations and memory accesses on tensors with lower precision data, usually `int8`. Quantization improves model performance by lowering the memory usage and (usually) decreasing computational latency; depending on the hardware, computation done in lower precision will typically be faster, e.g. `int8` matmul vs `fp32` matmul. Often, quantization comes at the cost of model accuracy.
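The accuracy trade-off mentioned above can be seen in a minimal sketch of affine `int8` quantization, written in plain Python so that it runs without dependencies. The helper names `quantize` and `dequantize` and the sample weights are assumptions made for illustration; this is not how ExecuTorch or PyTorch implement quantization.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers using one scale and zero point."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # Include 0.0 in the range so that zero is exactly representable.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(qmin - lo / scale)
    # Round to the nearest integer step and clamp to the representable range.
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [(x - zero_point) * scale for x in q]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, zero_point, max_error)
```

Each value is stored in one byte instead of four, and the round trip introduces an error bounded by roughly one quantization step (`scale`), which is the source of the accuracy loss.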

-## Runtime
+## [Runtime](./runtime-overview.md)

The ExecuTorch runtime executes models on edge devices. It is responsible for program initialization, program execution and, optionally, destruction (releasing backend owned resources).
docs/source/examples-end-to-end-to-lower-model-to-delegate.md
21 additions & 17 deletions
@@ -1,12 +1,12 @@
# Lowering a Model as a Delegate

-Audience: ML Engineers, who are interested in applying delegates to accelerate their program in runtime
+Audience: ML Engineers, who are interested in applying delegates to accelerate their program in runtime.

Backend delegation is an entry point for backends to process and execute PyTorch
-programs to leverage performance and efficiency benefits of specialized
+programs to leverage the performance and efficiency benefits of specialized
backends and hardware, while still providing PyTorch users with an experience
close to that of the PyTorch runtime. The backend delegate is usually either provided by
-ExecuTorch or vendors. The way to leverage delegate in your program is via a standard entry point `to_backend`.
+ExecuTorch or vendors. The way to leverage delegation in your program is via a standard entry point `to_backend`.


## Frontend Interfaces
@@ -24,8 +24,7 @@ There are three flows for delegating a program to a backend:
### Flow 1: Lowering the whole module

This flow starts from a traced graph module with Edge Dialect representation. To
-lower it, we call the following function which returns a `LoweredBackendModule`
-(more documentation on this function can be found in the Python API reference):
+lower it, we call the following function which returns a `LoweredBackendModule` (more documentation on this function can be found in the [Export API reference](export-to-executorch-api-reference.rst))

```python
# defined in backend_api.py
@@ -45,7 +44,7 @@ that can be loaded by the runtime.
The following is an example of this flow:

```python
-from executorch.exir.backend.backend_api import to_backend, MethodCompileSpec
+from executorch.exir.backend.backend_api import to_backend
@@ -177,8 +181,8 @@ with open(save_path, "wb") as f:

After having the program with delegates, to run the model with the backend, we'd need to register the backend.
Depending on the delegate implementation, the backend can be registered either as part of global variables or
-explicitly registered inside main function.
+explicitly registered inside the main function.

-- If it's registered during global variables initialization, the backend will be registered as long as it's static linked. Users only need to include the library as part of the dependency.
+- If it's registered during global variables initialization, the backend will be registered as long as it's statically linked. Users only need to include the library as part of the dependency.

-- If the vendor provides an API to register the backend, users need to include the library as part of the dependency, and call the API provided by vendors to explicitly register the backend as part of the main function
+- If the vendor provides an API to register the backend, users need to include the library as part of the dependency, and call the API provided by vendors to explicitly register the backend as part of the main function.