Commit b1d0900

lucylq authored and dbort committed
executorch->ExecuTorch, other nits (#953)
Summary:
Pull Request resolved: #953

Reviewed By: larryliu0820

Differential Revision: D50329761

fbshipit-source-id: c3c2718eba5af987b17935a2a515fb7e05db6229
1 parent 082e8e7 commit b1d0900

6 files changed (+18, -18 lines)


docs/source/concepts.md

Lines changed: 1 addition & 1 deletion
@@ -259,7 +259,7 @@ A quantization technique where the model is quantized after it has been trained
 
 Models may lose accuracy after quantization. QAT enables higher accuracy compared to eg. PTQ, by modeling the effects of quantization while training. During training, all weights and activations are ‘fake quantized’; float values are rounded to mimic int8 values, but all computations are still done with floating point numbers. Thus, all weight adjustments during training are made ‘aware’ that the model will ultimately be quantized. QAT applies the quantization flow during training, in contrast to PTQ which applies it afterwards.
 
-## Quantization
+## [Quantization](./quantization-overview.md)
 
 Techniques for performing computations and memory accesses on tensors with lower precision data, usually `int8`. Quantization improves model performance by lowering the memory usage and (usually) decreasing computational latency; depending on the hardware, computation done in lower precision will typically be faster, e.g. `int8` matmul vs `fp32` matmul. Often, quantization comes at the cost of model accuracy.
 
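The QAT entry in this hunk hinges on ‘fake quantization’: values are rounded to the int8 grid but kept in floating point. A minimal sketch of that arithmetic in plain C++ (the `fake_quantize` name, scale, and zero point are illustrative, not ExecuTorch API):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <cstdio>

// Fake-quantize a float to the int8 grid: round to the nearest
// representable int8 value, clamp, then map back to float. The value
// stays a float, so training math keeps flowing through it.
float fake_quantize(float x, float scale, int32_t zero_point) {
  int32_t q = static_cast<int32_t>(std::nearbyint(x / scale)) + zero_point;
  q = std::min(127, std::max(-128, q));  // clamp to the int8 range
  return (q - zero_point) * scale;       // dequantize back to float
}

int main() {
  // With scale 0.1, 0.1234f snaps to the nearest grid point, 0.1f.
  std::printf("%f\n", fake_quantize(0.1234f, /*scale=*/0.1f, /*zero_point=*/0));
}
```

Because the result is still a float, gradients flow through training while every weight already ‘feels’ the int8 rounding error.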

docs/source/quantization-overview.md

Lines changed: 5 additions & 5 deletions
@@ -1,16 +1,16 @@
 # Quantization Overview
-Quantization is a process that reduces the precision of computations and lowers memory footprint in the model. To learn more, please visit the [ExecuTorch concepts page](./concepts.md#quantization). This is particularly useful for edge devices, which typically have limited resources such as processing power, memory, and battery life. By using quantization, we can make our models more efficient and enable them to run effectively on these devices.
+Quantization is a process that reduces the precision of computations and lowers memory footprint in the model. To learn more, please visit the [ExecuTorch concepts page](./concepts.md#quantization). This is particularly useful for edge devices including wearables, embedded devices and microcontrollers, which typically have limited resources such as processing power, memory, and battery life. By using quantization, we can make our models more efficient and enable them to run effectively on these devices.
 
 In terms of flow, quantization happens early in the ExecuTorch stack:
 
-![ExecuTorch Entry Points](/_static/img/executorch-entry-points.png).
+![ExecuTorch Entry Points](/_static/img/executorch-entry-points.png)
 
 A more detailed workflow can be found in the [ExecuTorch tutorial](./tutorials/export-to-executorch-tutorial).
 
 Quantization is usually tied to execution backends that have quantized operators implemented. Thus each backend is opinionated about how the model should be quantized, expressed in a backend specific ``Quantizer`` class. ``Quantizer`` provides API for modeling users in terms of how they want their model to be quantized and also passes on the user intention to quantization workflow.
 
-Backend developers will need to implement their own ``Quantizer`` to express how different operators or operator patterns are quantized in their backend. This is accomplished via [Annotation API](https://pytorch.org/tutorials/prototype/pt2e_quantizer.html) provided by quantization workflow. Since Quantizer is also user facing, it will expose specific APIs for modeling users to configure how they want the model to be quantized. Each backend should provide their own API documentation for their ``Quantizer``.
+Backend developers will need to implement their own ``Quantizer`` to express how different operators or operator patterns are quantized in their backend. This is accomplished via [Annotation API](https://pytorch.org/tutorials/prototype/pt2e_quantizer.html) provided by quantization workflow. Since ``Quantizer`` is also user facing, it will expose specific APIs for modeling users to configure how they want the model to be quantized. Each backend should provide their own API documentation for their ``Quantizer``.
 
-Modeling user will use the ``Quantizer`` specific to their target backend to quantize their model, e.g. ``XNNPACKQuantizer``.
+Modeling users will use the ``Quantizer`` specific to their target backend to quantize their model, e.g. ``XNNPACKQuantizer``.
 
-For an example quantization flow with ``XNPACKQuantizer``, more docuemntations and tutorials, please see ``Performing Quantization`` section in [ExecuTorch tutorial](./tutorials/export-to-executorch-tutorial).
+For an example quantization flow with ``XNPACKQuantizer``, more documentation and tutorials, please see ``Performing Quantization`` section in [ExecuTorch tutorial](./tutorials/export-to-executorch-tutorial).
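The hunks above note that each backend ships its own quantized operators. As a rough illustration of what such an operator computes, here is a toy int8 dot product with per-tensor scales (pure arithmetic, not any backend's actual kernel):

```cpp
#include <cstdint>
#include <cstdio>

// Toy quantized dot product: inputs are int8 with per-tensor scales;
// accumulation runs in int32, and only the final result is scaled back
// to float. Real backend kernels also requantize the output, but the
// accumulate-in-int32 shape is the same.
float quantized_dot(const int8_t* a, float a_scale,
                    const int8_t* b, float b_scale, int n) {
  int32_t acc = 0;
  for (int i = 0; i < n; ++i) {
    acc += static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
  }
  return acc * a_scale * b_scale;
}

int main() {
  const int8_t a[] = {10, 20, 30};
  const int8_t b[] = {5, 5, 5};
  std::printf("%f\n", quantized_dot(a, 0.1f, b, 0.2f, 3));  // prints 6.0
}
```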

docs/source/running-a-model-cpp-tutorial.md

Lines changed: 1 addition & 1 deletion
@@ -133,7 +133,7 @@ assert(execute_error == Error::Ok);
 
 ## Retrieve Outputs
 
-Once our inference completes we can retrieve our output. We know that our model only returns a single output tensor. One potential pitfall here is that the output we get back is owned by the `Method`. Users should take care to clone their output before performing any mutations on it, or if they need it to have a lifespan seperate from the `Method`.
+Once our inference completes we can retrieve our output. We know that our model only returns a single output tensor. One potential pitfall here is that the output we get back is owned by the `Method`. Users should take care to clone their output before performing any mutations on it, or if they need it to have a lifespan separate from the `Method`.
 
 ``` cpp
 EValue output = method->get_output(0);
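A minimal sketch of the ‘clone before mutating’ advice, continuing from the tutorial's `method` object and assuming the single output is a float tensor (`toTensor()` and `const_data_ptr<T>()` are assumed to be available in your runtime version; check the headers for your release):

```cpp
#include <vector>

// The EValue from get_output() points at memory owned by the Method.
// Copy the payload into a caller-owned buffer so it can outlive the
// Method and be mutated safely.
EValue output = method->get_output(0);
exec_aten::Tensor tensor = output.toTensor();
const float* data = tensor.const_data_ptr<float>();
std::vector<float> cloned(data, data + tensor.numel());
```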

runtime/core/memory_allocator.h

Lines changed: 4 additions & 4 deletions
@@ -33,7 +33,7 @@ namespace executor {
  * MemoryAllocator allocator(100, memory_pool)
  * // Pass allocator object in the Executor
  *
- * Underneath the hood, ExecuTorch will
+ * Underneath the hood, ExecuTorch will call
  * allocator.allocate() to keep iterating cur_ pointer
  */
 class MemoryAllocator {
@@ -46,8 +46,8 @@ class MemoryAllocator {
   static constexpr size_t kDefaultAlignment = alignof(void*);
 
   /**
-   * Constructs a new memory allocator of a given 'size', starting at the
-   * provided 'base_address'.
+   * Constructs a new memory allocator of a given `size`, starting at the
+   * provided `base_address`.
    *
    * @param[in] size The size in bytes of the buffer at `base_address`.
    * @param[in] base_address The buffer to allocate from. Does not take
@@ -121,7 +121,7 @@ class MemoryAllocator {
   }
 
   /**
-   * Allocates 'size' number of chunks of type T, where each chunk is of size
+   * Allocates `size` number of chunks of type T, where each chunk is of size
    * equal to sizeof(T) bytes.
    *
    * @param[in] size Number of memory chunks to allocate.
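A short usage sketch of the pattern these hunks document: hand the allocator a fixed pool and bump-allocate from it. The constructor call mirrors the header's own example; `allocateList` is my assumed name for the typed chunk helper the last hunk describes:

```cpp
#include <cstdint>
#include <executorch/runtime/core/memory_allocator.h>

using torch::executor::MemoryAllocator;

int main() {
  // A fixed pool on the stack; the allocator never owns or frees it.
  uint8_t memory_pool[1024];
  MemoryAllocator allocator(sizeof(memory_pool), memory_pool);

  // Raw allocation: bumps the internal cur_ pointer and returns
  // nullptr once the pool is exhausted.
  void* scratch = allocator.allocate(256);

  // Typed allocation of 16 chunks of sizeof(float) bytes each.
  float* values = allocator.allocateList<float>(16);

  (void)scratch;
  (void)values;
}
```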

runtime/executor/memory_manager.h

Lines changed: 1 addition & 1 deletion
@@ -93,7 +93,7 @@ class MemoryManager final {
   }
 
   /**
-   * Returns the allocator to use to allocate temporary data during kernel or
+   * Returns the allocator to use for allocating temporary data during kernel or
    * delegate execution.
    *
    * This allocator will be reset after every kernel or delegate call during
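A sketch of how the temp allocator slots into a MemoryManager. The constructor argument order here is an assumption for illustration; verify it against the header for your release:

```cpp
#include <cstdint>
#include <executorch/runtime/core/memory_allocator.h>
#include <executorch/runtime/executor/memory_manager.h>

using namespace torch::executor;

// One pool for Method structures, one small pool for per-call scratch.
uint8_t method_pool[4096];
uint8_t temp_pool[512];
MemoryAllocator method_allocator(sizeof(method_pool), method_pool);
MemoryAllocator temp_allocator(sizeof(temp_pool), temp_pool);

// Because the temp allocator is reset after every kernel/delegate call,
// its pool only needs to fit the largest single call's temporary data.
MemoryManager memory_manager(
    &method_allocator, /*planned_memory=*/nullptr, &temp_allocator);
```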

runtime/executor/program.h

Lines changed: 6 additions & 6 deletions
@@ -175,23 +175,23 @@ class Program final {
       const char* method_name = "forward") const;
 
   /**
-   * Describes the presence of an executorch program header.
+   * Describes the presence of an ExecuTorch program header.
    */
   enum HeaderStatus {
     /**
-     * An executorch program header is present, and its version is compatible
+     * An ExecuTorch program header is present, and its version is compatible
      * with this version of the runtime.
      */
     CompatibleVersion,
 
     /**
-     * An executorch program header is present, but its version is not
+     * An ExecuTorch program header is present, but its version is not
      * compatible with this version of the runtime.
      */
     IncompatibleVersion,
 
     /**
-     * An executorch program header is not present.
+     * An ExecuTorch program header is not present.
      */
     NotPresent,
 
@@ -207,10 +207,10 @@ class Program final {
   static constexpr size_t kMinHeadBytes = 64;
 
   /**
-   * Looks for an executorch program header in the provided data.
+   * Looks for an ExecuTorch program header in the provided data.
    *
    * @param[in] data The data from the beginning of a file that might contain
-   * an executorch program.
+   * an ExecuTorch program.
    * @param[in] size The size of `data` in bytes. Must be >= `kMinHeadBytes`.
    *
    * @returns A value describing the presence of a header in the data.
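A sketch of a caller using the header check these hunks document. `Program::check_header` is the entry point the doc comment belongs to, but treat the exact signature as an assumption and confirm it against the header for your release:

```cpp
#include <cstddef>
#include <cstdio>
#include <executorch/runtime/executor/program.h>

using torch::executor::Program;

// Inspect the first bytes of a candidate file (at least kMinHeadBytes
// of them) before spending any effort loading it.
bool is_loadable(const void* file_start, size_t size) {
  switch (Program::check_header(file_start, size)) {
    case Program::HeaderStatus::CompatibleVersion:
      return true;  // Header present and matches this runtime version.
    case Program::HeaderStatus::IncompatibleVersion:
      std::fprintf(stderr, "program built for a different runtime version\n");
      return false;
    case Program::HeaderStatus::NotPresent:
    default:
      std::fprintf(stderr, "not an ExecuTorch program\n");
      return false;
  }
}
```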
