docs/source/kernel-library-selective-build.md
Lines changed: 33 additions & 31 deletions
@@ -36,17 +36,20 @@ The basic flow looks like this:
36
36
37
37
## APIs
38
38
39
-
We expose a CMake macro `[gen_selected_ops](https://github.com/pytorch/executorch/blob/main/tools/cmake/Codegen.cmake#L12)`, to allow users specifying op info:
39
+
We expose a CMake macro, [gen_selected_ops](https://github.com/pytorch/executorch/blob/main/tools/cmake/Codegen.cmake#L12), that lets users specify operator info:
40
40
41
41
```
42
42
gen_selected_ops(
43
-
LIB_NAME # the name of the selective build operator library to be generated
44
-
OPS_SCHEMA_YAML # path to a yaml file containing operators to be selected
45
-
ROOT_OPS # comma separated operator names to be selected
46
-
INCLUDE_ALL_OPS # boolean flag to include all operators
43
+
LIB_NAME # the name of the selective build operator library to be generated
44
+
OPS_SCHEMA_YAML # path to a yaml file containing operators to be selected
45
+
ROOT_OPS # comma separated operator names to be selected
46
+
INCLUDE_ALL_OPS # boolean flag to include all operators
47
+
OPS_FROM_MODEL # path to a pte file of model to select operators from
48
+
DTYPE_SELECTIVE_BUILD # boolean flag to enable dtype selection
47
49
)
48
50
```
49
51
52
+
The macro calls `gen_oplist.py`, which requires a [distinct selection](https://github.com/BujSet/executorch/blob/main/codegen/tools/gen_oplist.py#L222-L228) of API. `OPS_SCHEMA_YAML`, `ROOT_OPS`, `INCLUDE_ALL_OPS`, and `OPS_FROM_MODEL` are mutually exclusive options and should not be used together.
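
The mutual exclusivity rule can be illustrated with a small Python sketch. This is not the code in `gen_oplist.py`; `check_selection` is a hypothetical helper, and treating "no option set" as an error is an assumption made for the example.

```python
from typing import Optional


def check_selection(
    ops_schema_yaml: Optional[str] = None,
    root_ops: Optional[str] = None,
    include_all_ops: bool = False,
    ops_from_model: Optional[str] = None,
) -> str:
    """Return the name of the single selection option in use.

    Hypothetical helper: it only mirrors the rule that the four
    selection options must not be combined.
    """
    chosen = [
        name
        for name, value in (
            ("OPS_SCHEMA_YAML", ops_schema_yaml),
            ("ROOT_OPS", root_ops),
            ("INCLUDE_ALL_OPS", include_all_ops),
            ("OPS_FROM_MODEL", ops_from_model),
        )
        if value
    ]
    # Assumption: exactly one option is required; zero is rejected too.
    if len(chosen) != 1:
        raise ValueError(
            f"expected exactly one selection option, got {chosen or 'none'}"
        )
    return chosen[0]
```

For instance, `check_selection(root_ops="aten::add.out")` returns `"ROOT_OPS"`, while also passing `include_all_ops=True` raises `ValueError`.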
50
53
51
54
### Select all ops
52
55
@@ -62,40 +65,39 @@ Context: each kernel library is designed to have a yaml file associated with it.
62
65
63
66
This API lets users pass in a list of operator names. Note that this API can be combined with the API above, in which case we create an allowlist from the union of both API inputs.
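
As a rough sketch of what "union of both API inputs" means, the Python snippet below merges operator names that were already read from a schema yaml with a comma separated list. `build_allowlist` is a hypothetical name, yaml parsing is left out, and sorting the result is only a choice made here for deterministic output.

```python
from typing import Iterable, List


def build_allowlist(yaml_ops: Iterable[str], root_ops: str) -> List[str]:
    """Union operator names from a schema yaml with a comma separated list."""
    # Split the comma separated string, dropping blanks and stray spaces.
    from_list = [name.strip() for name in root_ops.split(",") if name.strip()]
    # Set union removes names that appear in both inputs.
    return sorted(set(yaml_ops) | set(from_list))
```

An operator named in both inputs appears once in the resulting allowlist.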
64
67
68
+
### Select ops from model
69
+
70
+
This API lets users pass in a pte file of an exported model. When used, the pte file will be parsed to generate a yaml file that enumerates the operators and dtypes used in the model.
71
+
72
+
### Dtype Selective Build
73
+
74
+
Beyond pruning the binary to remove unused operators, the binary size can be further reduced by removing unused dtypes. For example, if your model only uses floats for the `add` operator, then including variants of the `add` operator for `doubles` and `ints` is unnecessary. The flag `DTYPE_SELECTIVE_BUILD` can be set to `ON` to support this additional optimization. Currently, dtype selective build is only supported with the model API described above. Once enabled, a header file that specifies only the operators and dtypes used by the model is created and linked against a rebuild of the `portable_kernels` lib. This feature is only supported for the portable kernels library; it's not supported for optimized, quantized or custom kernel libraries.
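
Conceptually, dtype selection is an intersection between what a kernel library provides and what the model uses. The Python sketch below illustrates only that idea; it does not reflect the format of the generated header, and the `AVAILABLE` table and `prune` function are hypothetical.

```python
from typing import Dict, Set

# Hypothetical kernel library: each operator lists the dtypes it was built for.
AVAILABLE: Dict[str, Set[str]] = {
    "add": {"float", "double", "int"},
    "mul": {"float", "double", "int"},
}


def prune(
    available: Dict[str, Set[str]], used: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """Keep only operators the model uses, with only the dtypes it uses."""
    return {
        op: dtypes & used[op]  # drop dtype variants the model never calls
        for op, dtypes in available.items()
        if op in used  # drop operators the model never calls
    }
```

With a model that only calls `add` on floats, `prune(AVAILABLE, {"add": {"float"}})` keeps a single `add` variant and drops `mul` entirely.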
In [CMakeLists.txt](https://github.com/BujSet/executorch/blob/main/examples/selective_build/CMakeLists.txt#L48-L72), we have the following cmake config options:
These options let a user tailor the cmake build process to use the different APIs, and result in different invocations of the `gen_selected_ops` [function](https://github.com/BujSet/executorch/blob/main/examples/selective_build/CMakeLists.txt#L110-L123). The following table describes some examples of how the invocation changes when these configs are set:
87
87
88
-
```
89
-
cmake -D… -DSELECT_OPS_YAML=ON
90
-
```
88
+
| Example cmake Call | Resultant `gen_selected_ops` Invocation |
To select from either an operator name list or a schema yaml from kernel library.
93
95
94
-
## Manual Kernel Registration with '--lib-name'
95
-
ExecuTorch now supports generating library-specific kernel registration APIs using the '--lib-name' option along with '--manual-registration' during codegen. This allows applications to avoid using static initialization or linker flags like '-force_load' when linking in kernel libraries.
96
+
## Manual Kernel Registration with `--lib-name`
97
+
ExecuTorch now supports generating library-specific kernel registration APIs using the `--lib-name` option along with `--manual-registration` during codegen. This allows applications to avoid using static initialization or linker flags like `-force_load` when linking in kernel libraries.
96
98
97
99
## Motivation
98
-
In environments like Xcode, using static libraries requires developers to manually specify '-force_load' flags to ensure kernel registration code is executed. This is inconvenient and error-prone.
100
+
In environments like Xcode, using static libraries requires developers to manually specify `-force_load` flags to ensure kernel registration code is executed. This is inconvenient and error-prone.
99
101
100
102
By passing a library name to the codegen script, developers can generate explicit registration functions and headers, which they can call directly in their application.
101
103
@@ -110,7 +112,7 @@ python -m codegen.gen \
110
112
```
111
113
This will generate:
112
114
113
-
'register_custom_kernels.cpp' defines 'register_custom_kernels()' with only the kernels selected and 'register_custom_kernels.h' declares the function for inclusion in your application
115
+
`register_custom_kernels.cpp`, which defines `register_custom_kernels()` with only the selected kernels, and `register_custom_kernels.h`, which declares the function for inclusion in your application.
114
116
115
117
Then in your application, call:
116
118
@@ -123,4 +125,4 @@ register_custom_kernels(); // Registers only the "custom" kernels
123
125
This avoids relying on static initialization and enables you to register only the kernels you want.
124
126
125
127
### Compatibility
126
-
If '--lib-name' is not passed, the default behavior remains unchanged, the codegen script will generate a general 'RegisterKernels.cpp' and 'register_all_kernels()' function.
128
+
If `--lib-name` is not passed, the default behavior remains unchanged: the codegen script generates a general `RegisterKernels.cpp` and a `register_all_kernels()` function.
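
The naming behavior described in this section can be summarized in a short Python sketch. `registration_names` is a hypothetical helper, not part of the codegen script, and the `register_<lib>_kernels` pattern is an assumption generalized from the single `custom` example above.

```python
from typing import Optional, Tuple


def registration_names(lib_name: Optional[str] = None) -> Tuple[str, str]:
    """Return (source file, registration function) for a given library name."""
    if lib_name is None:
        # Default behavior when --lib-name is not passed.
        return "RegisterKernels.cpp", "register_all_kernels"
    # Assumed pattern, inferred from the "custom" example in this document.
    return f"register_{lib_name}_kernels.cpp", f"register_{lib_name}_kernels"
```

Calling `registration_names("custom")` gives the file and function names shown earlier, and calling it with no argument gives the default pair.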