
Commit 3c5a195

IanWood1 authored and Elias Joseph committed
[docs] Update Optimization Options (#20216)
Added `iree-opt-level` to docs and add tip pointing to the "Optimization Options" page.

---------

Signed-off-by: Ian Wood <[email protected]>
Signed-off-by: Elias Joseph <[email protected]>
1 parent adb6921 commit 3c5a195

7 files changed: +90 -9 lines changed

docs/website/docs/guides/deployment-configurations/bare-metal.md

Lines changed: 2 additions & 0 deletions
@@ -61,6 +61,8 @@ for example command-line instructions of some common architectures.
 You can replace the MLIR file with the other MLIR model files, following the
 [instructions](./cpu.md#compile-a-program).
 
+--8<-- "docs/website/docs/guides/deployment-configurations/snippets/_iree-optimization-options.md"
+
 ### Compiling the bare-metal model for static-library support
 
 See the [static_library](https://github.com/iree-org/iree/tree/main/samples/static_library)

docs/website/docs/guides/deployment-configurations/cpu.md

Lines changed: 2 additions & 0 deletions
@@ -130,6 +130,8 @@ iree-compile \
 When not cross compiling, passing `--iree-llvmcpu-target-cpu=host` is
 usually sufficient on most devices.
 
+--8<-- "docs/website/docs/guides/deployment-configurations/snippets/_iree-optimization-options.md"
+
 #### Choosing CPU targets
 
 The `--iree-llvmcpu-target-triple` flag tells the compiler to generate code

docs/website/docs/guides/deployment-configurations/gpu-cuda.md

Lines changed: 2 additions & 0 deletions
@@ -95,6 +95,8 @@ iree-compile \
 mobilenetv2.mlir -o mobilenet_cuda.vmfb
 ```
 
+--8<-- "docs/website/docs/guides/deployment-configurations/snippets/_iree-optimization-options.md"
+
 #### Choosing CUDA targets
 
 Canonically a CUDA target (`iree-cuda-target`) matching the LLVM NVPTX

docs/website/docs/guides/deployment-configurations/gpu-rocm.md

Lines changed: 2 additions & 0 deletions
@@ -167,6 +167,8 @@ iree-compile \
 mobilenetv2.mlir -o mobilenet_rocm.vmfb
 ```
 
+--8<-- "docs/website/docs/guides/deployment-configurations/snippets/_iree-optimization-options.md"
+
 ???+ tip "Tip - HIP bitcode files"
 
 That IREE comes with bundled bitcode files, which are used for linking

docs/website/docs/guides/deployment-configurations/gpu-vulkan.md

Lines changed: 2 additions & 0 deletions
@@ -165,6 +165,8 @@ iree-compile \
 mobilenetv2.mlir -o mobilenet_vulkan.vmfb
 ```
 
+--8<-- "docs/website/docs/guides/deployment-configurations/snippets/_iree-optimization-options.md"
+
 #### Choosing Vulkan targets
 
 The `--iree-vulkan-target` specifies the GPU architecture to target. It

docs/website/docs/guides/deployment-configurations/snippets/_iree-optimization-options.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+???+ tip "Tip - Compiler Optimizations"
+
+    Use `--iree-opt-level=[O0,O1,O2,O3]` to enable additional compiler
+    optimizations. The default value of `O0` enables only minimal optimizations
+    while higher levels enable progressively more aggressive optimizations. See
+    [Optimization Options](../../reference/optimization-options.md) for more details.
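
Reviewer sketch (not part of the commit): the tip above could be exercised with a small helper that assembles an `iree-compile` command line for a chosen level. The function name `build_compile_cmd`, the file names, and the choice to pass target flags through as extra arguments are all illustrative assumptions; the helper only prints the command and does not invoke the compiler.

```bash
# Hypothetical helper: print an iree-compile command for a given opt level.
# Usage: build_compile_cmd LEVEL INPUT OUTPUT [EXTRA_FLAGS...]
# Assumes paths and flags contain no whitespace.
build_compile_cmd() {
  [ "$#" -ge 3 ] || { echo "usage: build_compile_cmd LEVEL INPUT OUTPUT [FLAGS...]" >&2; return 1; }
  level="$1"; input="$2"; output="$3"
  shift 3

  # Accept only the levels listed in the tip.
  case "$level" in
    O0|O1|O2|O3) ;;
    *) echo "unsupported optimization level: $level" >&2; return 1 ;;
  esac

  # Remaining arguments are passed through as target-specific flags.
  printf 'iree-compile %s --iree-opt-level=%s %s -o %s\n' "$*" "$level" "$input" "$output"
}

build_compile_cmd O2 mobilenetv2.mlir mobilenet_cpu.vmfb --iree-llvmcpu-target-cpu=host
```

Target selection flags differ per deployment guide, so they are left to the caller here rather than hard-coded.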

docs/website/docs/reference/optimization-options.md

Lines changed: 74 additions & 9 deletions
@@ -16,6 +16,80 @@ These flags can be passed to the:
 constructor
 * `ireeCompilerOptionsSetFlags()` compiler C API function
 
+## Optimization level
+
+As in other compilers like clang and gcc, IREE provides a high level optimization
+level flag (`iree-opt-level`) that enables different sets of underlying options.
+
+`iree-opt-level` specifies the optimization level for the entire compilation
+flow. Lower optimization levels prioritize debuggability and stability, while
+higher levels focus on maximizing performance. By default, `iree-opt-level` is
+set to `O0` (minimal or no optimizations).
+
+!!! note
+
+    Not all flags that control performance are nested under `iree-opt-level`.
+    See [High level program optimizations](#high-level-program-optimizations)
+    below for subflags not covered by optimization flags.
+
+This flag takes the following values:
+
+| Optimization Level | Pros | Cons |
+|-------------------|------|------|
+| **O0** (Default, Minimal Optimizations) | <ul style="list-style-type:none;"><li>✔️ Fastest compilation time.</li><li>✔️ Generated code is easier to debug.</li><li>✔️ Keeps assertions enabled</li></ul> | <ul style="list-style-type:none;"><li>❌ Poor runtime performance.</li><li>❌ Higher runtime memory usage.</li><li>❌ Larger code size due to lack of optimization.</li></ul> |
+| **O1** (Basic Optimizations) | <ul style="list-style-type:none;"><li>✔️ Enables optimizations, allowing for better runtime performance.</li><li>✔️ Optimizations are compatible with all backends.</li></ul> | <ul style="list-style-type:none;"><li>➖ Only applies conservative optimizations.</li><li>❌ Reduced debuggability.</li></ul> |
+| **O2** (Optimizations without full backend support) | <ul style="list-style-type:none;"><li>✔️ Even more aggressive optimizations.</li><li>✔️ Strikes a balance between optimization level and compatibility.</li></ul> | <ul style="list-style-type:none;"><li>➖ Some optimizations may not be supported by all backends.</li><li>❌ Reduced debuggability.</li></ul> |
+| **O3** (Aggressive Optimization) | <ul style="list-style-type:none;"><li>✔️ Highest runtime performance.</li><li>✔️ Enables advanced and aggressive transformations.</li><li>✔️ Exploits backend-specific optimizations for optimal efficiency.</li></ul> | <ul style="list-style-type:none;"><li>➖ Longer compile times.</li><li>❌ Some optimizations may be unstable.</li><li>❌ Reduced debuggability.</li></ul> |
+
+Although `iree-opt-level` sets the default for each subflag, they can be
+explicitly set on or off independently.
+
+For example:
+
+```bash
+# Apply the default optimizations of `O2` but don't remove assertions.
+iree-compile --iree-opt-level=O2 --iree-opt-strip-assertions=false
+
+# Minimize optimizations, but still perform aggressive fusion.
+iree-compile --iree-opt-level=O0 --iree-dispatch-creation-enable-aggressive-fusion=true
+```
+
+### Pipeline-level control
+
+In addition to `iree-opt-level`, IREE provides optimization controls at the
+pipeline level. These flags allow fine-grained tuning of specific compilation
+stages while still respecting the topmost optimization level unless explicitly
+overridden.
+
+#### Dispatch Creation (`iree-dispatch-creation-opt-level`)
+
+- `iree-dispatch-creation-enable-aggressive-fusion` (enabled at `O2`)
+
+    Enables more aggressive fusion opportunities not yet supported by all backends
+
+#### Global Optimization (`iree-global-optimization-opt-level`)
+
+- `iree-opt-strip-assertions` (enabled at `O1`)
+
+    Strips all `std.assert` ops in the input program after useful information for
+    optimization analysis has been extracted. Assertions provide useful
+    user-visible error messages but can prevent critical optimizations.
+    Assertions are not, however, a substitution for control flow and frontends
+    that want to check errors in optimized release builds should do so via
+    actual code - similar to when one would `if (foo) return false;` vs.
+    `assert(foo);` in a normal program.
+
+- `iree-opt-outer-dim-concat` (enabled at `O1`)
+
+    Transpose concat operations to occur along the outermost dimension. The
+    resulting concat will now be contiguous and the inserted transposes can
+    possibly be fused with surrounding ops.
+
+- `iree-opt-aggressively-propagate-transposes` (enabled at `O3`)
+
+    Enables more transpose propagation by allowing transposes to be propagated
+    to `linalg` named ops even when the resulting op will be a `linalg.generic`.
+
 ## High level program optimizations
 
 ### Constant evaluation (`--iree-opt-const-eval` (on))
@@ -56,12 +130,3 @@ and constant data rewritten to lower precision types.
 
 This feature is actively evolving and will be the subject of dedicated
 documentation when ready.
-
-### Strip Debug Assertions (`--iree-opt-strip-assertions` (off))
-
-Strips all `std.assert` ops in the input program after useful information for
-optimization analysis has been extracted. Assertions provide useful user-visible
-error messages but can prevent critical optimizations. Assertions are not,
-however, a substitution for control flow and frontends that want to check errors
-in optimized release builds should do so via actual code - similar to when one
-would `if (foo) return false;` vs. `assert(foo);` in a normal program.
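
Reviewer sketch (not part of the commit): the rule described in the new "Optimization level" section, where `iree-opt-level` supplies a default for each subflag and an explicit setting wins, can be modeled in a few lines of shell. This is a toy model of the documented behavior, not the compiler's implementation. The function names are hypothetical, the thresholds are copied from the "(enabled at ...)" annotations in the diff, and it assumes levels are cumulative, so a flag enabled at `O1` stays enabled at `O2` and `O3`.

```bash
# Level at which each subflag turns on by default, per the "(enabled at ...)"
# annotations in the diff. Unknown flags return non-zero.
default_level() {
  case "$1" in
    iree-opt-strip-assertions|iree-opt-outer-dim-concat) echo 1 ;;
    iree-dispatch-creation-enable-aggressive-fusion) echo 2 ;;
    iree-opt-aggressively-propagate-transposes) echo 3 ;;
    *) return 1 ;;
  esac
}

# Usage: effective_value LEVEL FLAG [EXPLICIT]
# Prints "true" or "false". An explicit value always overrides the default.
effective_value() {
  if [ -n "${3:-}" ]; then
    echo "$3"
    return 0
  fi
  threshold="$(default_level "$2")" || return 1
  # Assumption: levels are cumulative, so compare numerically.
  if [ "${1#O}" -ge "$threshold" ]; then echo true; else echo false; fi
}

effective_value O2 iree-opt-strip-assertions            # default at O2
effective_value O2 iree-opt-strip-assertions false      # explicit override
effective_value O0 iree-dispatch-creation-enable-aggressive-fusion true
```

The last two calls mirror the two `iree-compile` examples in the diff: keeping assertions at `O2`, and enabling aggressive fusion at `O0`.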
