| 1 | +# Frontend Compilation Optimization |
| 2 | + |
| 3 | +Much like classical compilers, AI compilers apply compilation |
| 4 | +optimizations to improve the quality of the IRs generated during the |
| 5 | +compilation process. These optimizations reduce code size, shorten |
| 6 | +compilation and execution time, and lower the energy that processors |
| 7 | +consume during execution. |
| 8 | +Compilation optimization techniques can be divided into two categories: |
| 9 | +hardware-agnostic optimization and hardware-specific optimization. |
| 10 | +All optimization techniques applied at the frontend, however, are |
| 11 | +inherently hardware-agnostic, because the frontend has no knowledge of |
| 12 | +the backend hardware specifics. |
| 13 | + |
| 14 | +## Process of Compilation Optimization |
| 15 | + |
| 16 | +Typically, a compilation optimizer executes a sequence of optimization |
| 17 | +passes. Each pass takes an IR as input and produces a revised IR as |
| 18 | +output. A single pass may comprise several sub-passes and can be run |
| 19 | +once or multiple times. |
| 20 | + |
| 21 | +The effectiveness of compilation optimization depends heavily on the |
| 22 | +selection and ordering of optimization operations. The compiler not |
| 23 | +only executes various optimization operations as needed, but can also |
| 24 | +adjust the number of optimization passes along with the types and |
| 25 | +sequence of optimization operations. These adjustments depend on the |
| 26 | +configured compilation optimization level, as illustrated in |
| 27 | +Figure :numref:`ch06/ch06-opt-pass`. |
| 28 | + |
| 29 | + |
| 30 | +:label:`ch06/ch06-opt-pass` |
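
The pass-based process described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular AI compiler is implemented: the tuple-based IR, the two sub-passes, the `run_passes` driver, and the `PIPELINES` mapping from optimization level to pass list are all hypothetical names introduced for the example.

```python
from typing import Callable, List, Tuple

# Hypothetical toy IR: a list of (dest, op, operands) instructions.
Instr = Tuple[str, str, tuple]
IR = List[Instr]
Pass = Callable[[IR], IR]


def remove_nops(ir: IR) -> IR:
    """Sub-pass: drop instructions whose op is 'nop'."""
    return [ins for ins in ir if ins[1] != "nop"]


def simplify_add_zero(ir: IR) -> IR:
    """Sub-pass: rewrite `x + 0` into a plain copy of x."""
    out = []
    for dest, op, args in ir:
        if op == "add" and len(args) == 2 and args[1] == 0:
            out.append((dest, "copy", (args[0],)))
        else:
            out.append((dest, op, args))
    return out


def run_passes(ir: IR, passes: List[Pass], max_rounds: int = 10) -> IR:
    """Apply the passes in order, repeating until the IR stops changing."""
    for _ in range(max_rounds):
        new_ir = ir
        for p in passes:
            new_ir = p(new_ir)  # each pass maps an IR to a revised IR
        if new_ir == ir:
            break
        ir = new_ir
    return ir


# Hypothetical mapping from optimization level to the passes it enables.
PIPELINES = {0: [], 1: [remove_nops], 2: [remove_nops, simplify_add_zero]}

program = [("a", "input", ()), ("_", "nop", ()), ("b", "add", ("a", 0))]
optimized = run_passes(program, PIPELINES[2])
```

Selecting a different key of `PIPELINES` changes which passes run and in what order, which is the kind of adjustment an optimization level controls.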
| 31 | + |
| 32 | +## Prevalent Optimization Methods |
| 33 | + |
| 34 | +Today, a wide array of frontend compilation optimization methods exists. |
| 35 | +Machine learning frameworks likewise employ various optimization |
| 36 | +methods, although these differ from those found in classical |
| 37 | +compilers. This section details three frequently used and broadly |
| 38 | +applicable frontend compilation optimization methods. |
| 39 | + |
| 40 | +### Elimination of Dead Code and Unreachable Code |
| 41 | + |
| 42 | +Dead code refers to segments of code that yield outputs not utilized by |
| 43 | +any other code, while unreachable code refers to segments of code that |
| 44 | +are not included in any valid control flow path. Figure |
| 45 | +:numref:`ch06/ch06-opt-pass-useless-code0-elimination` |
| 46 | +demonstrates these two types of code. The removal of dead or unreachable |
| 47 | +code can decrease the size of IRs and expedite both the compilation and |
| 48 | +execution of a program. These types of code can result from human errors |
| 49 | +or may manifest during other compilation optimizations. |
| 50 | + |
| 51 | + |
| 52 | +:label:`ch06/ch06-opt-pass-useless-code0-elimination` |
| 53 | + |
| 53 | +As noted in Chapter |
| 54 | +[\[subsec:conversion_between_and_combination_of_dynamic_and_static_graphs\]](#subsec:conversion_between_and_combination_of_dynamic_and_static_graphs){reference-type="ref" |
| 55 | +reference="subsec:conversion_between_and_combination_of_dynamic_and_static_graphs"}, |
| 56 | +the tracing method can be employed when converting dynamic graphs to |
| 57 | +static graphs. Tracing is considered highly effective at identifying |
| 58 | +dead code and unreachable code. Consequently, the elimination of |
| 59 | +such code is often incorporated into the graph conversion |
| 60 | +procedure. |
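
Both eliminations can be sketched on a toy representation. The example below is an assumption-laden illustration rather than the tracing-based approach mentioned above: the block dictionary, the `(dest, op, operands)` instruction tuples, and the function names are hypothetical, and every operation is assumed to be free of side effects.

```python
def remove_unreachable(blocks, entry):
    """Keep only the blocks reachable from `entry` via successor edges."""
    reachable, stack = set(), [entry]
    while stack:
        name = stack.pop()
        if name in reachable:
            continue
        reachable.add(name)
        stack.extend(blocks[name]["succ"])
    return {n: b for n, b in blocks.items() if n in reachable}


def remove_dead(instrs, outputs):
    """Drop instructions whose results are never used.

    Works on straight-line code and walks backwards, tracking which
    variables are still needed (live) to compute `outputs`.
    """
    live, kept = set(outputs), []
    for dest, op, args in reversed(instrs):
        if dest in live:
            kept.append((dest, op, args))
            live.discard(dest)
            # The operands of a kept instruction become live in turn.
            live.update(a for a in args if isinstance(a, str))
    return kept[::-1]


# "orphan" has no incoming edge from the entry block, so it is unreachable.
cfg = {
    "entry": {"succ": ["exit"]},
    "orphan": {"succ": ["exit"]},
    "exit": {"succ": []},
}
reachable_cfg = remove_unreachable(cfg, "entry")

# "c" is computed but never used, so it is dead.
code = [
    ("a", "input", ()),
    ("b", "mul", ("a", "a")),
    ("c", "add", ("a", "a")),
    ("d", "add", ("b", "a")),
]
live_code = remove_dead(code, outputs=["d"])
```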
| 62 | + |
| 63 | +### Constant Propagation and Constant Folding |
| 64 | + |
| 64 | +Constant propagation is a process that replaces variables whose |
| 65 | +values are known constants with those constants during compilation. |
| 66 | +Constant folding, on the other hand, evaluates expressions whose |
| 67 | +operands are all constants during compilation and substitutes the |
| 68 | +computed results for them. |
| 70 | +Figure :numref:`ch06/ch06-opt-pass-constant-broadcast` depicts these two |
| 71 | +methods. |
| 72 | + |
| 73 | + |
| 74 | +:label:`ch06/ch06-opt-pass-constant-broadcast` |
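
The two methods are often applied together, since propagating a constant can expose a new expression to fold. The following is a minimal sketch on a hypothetical straight-line IR of `(dest, op, operands)` tuples; the `OPS` table and the function name are assumptions made for the example, and constants are assumed to be numeric.

```python
import operator

# Hypothetical set of foldable, side-effect-free operations.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def propagate_and_fold(instrs):
    consts = {}  # variable name -> known constant value
    out = []
    for dest, op, args in instrs:
        # Constant propagation: replace variables that hold known constants.
        args = tuple(consts.get(a, a) if isinstance(a, str) else a for a in args)
        if op == "const":
            consts[dest] = args[0]
            out.append((dest, op, args))
        elif op in OPS and all(not isinstance(a, str) for a in args):
            # Constant folding: every operand is constant, so compute now.
            value = OPS[op](*args)
            consts[dest] = value
            out.append((dest, "const", (value,)))
        else:
            consts.pop(dest, None)  # dest is no longer a known constant
            out.append((dest, op, args))
    return out


code = [
    ("t", "input", ()),
    ("x", "const", (2,)),
    ("y", "const", (3,)),
    ("z", "mul", ("x", "y")),   # foldable once x and y are propagated
    ("w", "add", ("z", "t")),   # only z can be propagated; t is unknown
]
folded = propagate_and_fold(code)
```

After this pass the definitions of `x` and `y` no longer have users, so a subsequent dead code elimination pass could remove them.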
| 75 | + |
| 76 | +### Common Subexpression Elimination |
| 77 | + |
| 77 | +To understand what common subexpression elimination entails, |
| 78 | +consider the following: if an expression E has already been computed |
| 79 | +and the values of all its variables remain unchanged since that |
| 80 | +computation, a later occurrence of E is a common subexpression. This |
| 81 | +concept is visualized in |
| 82 | +Figure :numref:`ch06/ch06-opt-pass-CSE`. As such, E doesn't need to be |
| 83 | +computed again; it can be directly replaced with the result |
| 84 | +obtained from the preceding computation. |
| 86 | + |
| 87 | + |
| 88 | +:label:`ch06/ch06-opt-pass-CSE` |
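
One simple way to realize this idea is to remember each expression the first time it is computed and reuse its result afterwards. The sketch below assumes a hypothetical straight-line IR of `(dest, op, operands)` tuples in which every variable is assigned exactly once and every operation is side-effect free; the function name and IR layout are illustrative only.

```python
def eliminate_common_subexpressions(instrs):
    seen = {}   # (op, operands) -> variable that already holds the result
    alias = {}  # eliminated variable -> variable that replaces it
    out = []
    for dest, op, args in instrs:
        # Rewrite operands that refer to already-eliminated variables.
        args = tuple(alias.get(a, a) for a in args)
        if op == "input":
            # Distinct inputs are never merged, even though they look alike.
            out.append((dest, op, args))
            continue
        key = (op, args)
        if key in seen:
            alias[dest] = seen[key]  # reuse the earlier result
        else:
            seen[key] = dest
            out.append((dest, op, args))
    return out, alias


code = [
    ("a", "input", ()),
    ("b", "input", ()),
    ("t1", "add", ("a", "b")),
    ("t2", "add", ("a", "b")),   # same expression as t1
    ("r", "mul", ("t1", "t2")),
]
optimized, replaced = eliminate_common_subexpressions(code)
```

Because each variable is assigned only once in this toy IR, matching on the operation and operands is enough to guarantee that the operand values have not changed between the two computations.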
| 89 | + |
| 90 | +Common subexpression elimination, like the elimination of dead code and |
| 91 | +unreachable code, is typically carried out during the graph conversion |
| 92 | +process. In PyTorch, TorchScript provides a dedicated API for common |
| 93 | +subexpression elimination. This is a natural fit, because the |
| 94 | +TorchScript graph representation makes common subexpressions |
| 95 | +straightforward to identify. |