The prototype of the fast tier-1 JIT compiler (#283) introduces a sampling-based profiler and a tier-up optimization strategy. These rest on two fundamental components: 1) a well-balanced multi-tiered recompilation system, and 2) a profiling system that is both reliable and low-overhead.
Initially, all RISC-V instructions are executed by an optimizing interpreter. Each block has a counter that tracks function calls and loop iterations, starting from a preset threshold. When the interpreter detects a loop's backward branch, it assesses the loop iteration count and adjusts the counter decrement based on this count. If the iteration count is substantial, JIT compilation proceeds immediately instead of waiting for the counter to deplete fully.
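The counter scheme described above could look roughly like the following. This is a minimal sketch, not the actual implementation: the names `block_profile_t`, `on_block_entry`, and `on_backward_branch`, the two threshold values, and the choice to decrement the counter by the iteration count itself are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values; the real thresholds are not stated in this issue. */
#define TIER_UP_THRESHOLD 4096   /* initial counter value for every block */
#define HOT_LOOP_ITERATIONS 1024 /* iteration count treated as "substantial" */

typedef struct {
    int32_t counter; /* counts down from TIER_UP_THRESHOLD */
    bool compiled;   /* set once the block has been handed to the JIT */
} block_profile_t;

void block_profile_init(block_profile_t *p)
{
    p->counter = TIER_UP_THRESHOLD;
    p->compiled = false;
}

/* Called on every block entry (function call).
 * Returns true exactly once, when the block should be JIT-compiled. */
bool on_block_entry(block_profile_t *p)
{
    if (p->compiled)
        return false;
    if (--p->counter <= 0) {
        p->compiled = true;
        return true;
    }
    return false;
}

/* Called when the interpreter sees a backward branch. loop_iters is the
 * assessed iteration count of the loop (how it is obtained is out of scope). */
bool on_backward_branch(block_profile_t *p, uint32_t loop_iters)
{
    if (p->compiled)
        return false;

    /* A substantial iteration count triggers compilation immediately. */
    if (loop_iters >= HOT_LOOP_ITERATIONS) {
        p->compiled = true;
        return true;
    }

    /* Otherwise scale the decrement with the iteration count (assumed
     * policy: one unit per iteration, at least one unit per branch). */
    uint32_t step = loop_iters ? loop_iters : 1;
    p->counter -= (int32_t) step;
    if (p->counter <= 0) {
        p->compiled = true;
        return true;
    }
    return false;
}
```

In this sketch the interpreter would call `on_block_entry` when it enters a block and `on_backward_branch` when it takes a loop's back edge, and would pass the block to the JIT whenever either function returns true.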
     ______     ____________       ___________       ___________       _________
    |      |   |            |     |           |     |           |     |         |
    | RV32 |   |  Control   |     |  LLVM IR  |     | Optimized |     | Machine |
    |  IR  |-->| Flow Graph |-*->| (initial) |-*->|  LLVM IR  |-*->|  code   |
    |______|   |____________| |  |___________| |  |___________| |  |_________|
                              |                |                |
                              | Link           | Optimize       | JIT compile
                        ______|_______      ___|____         ___|___
                       |              |    |        |       |       |
                       |              |    |  LLVM  |       | LLVM  |
                       | LLVM Bitcode |    | Passes |       | MCJIT |
                       |______________|    |________|       |_______|
The tier-2 optimizing JIT compiler applies further optimizations: aggressive block inlining, an expanded range of dataflow optimizations, and higher code quality through an additional code generation pass. The dataflow optimizations include techniques such as copy propagation, array bound check elimination, and dead code elimination. Additional optimizations at this level include escape analysis, code scheduling, and DAG-based optimizations such as loop versioning and pre-pass code scheduling, with an increased maximum iteration count for dataflow-based optimizations. With this tiered approach, the blocks that need the most optimization receive it in a timely manner.
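To make two of the named dataflow optimizations concrete, here is a toy illustration of copy propagation followed by dead code elimination on straight-line code. It is only a sketch under stated assumptions: the three-address IR (`insn_t`, `OP_COPY`, `OP_ADD`), the fixed register count, and both function names are hypothetical, and the tier-2 compiler would rely on LLVM passes rather than hand-written code like this.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_REGS 8 /* hypothetical size of the virtual register file */

typedef enum { OP_COPY, OP_ADD } opcode_t;

typedef struct {
    opcode_t op;
    int dst, src1, src2; /* src2 is ignored for OP_COPY */
    bool dead;           /* set by eliminate_dead_code() */
} insn_t;

/* Forward pass: replace each operand by the register it is a copy of. */
void copy_propagate(insn_t *code, size_t n)
{
    int copy_of[NUM_REGS];
    for (int r = 0; r < NUM_REGS; r++)
        copy_of[r] = r; /* every register starts as its own source */

    for (size_t i = 0; i < n; i++) {
        insn_t *in = &code[i];
        in->src1 = copy_of[in->src1];
        if (in->op == OP_ADD)
            in->src2 = copy_of[in->src2];

        /* Writing dst invalidates every copy relation that refers to it. */
        for (int r = 0; r < NUM_REGS; r++)
            if (copy_of[r] == in->dst)
                copy_of[r] = r;
        copy_of[in->dst] = (in->op == OP_COPY) ? in->src1 : in->dst;
    }
}

/* Backward pass: mark instructions whose result is never used.
 * live_out[r] is true if register r is needed after the block. The toy IR
 * has no side effects, so an unused result means the instruction is dead. */
void eliminate_dead_code(insn_t *code, size_t n, const bool live_out[NUM_REGS])
{
    bool live[NUM_REGS];
    for (int r = 0; r < NUM_REGS; r++)
        live[r] = live_out[r];

    for (size_t i = n; i-- > 0;) {
        insn_t *in = &code[i];
        if (!live[in->dst]) {
            in->dead = true;
            continue;
        }
        live[in->dst] = false;
        live[in->src1] = true;
        if (in->op == OP_ADD)
            live[in->src2] = true;
    }
}
```

For the sequence `r1 = r0; r2 = r1 + r1; r3 = r1 + r2` with only `r3` live afterwards, copy propagation rewrites the uses of `r1` to `r0`, after which the copy into `r1` has no remaining users and is marked dead.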
Quotes from Design and Evaluation of Dynamic Optimizations for a Java Just-In-Time Compiler:
- Our system employs a dataflow-based intra-method region selection algorithm that uses both static heuristics and dynamic profiles, and then integrates the algorithm into the inlining process to extract effective inter-procedural regions. Our empirical evaluation demonstrates that our system can improve the performance and reduce the compilation overhead significantly.
- The third IR is called a directed acyclic graph (DAG). This is also a register-based representation and is in the form of static single assignment (SSA). ... This simply indicates how the DAG representation looks and ignores all of the exception checks necessary for the array accesses within the loop. The actual DAG includes the nodes for exception checking instructions and the edges representing exception dependencies.
- To minimize the bottom-line overhead, this profiler focuses only on detecting hot methods and does not collect other information such as the call context of the given method. Instead, additional information is collected by a different profiler for only selected hot methods.
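The low-overhead profiler in the last quote records only which code is hot and nothing else. A minimal sketch of that idea, adapted to blocks identified by their entry address, might look like this. The names `sample_profiler_t`, `profiler_reset`, and `profiler_sample`, the table size, the hot threshold, and the direct-mapped eviction policy are all assumptions for illustration; the sampling trigger itself (for example a periodic timer) is left out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SAMPLE_SLOTS 256   /* hypothetical table size, must be a power of two */
#define HOT_SAMPLE_COUNT 8 /* hypothetical number of samples that marks a block hot */

typedef struct {
    uint32_t pc[SAMPLE_SLOTS];   /* block entry address stored in each slot */
    uint32_t hits[SAMPLE_SLOTS]; /* samples seen for that address */
} sample_profiler_t;

void profiler_reset(sample_profiler_t *sp)
{
    memset(sp, 0, sizeof(*sp));
}

/* Record one sample for the block starting at pc. Only a hit count is kept,
 * with no call context. Returns true exactly when the block reaches the
 * hot threshold. */
bool profiler_sample(sample_profiler_t *sp, uint32_t pc)
{
    uint32_t slot = (pc >> 2) & (SAMPLE_SLOTS - 1);

    /* Direct-mapped table: a colliding address evicts the previous entry. */
    if (sp->pc[slot] != pc) {
        sp->pc[slot] = pc;
        sp->hits[slot] = 0;
    }
    return ++sp->hits[slot] == HOT_SAMPLE_COUNT;
}
```

Each sample costs one table lookup and one increment, which keeps the baseline overhead small. Richer information, such as call context, would be gathered by a separate profiler only for the blocks that this one reports as hot.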
The tier-2 JIT compiler will be built on LLVM, specifically its ORC lazy JIT feature, to take advantage of LLVM's support for efficient, dynamic JIT compilation. Sample code:
Reference: