|
1 | 1 | --- |
2 | | -title: BOLT overview |
| 2 | +title: Overview |
3 | 3 | weight: 2 |
4 | 4 |
|
5 | 5 | ### FIXED, DO NOT MODIFY |
6 | 6 | layout: learningpathall |
7 | 7 | --- |
8 | 8 |
|
9 | | -[BOLT](https://github.com/llvm/llvm-project/blob/main/bolt/README.md) is a post-link binary optimizer that uses Linux Perf data to re-order the executable code layout to reduce memory overhead and improve performance. |
| 9 | +## What is BOLT? |
10 | 10 |
|
11 | | -Make sure you have [BOLT](/install-guides/bolt/) and [Linux Perf](/install-guides/perf/) installed. |
| 11 | +[BOLT](https://github.com/llvm/llvm-project/blob/main/bolt/README.md) is a post-link binary optimizer that uses profiling data from [Linux Perf](/install-guides/perf/) to identify frequently executed functions and basic blocks. Based on this data, BOLT reorders code to improve instruction cache locality, reduce branch mispredictions, and shorten critical execution paths. |
12 | 12 |
|
13 | | -You should use an Arm Linux system with at least 8 CPUs and 16 Gb of RAM. Ubuntu 24.04 is used for testing, but other Linux distributions are possible. |
| 13 | +This often results in faster startup times, fewer CPU cycles per instruction (CPI), and improved throughput, especially for large, performance-sensitive applications like databases, web servers, or system daemons. |
14 | 14 |
|
15 | | -## What will I do in this Learning Path? |
| 15 | +{{% notice Note %}} |
| 16 | +BOLT complements compiler optimizations like LTO (Link-Time Optimization) and PGO (Profile-Guided Optimization). It runs after linking, which gives it visibility into the final binary layout that traditional compiler optimizations lack. |
| 17 | +{{% /notice %}} |
16 | 18 |
|
17 | | -In this Learning Path you learn how to use BOLT to optimize applications and shared libraries. MySQL is used as the application and two share libraries which are used by MySQL are also optimized using BOLT. |
| 19 | +Before you begin, ensure that you have the following installed: |
18 | 20 |
|
19 | | -Here is an outline of the steps: |
| 21 | +- [BOLT](/install-guides/bolt/) |
| 22 | +- [Linux Perf](/install-guides/perf/) |
20 | 23 |
|
21 | | -1. Collect and merge BOLT profiles from multiple workloads, such as read-only and write-only |
| 24 | +You should use an Arm-based Linux system with at least 8 CPUs and 16 GB of RAM. This Learning Path was tested on Ubuntu 24.04, but other Linux distributions are also supported. |
22 | 25 |
|
23 | | - A read-only workload typically involves operations that only retrieve or query data, such as running SELECT statements in a database without modifying any records. In contrast, a write-only workload focuses on operations that modify data, such as INSERT, UPDATE, or DELETE statements. Profiling both types ensures that the optimized binary performs well under different usage patterns. |
| 26 | +## What will I do in this Learning Path? |
24 | 27 |
|
25 | | -2. Independently optimize application binaries and external user-space libraries, such as `libssl.so` and `libcrypto.so` |
| 28 | +In this Learning Path, you'll learn how to use BOLT to optimize both applications and shared libraries. You'll walk through a real-world example using MySQL and two of its dependent libraries: |
26 | 29 |
|
27 | | - This means you can apply BOLT optimizations not just to your main application, but also to shared libraries it depends on, resulting in a more comprehensive performance improvement across your entire stack. |
| 30 | +- `libssl.so` |
| 31 | +- `libcrypto.so` |
28 | 32 |
|
29 | | -3. Merge profile data for broader code coverage |
| 33 | +You will: |
30 | 34 |
|
31 | | - By combining the profile data collected from different workloads and libraries, you create a single, comprehensive profile that represents a wide range of application behaviors. This merged profile allows BOLT to optimize code paths that are exercised under different scenarios, leading to better overall performance and coverage than optimizing for a single workload. |
| 35 | +- **Collect and merge BOLT profiles from multiple workloads, such as read-only and write-only** - a read-only workload typically involves operations that only retrieve or query data, such as running SELECT statements in a database without modifying any records. In contrast, a write-only workload focuses on operations that modify data, such as INSERT, UPDATE, or DELETE statements. Profiling both types ensures that the optimized binary performs well under different usage patterns. |
32 | 36 |
|
33 | | -4. Run BOLT on each binary application and library |
| 37 | +- **Independently optimize application binaries and external user-space libraries, such as `libssl.so` and `libcrypto.so`** - this means that you can apply BOLT optimizations not just to your main application, but also to the shared libraries it depends on, resulting in a more comprehensive performance improvement across your entire stack. |
34 | 38 |
|
35 | | - With the merged profile, you apply BOLT optimizations separately to each binary and shared library. This step ensures that both your main application and its dependencies are optimized based on real-world usage patterns, resulting in a more efficient and responsive software stack. |
| 39 | +- **Merge profile data for broader code coverage** - by combining the profile data collected from different workloads and libraries, you create a single, comprehensive profile that represents a wide range of application behaviors. This merged profile allows BOLT to optimize code paths that are exercised under different scenarios, leading to better overall performance and coverage than optimizing for a single workload. |
36 | 40 |
|
37 | | -5. Link the final optimized binary with the separately optimized libraries to deploy a fully optimized runtime stack |
| 41 | +- **Run BOLT on each binary application and library** - with the merged profile, you apply BOLT optimizations separately to each binary and shared library. This step ensures that both your main application and its dependencies are optimized based on real-world usage patterns, resulting in a more efficient and responsive software stack. |
38 | 42 |
|
39 | | - After optimizing each component, you combine them to create a deployment where both the application and its libraries benefit from BOLT's enhancements. |
| 43 | +- **Link the final optimized binary with the separately optimized libraries to deploy a fully optimized runtime stack** - after optimizing each component, you combine them to create a deployment where both the application and its libraries benefit from BOLT's enhancements. |
40 | 44 |
|
41 | 45 | ## What is BOLT profile merging? |
42 | 46 |
|
43 | | -BOLT profile merging is the process of combining profiling from multiple runs into a single profile. This merged profile enables BOLT to optimize binaries for a broader set of real-world behaviors, ensuring that the final optimized application or library performs well across diverse workloads, not just a single use case. By merging profiles, you capture a wider range of code paths and execution patterns, leading to more robust and effective optimizations. |
44 | | - |
45 | | - |
46 | | - |
47 | | -## What are good applications for BOLT? |
48 | | - |
49 | | -MySQL and Sysbench are used as example applications, but you can use this method for any feature-rich application that: |
| 47 | +BOLT profile merging combines profiling data from multiple runs into one unified profile. This merged profile enables BOLT to optimize binaries for a broader set of real-world behaviors, ensuring that the final optimized application or library performs well across diverse workloads, not just a single use case. By merging profiles, you capture a wider range of code paths and execution patterns, leading to more robust and effective optimizations. |
50 | 48 |
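As a rough illustration of the idea, the sketch below merges two toy profiles by summing execution counts per code location. It is a conceptual model only: the function names, block labels, and counts are hypothetical, and the dictionaries do not reflect BOLT's real profile format. In practice you would merge profiles with BOLT's own tooling, such as the `merge-fdata` utility, rather than with custom code.

```python
from collections import Counter


def merge_profiles(*profiles):
    """Sum execution counts for each code location across all profiles."""
    merged = Counter()
    for profile in profiles:
        # Counter.update adds counts for keys that already exist.
        merged.update(profile)
    return merged


# Hypothetical sample counts keyed by (function, basic block).
# These are illustrative values, not real BOLT or Perf data.
read_only = {("handle_select", "bb0"): 900, ("parse_query", "bb0"): 400}
write_only = {("handle_insert", "bb0"): 700, ("parse_query", "bb0"): 350}

merged = merge_profiles(read_only, write_only)

# Code locations ordered from hottest to coldest in the merged view.
hottest = [location for location, _ in merged.most_common()]
```

Here `parse_query` is moderately hot in each workload on its own, but the merged view shows its combined weight, and both `handle_select` and `handle_insert` remain visible. A profile from a single workload would miss one of them entirely.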
|
51 | | -1. Exhibits multiple runtime paths |
| 49 | + |
52 | 50 |
|
53 | | - Applications often have different code paths depending on the workload or user actions. Optimizing for just one path can leave performance gains untapped in others. By profiling and merging data from various workloads, you ensure broader optimization coverage. |
| 51 | +## What types of applications benefit from BOLT? |
54 | 52 |
|
55 | | -2. Uses dynamic libraries |
| 53 | +Although this Learning Path uses MySQL and Sysbench as examples, you can apply the same method to any feature-rich application that: |
56 | 54 |
|
57 | | - Most modern applications rely on shared libraries for functionality. Optimizing these libraries alongside the main binary ensures consistent performance improvements throughout the application. |
| 55 | +- **Exhibits multiple runtime paths** - applications often have different code paths depending on the workload or user actions. Optimizing for just one path can leave performance gains untapped in others. By profiling and merging data from various workloads, you ensure broader optimization coverage. |
58 | 56 |
|
59 | | -3. Requires full-stack binary optimization for performance-critical deployment |
| 55 | +- **Uses dynamic libraries** - most modern applications rely on shared libraries for functionality. Optimizing shared libraries alongside the main binary ensures consistent performance across your stack. |
60 | 58 |
|
61 | | - In scenarios where every bit of performance matters, such as high-throughput servers or latency-sensitive applications, optimizing the entire binary stack can yield significant benefits. |
| 59 | +- **Requires full-stack binary optimization for performance-critical deployment** - in scenarios where every bit of performance matters, such as high-throughput servers or latency-sensitive applications, optimizing the entire binary stack can yield significant benefits. |
62 | 60 |
|
63 | 61 |
|