Commit 9ef37c4

Merge branch 'ArmDeveloperEcosystem:main' into main
2 parents 6595e84 + cbb1a49 commit 9ef37c4

16 files changed, +324 -189 lines changed
content/learning-paths/servers-and-cloud-computing/bolt-merge/_index.md

Lines changed: 0 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -1,10 +1,6 @@
11
---
22
title: Optimize Arm applications and shared libraries with BOLT
33

4-
draft: true
5-
cascade:
6-
draft: true
7-
84
minutes_to_complete: 30
95

106
who_is_this_for: Performance engineers and software developers working on Arm platforms who want to optimize both application binaries and shared libraries using BOLT.

content/learning-paths/servers-and-cloud-computing/bolt-merge/how-to-1.md

Lines changed: 11 additions & 4 deletions
@@ -10,11 +10,13 @@ layout: learningpathall
1010

1111
Make sure you have [BOLT](/install-guides/bolt/) and [Linux Perf](/install-guides/perf/) installed.
1212

13-
You should use an Arm Linux system with at least 4 CPUs and 16 Gb of RAM. Ubuntu 24.04 is used for testing, but other Linux distributions are possible.
13+
You should use an Arm Linux system with at least 8 CPUs and 16 GB of RAM. Ubuntu 24.04 is used for testing, but other Linux distributions are possible.
1414

1515
## What will I do in this Learning Path?
1616

17-
In this Learning Path you learn how to use BOLT to optimize applications and shared libraries. MySQL is used as the applcation and two share libraries which are used by MySQL are also optimized using BOLT.
17+
In this Learning Path you learn how to use BOLT to optimize applications and shared libraries. MySQL is used as the application, and two shared libraries used by MySQL are also optimized using BOLT.
18+
19+
Here is an outline of the steps:
1820

1921
1. Collect and merge BOLT profiles from multiple workloads, such as read-only and write-only
2022

@@ -36,18 +38,23 @@ In this Learning Path you learn how to use BOLT to optimize applications and sha
3638

3739
After optimizing each component, you combine them to create a deployment where both the application and its libraries benefit from BOLT's enhancements.
3840

41+
## What is BOLT profile merging?
42+
43+
BOLT profile merging is the process of combining profiling from multiple runs into a single profile. This merged profile enables BOLT to optimize binaries for a broader set of real-world behaviors, ensuring that the final optimized application or library performs well across diverse workloads, not just a single use case. By merging profiles, you capture a wider range of code paths and execution patterns, leading to more robust and effective optimizations.
44+
45+
![Why BOLT Profile Merging?](Bolt-merge.png)
3946

4047
## What are good applications for BOLT?
4148

42-
MySQL and sysbench are used as example applications, but you can use this method for **any feature-rich application** that:
49+
MySQL and Sysbench are used as example applications, but you can use this method for any feature-rich application that:
4350

4451
1. Exhibits multiple runtime paths
4552

4653
Applications often have different code paths depending on the workload or user actions. Optimizing for just one path can leave performance gains untapped in others. By profiling and merging data from various workloads, you ensure broader optimization coverage.
4754

4855
2. Uses dynamic libraries
4956

50-
Many modern applications rely on shared libraries for functionality. Optimizing these libraries alongside the main binary ensures consistent performance improvements throughout the application.
57+
Most modern applications rely on shared libraries for functionality. Optimizing these libraries alongside the main binary ensures consistent performance improvements throughout the application.
5158

5259
3. Requires full-stack binary optimization for performance-critical deployment
5360

content/learning-paths/servers-and-cloud-computing/bolt-merge/how-to-2.md

Lines changed: 137 additions & 30 deletions
@@ -1,5 +1,5 @@
11
---
2-
title: BOLT Optimization - First feature
2+
title: Instrument MySQL with BOLT
33
weight: 3
44

55
### FIXED, DO NOT MODIFY
@@ -10,33 +10,92 @@ In this step, you will use BOLT to instrument the MySQL application binary and t
1010

1111
The collected profiles will be merged with others and used to optimize the application's code layout.
1212

13-
### Build the uninstrumented binary
13+
## Build mysqld from source
1414

15-
Make sure your application binary is:
15+
Follow these steps to build the MySQL server (`mysqld`) from source:
1616

17-
- Built from source (e.g., `mysqld`)
17+
Install the required dependencies:
18+
19+
```bash
20+
sudo apt update
21+
sudo apt install -y build-essential cmake libncurses5-dev libssl-dev libboost-all-dev bison pkg-config libaio-dev libtirpc-dev git
22+
```
23+
24+
Download the MySQL source code. You can change to another version in the `checkout` command below if needed.
25+
26+
```bash
27+
git clone https://github.com/mysql/mysql-server.git
28+
cd mysql-server
29+
git checkout mysql-8.4.5
30+
```
31+
32+
Configure the build for debug:
33+
34+
```bash
35+
mkdir build && cd build
36+
cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_DEBUG=1 -DCMAKE_C_FLAGS="-fno-omit-frame-pointer" \
37+
-DCMAKE_CXX_FLAGS="-fno-omit-frame-pointer" -DCMAKE_POSITION_INDEPENDENT_CODE=OFF \
38+
-DCMAKE_EXE_LINKER_FLAGS="-Wl,--emit-relocs -no-pie"
40+
```
41+
42+
Build mysqld:
43+
44+
```bash
45+
make -j$(nproc)
46+
```
47+
48+
After the build completes, the `mysqld` binary is located at `$HOME/mysql-server/build/runtime_output_directory/mysqld`.
49+
50+
{{% notice Note %}}
51+
You can run `mysqld` directly from the build directory as shown, or run `make install` to install it system-wide. For testing and instrumentation, running from the build directory is usually preferred.
52+
{{% /notice %}}
53+
54+
After building mysqld, install MySQL server and client utilities system-wide:
55+
56+
```bash
57+
sudo make install
58+
```
59+
60+
This will make the `mysql` client and other utilities available in your PATH.
61+
62+
Ensure the binary is unstripped and includes debug symbols for BOLT instrumentation.
63+
64+
To work with BOLT, your application binary should be:
65+
66+
- Built from source
1867
- Unstripped, with symbol information available
1968
- Compiled with frame pointers enabled (`-fno-omit-frame-pointer`)
2069

2170
You can verify this with:
2271

2372
```bash
24-
readelf -s /path/to/mysqld | grep main
73+
readelf -s $HOME/mysql-server/build/runtime_output_directory/mysqld | grep main
74+
```
75+
76+
The partial output is:
77+
78+
```output
79+
23837: 000000000950dfe8 8 OBJECT GLOBAL DEFAULT 27 mysql_main
80+
34522: 000000000915bfd0 8 OBJECT GLOBAL DEFAULT 26 server_main_callback
81+
42773: 00000000051730e4 80 FUNC GLOBAL DEFAULT 13 _Z18my_main_thre[...]
82+
44882: 000000000357dc98 40 FUNC GLOBAL DEFAULT 13 main
83+
61046: 0000000005ffd5c0 40 FUNC GLOBAL DEFAULT 13 _Z21record_main_[...]
2584
```
2685

2786
If the symbols are missing, rebuild the binary with debug info and no stripping.
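
The checks described above can be bundled into a small helper. This is a minimal sketch, not part of the Learning Path: the function name `check_bolt_ready` is hypothetical, and it assumes that a binary linked with `-Wl,--emit-relocs` keeps a `.rela.text` section and that an unstripped binary keeps `.symtab`.

```bash
# Hypothetical helper: report whether an ELF binary looks ready for BOLT.
# Returns 2 if the file is missing, 1 if a check fails, 0 otherwise.
check_bolt_ready() {
  local bin="$1"
  if [ ! -f "$bin" ]; then
    echo "missing: $bin" >&2
    return 2
  fi
  # An unstripped binary keeps its .symtab section.
  if ! readelf -S "$bin" | grep -q '\.symtab'; then
    echo "stripped: no .symtab in $bin" >&2
    return 1
  fi
  # Assumption: --emit-relocs leaves relocation sections such as .rela.text.
  if ! readelf -S "$bin" | grep -q '\.rela\.text'; then
    echo "no .rela.text in $bin: was --emit-relocs passed to the linker?" >&2
    return 1
  fi
  echo "ok: $bin"
}

# Example (not run here):
# check_bolt_ready "$HOME/mysql-server/build/runtime_output_directory/mysqld"
```

If the relocation check fails, BOLT may still run, but it typically has less freedom to rearrange code.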
2887

29-
### Step 2: Instrument the binary with BOLT
88+
## Instrument the binary with BOLT
3089

3190
Use `llvm-bolt` to create an instrumented version of the binary:
3291

3392
```bash
34-
llvm-bolt /path/to/mysqld \\
35-
-instrument \\
36-
-o /path/to/mysqld.instrumented \\
37-
--instrumentation-file=/path/to/profile-readonly.fdata \\
38-
--instrumentation-sleep-time=5 \\
39-
--instrumentation-no-counters-clear \\
93+
llvm-bolt $HOME/mysql-server/build/runtime_output_directory/mysqld \
94+
-instrument \
95+
-o $HOME/mysql-server/build/runtime_output_directory/mysqld.instrumented \
96+
--instrumentation-file=$HOME/mysql-server/build/profile-readonly.fdata \
97+
--instrumentation-sleep-time=5 \
98+
--instrumentation-no-counters-clear \
4099
--instrumentation-wait-forks
41100
```
42101

@@ -46,38 +105,86 @@ llvm-bolt /path/to/mysqld \\
46105
- `--instrumentation-file`: Path where the profile output will be saved
47106
- `--instrumentation-wait-forks`: Ensures the instrumentation continues through forks (important for daemon processes)
48107

49-
---
50108

51-
### Step 3: Run the instrumented binary under a feature-specific workload
109+
## Start the instrumented MySQL server
110+
111+
Before running the workload, start the instrumented MySQL server in a separate terminal. You may need to initialize a new data directory if this is your first run:
112+
113+
```bash
114+
# Initialize a new data directory (if needed)
115+
$HOME/mysql-server/build/runtime_output_directory/mysqld.instrumented --initialize-insecure --datadir=$HOME/mysql-bolt-data
116+
117+
# Start the instrumented server
118+
# On an 8-core system, use available cores (e.g., 6 for mysqld, 7 for sysbench)
119+
taskset -c 6 $HOME/mysql-server/build/runtime_output_directory/mysqld.instrumented \
120+
--datadir=$HOME/mysql-bolt-data \
121+
--socket=$HOME/mysql-bolt.sock \
122+
--port=3306 \
123+
--user=$(whoami) &
124+
```
125+
126+
Adjust `--datadir`, `--socket`, and `--port` as needed for your environment. Make sure the server is running and accessible before proceeding.
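
One way to confirm the server is reachable before continuing is to poll it. The sketch below is an illustration only: `retry_until` is a hypothetical helper, and the commented example assumes `mysqladmin` was installed alongside the `mysql` client and that the socket path matches the one used to start the server.

```bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: retry_until <attempts> <delay-seconds> <command> [args...]
retry_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # Succeed as soon as the command exits with status 0.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example (not run here): wait up to 30 seconds for the server to answer a ping.
# retry_until 30 1 mysqladmin --socket="$HOME/mysql-bolt.sock" -u root ping
```

An instrumented server can take longer to start than a regular build, so a generous attempt count is a reasonable default.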
127+
128+
## Install sysbench
129+
130+
You will need sysbench to generate workloads for MySQL. On most Arm Linux distributions, you can install it using your package manager:
131+
132+
```bash
133+
sudo apt update
134+
sudo apt install -y sysbench
135+
```
136+
137+
Alternatively, see the [sysbench GitHub page](https://github.com/akopytov/sysbench) for build-from-source instructions if a package is not available for your platform.
138+
139+
## Create a test database and user
140+
141+
For sysbench to work, you need a test database and user. Connect to the MySQL server as the root user (or another admin user) and run:
142+
143+
```bash
144+
mysql -u root --socket=$HOME/mysql-bolt.sock
145+
```
146+
147+
Then, in the MySQL shell:
148+
149+
```sql
150+
CREATE DATABASE IF NOT EXISTS bench;
151+
CREATE USER IF NOT EXISTS 'bench'@'localhost' IDENTIFIED BY 'bench';
152+
GRANT ALL PRIVILEGES ON bench.* TO 'bench'@'localhost';
153+
FLUSH PRIVILEGES;
154+
EXIT;
155+
```
156+
157+
## Run the instrumented binary under a feature-specific workload
52158

53159
Use a workload generator to stress the binary in a feature-specific way. For example, to simulate **read-only traffic** with sysbench:
54160

55161
```bash
56-
taskset -c 9 ./src/sysbench \\
57-
--db-driver=mysql \\
58-
--mysql-host=127.0.0.1 \\
59-
--mysql-db=bench \\
60-
--mysql-user=bench \\
61-
--mysql-password=bench \\
62-
--mysql-port=3306 \\
63-
--tables=8 \\
64-
--table-size=10000 \\
65-
--threads=1 \\
66-
src/lua/oltp_read_only.lua run
162+
taskset -c 7 sysbench \
163+
--db-driver=mysql \
164+
--mysql-host=127.0.0.1 \
165+
--mysql-db=bench \
166+
--mysql-user=bench \
167+
--mysql-password=bench \
168+
--mysql-port=3306 \
169+
--tables=8 \
170+
--table-size=10000 \
171+
--threads=1 \
172+
/usr/share/sysbench/oltp_read_only.lua run
67173
```
68174

69-
> Adjust this command as needed for your workload and CPU/core binding.
175+
{{% notice Note %}}
176+
On an 8-core system, cores are numbered 0-7. Adjust the `taskset -c` values as needed for your system. Avoid using the same core for both mysqld and sysbench to reduce contention.
177+
{{% /notice %}}
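
If your system does not have exactly 8 cores, you can derive the two core IDs instead of hard-coding them. This is a small sketch under the assumption that the highest-numbered cores are free; `pick_cores` is a hypothetical helper, not a standard tool.

```bash
# Hypothetical helper: print the two highest-numbered core IDs.
# With no argument it uses the core count reported by nproc.
pick_cores() {
  local n="${1:-$(nproc)}"
  if [ "$n" -lt 2 ]; then
    echo "need at least 2 cores" >&2
    return 1
  fi
  echo "$((n - 2)) $((n - 1))"
}

pick_cores 8   # prints: 6 7
```

The first value can be passed to `taskset -c` for `mysqld.instrumented` and the second for `sysbench`.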
70178

71-
The `.fdata` file defined in `--instrumentation-file` will be populated with runtime execution data.
72179

73-
---
180+
The `.fdata` file defined in `--instrumentation-file` will be populated with runtime execution data.
74181

75-
### Step 4: Verify the profile was created
182+
## Verify the profile was created
76183

77184
After running the workload:
78185

79186
```bash
80-
ls -lh /path/to/profile-readonly.fdata
187+
ls -lh $HOME/mysql-server/build/profile-readonly.fdata
81188
```
82189

83190
You should see a non-empty file. This file will later be merged with other profiles (e.g., for write-only traffic) to generate a complete merged profile.
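
The same check can be scripted so that later steps stop early when a profile is missing. This is a minimal sketch; `profiles_ready` is a hypothetical helper that relies only on the shell's `-s` test (file exists and is non-empty).

```bash
# Hypothetical helper: succeed only if every listed profile exists and is non-empty.
profiles_ready() {
  local f
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty profile: $f" >&2
      return 1
    fi
  done
  return 0
}

# Example (not run here):
# profiles_ready "$HOME/mysql-server/build/profile-readonly.fdata" && echo "profile looks usable"
```

A non-empty file does not guarantee the profile is representative; it only confirms that the instrumented binary wrote data.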

content/learning-paths/servers-and-cloud-computing/bolt-merge/how-to-3.md

Lines changed: 25 additions & 29 deletions
@@ -1,54 +1,53 @@
11
---
2-
title: BOLT Optimization - Second Feature & BOLT Merge to combine
2+
title: Run a new workload using BOLT and merge the results
33
weight: 4
44

55
### FIXED, DO NOT MODIFY
66
layout: learningpathall
77
---
88

9-
In this step, you'll collect profile data for a **write-heavy** workload and also **instrument external libraries** such as `libcrypto.so` and `libssl.so` used by the application (e.g., MySQL).
9+
Next, you will collect profile data for a **write-heavy** workload and merge it with the **read-only** profile collected in the previous section.
1010

11-
12-
### Step 1: Run Write-Only Workload for Application Binary
11+
## Run Write-Only Workload for Application Binary
1312

1413
Use the same BOLT-instrumented MySQL binary and drive it with a write-only workload to capture `profile-writeonly.fdata`:
1514

1615
```bash
17-
taskset -c 9 ./src/sysbench \\
18-
--db-driver=mysql \\
19-
--mysql-host=127.0.0.1 \\
20-
--mysql-db=bench \\
21-
--mysql-user=bench \\
22-
--mysql-password=bench \\
23-
--mysql-port=3306 \\
24-
--tables=8 \\
25-
--table-size=10000 \\
26-
--threads=1 \\
27-
src/lua/oltp_write_only.lua run
16+
# On an 8-core system, use available cores (e.g., 7 for sysbench)
17+
taskset -c 7 sysbench \
18+
--db-driver=mysql \
19+
--mysql-host=127.0.0.1 \
20+
--mysql-db=bench \
21+
--mysql-user=bench \
22+
--mysql-password=bench \
23+
--mysql-port=3306 \
24+
--tables=8 \
25+
--table-size=10000 \
26+
--threads=1 \
27+
/usr/share/sysbench/oltp_write_only.lua run
2828
```
2929

3030
The profile path is set when the binary is instrumented, so make sure the instrumented binary used for this run was created with `--instrumentation-file` pointing to `profile-writeonly.fdata`.
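
One possible way to keep the profiles separate is to create one instrumented binary per workload. The sketch below only prints the command so you can review it before running it; `bolt_instrument_cmd` and the `.instrumented-<workload>` naming are assumptions for illustration, while the `llvm-bolt` flags are the ones used earlier in this Learning Path.

```bash
# Hypothetical helper: print (do not run) an llvm-bolt instrumentation command
# for a given binary, workload name, and profile output directory.
bolt_instrument_cmd() {
  local bin="$1" workload="$2" outdir="$3"
  printf '%s\n' "llvm-bolt $bin -instrument -o $bin.instrumented-$workload --instrumentation-file=$outdir/profile-$workload.fdata --instrumentation-sleep-time=5 --instrumentation-no-counters-clear --instrumentation-wait-forks"
}

bolt_instrument_cmd "$HOME/mysql-server/build/runtime_output_directory/mysqld" \
  writeonly "$HOME/mysql-server/build"
```

Restart the server from the newly instrumented binary before driving the write-only workload, so that the data lands in the intended file.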
31-
---
32-
### Step 2: Verify the Second Profile Was Generated
31+
32+
33+
### Verify the Second Profile Was Generated
3334

3435
```bash
35-
ls -lh /path/to/profile-writeonly.fdata
36+
ls -lh $HOME/mysql-server/build/profile-writeonly.fdata
3637
```
3738

3839
Both `.fdata` files should now exist and contain valid data:
3940

4041
- `profile-readonly.fdata`
4142
- `profile-writeonly.fdata`
4243

43-
---
44-
45-
### Step 3: Merge the Feature Profiles
44+
### Merge the Feature Profiles
4645

4746
Use `merge-fdata` to combine the feature-specific profiles into one comprehensive `.fdata` file:
4847

4948
```bash
50-
merge-fdata /path/to/profile-readonly.fdata /path/to/profile-writeonly.fdata \\
51-
-o /path/to/profile-merged.fdata
49+
merge-fdata $HOME/mysql-server/build/profile-readonly.fdata $HOME/mysql-server/build/profile-writeonly.fdata \
50+
-o $HOME/mysql-server/build/profile-merged.fdata
5251
```
5352

5453
**Example command from an actual setup:**
@@ -67,18 +66,15 @@ Profile from 2 files merged.
6766

6867
This creates a single merged profile (`profile-merged.fdata`) covering both read-only and write-only workload behaviors.
6968

70-
---
71-
72-
### Step 4: Verify the Merged Profile
69+
### Verify the Merged Profile
7370

7471
Check the merged `.fdata` file:
7572

7673
```bash
77-
ls -lh /path/to/profile-merged.fdata
74+
ls -lh $HOME/mysql-server/build/profile-merged.fdata
7875
```
7976

80-
---
81-
### Step 5: Generate the Final Binary with the Merged Profile
77+
### Generate the Final Binary with the Merged Profile
8278

8379
Use LLVM-BOLT to generate the final optimized binary using the merged `.fdata` file:
8480
