content/learning-paths/servers-and-cloud-computing/tune-network-workloads-on-bare-metal/1_setup.md
+4 -4 (4 additions, 4 deletions)
@@ -8,7 +8,7 @@ layout: learningpathall

 ## Overview

-Tomcat is a common client–server web workload that serves HTTP/HTTPS requests. In this section, you will set up a benchmarking environment using Apache Tomcat (server) and `wrk2` (client) to generate load and measure performance on an Arm-based bare‑metal instance. This guide was validated on an AWS `c8g.metal‑48xl` instance running Ubuntu 24.04.
+Tomcat is a common client–server web workload that serves HTTP/HTTPS requests. In this section, you will set up a benchmarking environment using Apache Tomcat (server) and `wrk2` (client) to generate load and measure performance on an Arm-based bare‑metal instance. This Learning Path was validated on an AWS `c8g.metal‑48xl` instance running Ubuntu 24.04.

 ## Set up the Tomcat benchmark server
@@ -63,7 +63,7 @@ Allowing `.*` permits access from all IP addresses and should be used only in is
 ## Start the Tomcat server

 {{% notice Note %}}
-For maximum performance, ensure the per‑process limit for open file descriptors is high enough.
+For maximum performance, ensure the per‑process limit for open file descriptors is sufficient.
 {{% /notice %}}

 Start the server:
@@ -106,7 +106,7 @@ Ensure port **8080** is open in the security group or firewall for your Arm‑ba
 [Wrk2](https://github.com/giltene/wrk2) is a high-performance HTTP benchmarking tool specialized in generating constant throughput loads and measuring latency percentiles for web services. `wrk2` is an enhanced version of `wrk` that provides accurate latency statistics under controlled request rates, ideal for performance testing of HTTP servers.

 {{% notice Note %}}
-Currently, `wrk2` is only supported on **x86_64** machines. Run the client steps below on a bare‑metal x86_64 server running Ubuntu 24.04.
+Currently, `wrk2` is only supported on x86_64 machines. Run the client steps below on a bare‑metal x86_64 server running Ubuntu 24.04.
 {{% /notice %}}

 ## Install dependencies
@@ -140,7 +140,7 @@ sudo cp wrk /usr/local/bin
 As with Tomcat, set a high open‑files limit to avoid hitting FD caps during the run.
 {{% /notice %}}

-Benchmark the HelloWorld servlet running on Tomcat:
+Benchmark the `HelloWorld` servlet running on Tomcat:
content/learning-paths/servers-and-cloud-computing/tune-network-workloads-on-bare-metal/2_baseline.md
+65 -65 (65 additions, 65 deletions)
@@ -11,7 +11,7 @@ layout: learningpathall
 In this section, you establish a baseline configuration before applying advanced techniques to tune the performance of Tomcat-based network workloads on an Arm Neoverse bare-metal instance.

 {{% notice Note %}}
-To avoid running out of file descriptors under load, raise the file‑descriptor limit on **both** the server and the client:
+To avoid running out of file descriptors under load, raise the file‑descriptor limit on *both* the server and the client:
 ```bash
 ulimit -n 65535
 ```
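As a reviewer-style illustration of the note above, the following is a minimal sketch, not part of the Learning Path, that checks the soft open-files limit before a run and raises it only when it is too low. The helper name `ensure_nofile` is hypothetical; the sketch assumes a POSIX-style shell whose `ulimit` builtin accepts `-S` and `-H`, as bash does.

```bash
# Sketch only: `ensure_nofile` is a hypothetical helper, not part of Tomcat or wrk2.
# It checks the soft open-files limit of the current shell and raises it if needed.
ensure_nofile() {
  local target="$1"
  local current
  current="$(ulimit -Sn)"
  # "unlimited" already satisfies any target.
  if [ "$current" = "unlimited" ] || [ "$current" -ge "$target" ]; then
    echo "ok: open-files limit is $current"
    return 0
  fi
  # Raising the soft limit fails when the hard limit is below the target.
  if ulimit -Sn "$target" 2>/dev/null; then
    echo "raised: open-files limit is now $(ulimit -Sn)"
  else
    echo "failed: hard limit is $(ulimit -Hn)" >&2
    return 1
  fi
}
```

Call `ensure_nofile 65535` directly in the shell that later starts Tomcat or `wrk2`, not inside `$(...)`, because a limit raised in a subshell does not carry over to the parent shell.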
@@ -32,24 +32,24 @@ This baseline includes:
 If you are using a cloud image (for example, AWS) with non-default kernel parameters, align IOMMU settings with the Ubuntu defaults: `iommu.strict=1` and `iommu.passthrough=0`.
 {{% /notice %}}

-1.Edit GRUB and add (or update) `GRUB_CMDLINE_LINUX`:
+Edit GRUB and add (or update) `GRUB_CMDLINE_LINUX`:
-3. Verify that the default settings have been successfully applied:
+Verify that the default settings have been successfully applied:
 ```bash
 sudo dmesg | grep iommu
 ```
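A complementary check, sketched here under assumptions rather than taken from the Learning Path, is to read the same parameters from the kernel command line in `/proc/cmdline`. The helper name `iommu_param` is hypothetical. It prints `unset` when the parameter is absent, which means the kernel falls back to its built-in default.

```bash
# Sketch only: `iommu_param` is a hypothetical helper.
# Usage: iommu_param KEY [CMDLINE]; CMDLINE defaults to the running kernel's /proc/cmdline.
iommu_param() {
  local key="$1"
  local cmdline="${2:-$(cat /proc/cmdline)}"
  local value="unset"
  local token
  # Split the command line on whitespace; the last occurrence of a key wins.
  for token in $cmdline; do
    case "$token" in
      "$key"=*) value="${token#*=}" ;;
    esac
  done
  echo "$value"
}
```

For example, `iommu_param iommu.strict` and `iommu_param iommu.passthrough` report what the running kernel was booted with, and passing a string as the second argument lets you check a candidate command line before rebooting.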
@@ -63,23 +63,23 @@ You should see that under the default configuration, `iommu.strict` is enabled,
 ## Establish a baseline on Arm Neoverse bare-metal instances

 {{% notice Note %}}
-To mirror a typical Tomcat deployment and simplify tuning, keep **8 CPU cores online** and set the remaining cores offline. Adjust the CPU range to match your instance. The example below assumes 192 CPUs (as on AWS `c8g.metal-48xl`).
+To mirror a typical Tomcat deployment and simplify tuning, keep 8 CPU cores online and set the remaining cores offline. Adjust the CPU range to match your instance. The example below assumes 192 CPUs (as on AWS `c8g.metal-48xl`).
@@ -204,24 +204,24 @@ To minimize contention and context switching, align Tomcat’s CPU‑intensive t
 ...
 ```

-You’ll typically see **`http-nio-8080-e`** and **`http-nio-8080-P`** threads as CPUintensive. Because the **`http-nio-8080-P`** thread count is fixed at 1 (in current Tomcat releases), and you have 8 online CPU cores, set**`http-nio-8080-e`** to **7**.
+You’ll typically see `http-nio-8080-e` and `http-nio-8080-P` threads as CPU-intensive. Because the `http-nio-8080-P` thread count is fixed at 1 (in current Tomcat releases), and you have 8 online CPU cores, set `http-nio-8080-e` to 7.
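The arithmetic behind that value can be sketched as follows. This is an illustration only: the helper name `exec_threads_for` is hypothetical, and it assumes exactly one `http-nio-8080-P` thread, as the changed line states for current Tomcat releases.

```bash
# Sketch only: `exec_threads_for` is a hypothetical helper.
# It reserves one core for the single http-nio-8080-P thread and
# gives the remaining online cores to http-nio-8080-e threads.
exec_threads_for() {
  local online_cpus="$1"
  local poller_threads=1
  echo $(( online_cpus - poller_threads ))
}

exec_threads_for 8   # 8 online cores: prints 7
```

If you keep a different number of cores online, pass that count instead of 8.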

-2. Edit `server.xml` and update the HTTP connector to set the worker thread counts and connection limits:
+Edit `server.xml` and update the HTTP connector to set the worker thread counts and connection limits:

-```bash
+```bash
 vi ~/apache-tomcat-11.0.10/conf/server.xml
-```
+```

-Replace the existing connector:
-```xml
+Replace the existing connector:
+```xml
 <!-- Before -->
 <Connector port="8080" protocol="HTTP/1.1"
 connectionTimeout="20000"
 redirectPort="8443" />
-```
+```

-With the tuned settings:
-```xml
+With the tuned settings:
+```xml
 <!-- After -->
 <Connector port="8080" protocol="HTTP/1.1"
 connectionTimeout="20000"
@@ -231,25 +231,25 @@ To minimize contention and context switching, align Tomcat’s CPU‑intensive t
content/learning-paths/servers-and-cloud-computing/tune-network-workloads-on-bare-metal/5_iommu.md
+1 -1 (1 addition, 1 deletion)
@@ -8,7 +8,7 @@ layout: learningpathall

 ## Tune with IOMMU

-IOMMU (Input–Output Memory Management Unit) controls how I/O devices access memory. In many cloud environments, SmartNICs offload IOMMU-related work. On Arm Neoverse bare‑metal systems, you can often improve Tomcat networking performance by **disabling strict mode** and **enabling passthrough** (setting `iommu.strict=0` and `iommu.passthrough=1`).
+IOMMU (Input–Output Memory Management Unit) controls how I/O devices access memory. In many cloud environments, SmartNICs offload IOMMU-related work. On Arm Neoverse bare‑metal systems, you can often improve Tomcat networking performance by disabling strict mode and enabling passthrough (setting `iommu.strict=0` and `iommu.passthrough=1`).
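To make the two configurations concrete, here is a minimal sketch that prints the `GRUB_CMDLINE_LINUX` line for each mode. The helper name `grub_iommu_line` and the mode names `default` and `passthrough` are hypothetical labels. The parameter values come from this Learning Path: `iommu.strict=1 iommu.passthrough=0` for the Ubuntu defaults, and `iommu.strict=0 iommu.passthrough=1` for the tuned setting.

```bash
# Sketch only: `grub_iommu_line` and its mode names are hypothetical.
# It prints a GRUB_CMDLINE_LINUX line and does not modify any file.
grub_iommu_line() {
  case "$1" in
    default)
      echo 'GRUB_CMDLINE_LINUX="iommu.strict=1 iommu.passthrough=0"' ;;
    passthrough)
      echo 'GRUB_CMDLINE_LINUX="iommu.strict=0 iommu.passthrough=1"' ;;
    *)
      echo "usage: grub_iommu_line default|passthrough" >&2
      return 1 ;;
  esac
}
```

On Ubuntu, the line belongs in `/etc/default/grub`. Keep any parameters already present in `GRUB_CMDLINE_LINUX`, then run `sudo update-grub` and reboot for the change to take effect.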
content/learning-paths/servers-and-cloud-computing/tune-network-workloads-on-bare-metal/_index.md
+6 -6 (6 additions, 6 deletions)
@@ -6,14 +6,14 @@ minutes_to_complete: 60
 who_is_this_for: This is an advanced topic for engineers who want to tune the performance of network workloads on Arm Neoverse-based bare-metal instances.

 learning_objectives:
-- Set up a benchmarking environment using Apache Tomcat and wrk2 on an Arm Neoverse bare‑metal host
-- Establish a reproducible baseline performance configuration (throughput and latency) before tuning
-- Tune NIC multi‑queue, RSS/RPS/XPS, and IRQ affinity to increase throughput and stabilize latency
-- Optimize NUMA locality by pinning Tomcat workers and interrupts to local CPUs and memory
-- Evaluate IOMMU configuration options and select the setting that maximizes networking performance
+- Set up Apache Tomcat and wrk2 to benchmark HTTP on an Arm Neoverse bare‑metal host
+- Tune NIC queue count to match available cores and measure impact
+- Improve NUMA locality by placing Tomcat on the NIC’s NUMA node and aligning worker threads with cores
+- Compare IOMMU strict mode and IOMMU passthrough mode, and select the configuration that delivers the best performance for your workload

 prerequisites:
-- An Arm Neoverse-based bare-metal server running Ubuntu 24.04 to run Apache Tomcat (this Learning Path was tested with an AWS c8g.metal-48xl instance)
+- An Arm Neoverse-based bare-metal server running Ubuntu 24.04 to run Apache Tomcat
 - Access to an x86_64 bare-metal server running Ubuntu 24.04 to run `wrk2`