Commit 4f3cacd

Merge pull request #2396 from madeline-underwood/review
IRQ_JA to sign off
2 parents 4922add + 7916acd commit 4f3cacd

5 files changed: +85 -86 lines changed


content/learning-paths/servers-and-cloud-computing/irq-tuning-guide/_index.md

Lines changed: 21 additions & 6 deletions
@@ -1,17 +1,16 @@
 ---
-title: Learn about the impact of network interrupts on cloud workloads
+title: Optimize network interrupt handling on Arm servers
 
-draft: true
-cascade:
-draft: true
-
+
 minutes_to_complete: 20
 
-who_is_this_for: This is a specialized topic for developers and performance engineers who are interested in understanding how network interrupt patterns can impact performance on cloud servers.
+who_is_this_for: This is an introductory topic for developers and performance engineers who are interested in understanding how network interrupt patterns can impact performance on cloud servers.
 
 learning_objectives:
 - Analyze the current interrupt request (IRQ) layout on an Arm Linux system
 - Experiment with different interrupt options and patterns to improve performance
+- Configure optimal IRQ distribution strategies for your workload
+- Implement persistent IRQ management solutions
 
 prerequisites:
 - An Arm computer running Linux

@@ -36,6 +35,22 @@ further_reading:
 title: Perf for Linux on Arm (LinuxPerf)
 link: https://learn.arm.com/install-guides/perf/
 type: website
+- resource:
+title: Tune network workloads on Arm-based bare-metal instances
+link: /learning-paths/servers-and-cloud-computing/tune-network-workloads-on-bare-metal/
+type: learning-path
+- resource:
+title: Get started with Arm-based cloud instances
+link: /learning-paths/servers-and-cloud-computing/csp/
+type: learning-path
+- resource:
+title: Linux kernel IRQ subsystem documentation
+link: https://www.kernel.org/doc/html/latest/core-api/irq/index.html
+type: website
+- resource:
+title: Microbenchmark and tune network performance with iPerf3
+link: /learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/
+type: learning-path
 
 ### FIXED, DO NOT MODIFY
 # ================================================================================

content/learning-paths/servers-and-cloud-computing/irq-tuning-guide/_next-steps.md

Lines changed: 2 additions & 0 deletions
@@ -6,3 +6,5 @@ weight: 21 # Set to always be larger than the content in this p
 title: "Next Steps" # Always the same, html page title.
 layout: "learningpathall" # All files under learning paths have this same wrapper for Hugo processing.
 ---
+
+
Lines changed: 20 additions & 29 deletions
@@ -1,5 +1,5 @@
 ---
-title: Understand and Analyze network IRQ configuration
+title: Understand and analyze network IRQ configuration
 weight: 2
 
 ### FIXED, DO NOT MODIFY

@@ -10,18 +10,18 @@ layout: learningpathall
 
 In modern cloud environments, network performance is critical to overall system efficiency. Network interface cards (NICs) generate interrupt requests (IRQs) to notify the CPU when data packets arrive or need to be sent. These interrupts temporarily pause normal processing, allowing the system to handle network traffic.
 
-By default, Linux distributes these network interrupts across available CPU cores. However, this distribution is not always optimal for performance:
+By default, Linux distributes these network interrupts across available CPU cores. However, this distribution is not always optimal for performance, for the following reasons:
 
-- High interrupt rates: In busy servers, network cards can generate thousands of interrupts per second
-- CPU cache locality: Processing related network operations on the same CPU core improves cache efficiency
-- Resource contention: When network IRQs compete with application workloads for the same CPU resources, both can suffer
+- High interrupt rates: in busy servers, network cards can generate thousands of interrupts per second
+- CPU cache locality: processing related network operations on the same CPU core improves cache efficiency
+- Resource contention: when network IRQs compete with application workloads for the same CPU resources, both can suffer
 - Power efficiency: IRQ management can help reduce unnecessary CPU wake-ups, improving energy efficiency
 
 Understanding and optimizing IRQ assignment allows you to balance network processing loads, reduce latency, and maximize throughput for your specific workloads.
 
 ## Identifying IRQs on your system
 
-To get started, run this command to display all IRQs on your system and their CPU assignments:
+To get started, display all IRQs on your system and their CPU assignments:
 
 ```bash
 grep '' /proc/irq/*/smp_affinity_list | while IFS=: read path cpus; do

@@ -31,7 +31,7 @@ grep '' /proc/irq/*/smp_affinity_list | while IFS=: read path cpus; do
 done
 ```
 
-The output is very long and looks similar to:
+The output is long and looks similar to:
 
 ```output
 IRQ 104 -> CPUs 12 -> Device ens34-Tx-Rx-5
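To move a single interrupt by hand, write a CPU list into its `smp_affinity_list` file. The following is a minimal sketch; the IRQ number and target CPU are placeholders, so substitute values taken from your own listing:

```bash
IRQ=104     # placeholder IRQ number; pick one from your own output
CPUS="2"    # placeholder target CPU; a range such as "0-1" also works
echo "$CPUS" | sudo tee /proc/irq/$IRQ/smp_affinity_list    # writing requires root
cat /proc/irq/$IRQ/smp_affinity_list                        # confirm the new assignment
```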
@@ -50,36 +50,25 @@ IRQ 26 -> CPUs 0-15 -> Device ACPI:Ged
 
 ## How to identify network IRQs
 
-Network-related IRQs can be identified by looking at the "Device" column in the output.
+Network-related IRQs can be identified by looking at the **Device** column in the output.
 
 You can identify network interfaces using the command:
 
 ```bash
 ip link show
 ```
 
-Here are some common patterns to look for:
+Look for common interface naming patterns in the output. Traditional ethernet interfaces use names like `eth0`, while wireless interfaces typically appear as `wlan0`. Modern Linux systems often use the predictable naming scheme, which creates names like `enP3p3s0f0` and `ens5-Tx-Rx-0`.
 
-Common interface naming patterns include `eth0` for traditional ethernet, `enP3p3s0f0` and `ens5-Tx-Rx-0` for the Linux predictable naming scheme, or `wlan0` for wireless.
-
-The predictable naming scheme breaks down into:
-
-- en = ethernet
-- P3 = PCI domain 3
-- p3 = PCI bus 3
-- s0 = PCI slot 0
-- f0 = function 0
-
-This naming convention helps ensure network interfaces have consistent names across reboots by encoding their physical
-location in the system.
+The predictable naming scheme encodes the physical location within the interface name. For example, `enP3p3s0f0` breaks down as: `en` for ethernet, `P3` for PCI domain 3, `p3` for PCI bus 3, `s0` for PCI slot 0, and `f0` for function 0. This naming convention helps ensure network interfaces maintain consistent names across reboots by encoding their physical location in the system.
 
 ## Improve performance
 
-Once you've identified the network IRQs, you can adjust their CPU assignments to try to improve performance.
+Once you've identified the network IRQs, you can adjust their CPU assignments to improve performance.
 
 Identify the NIC (Network Interface Card) IRQs and adjust the system by experimenting and seeing if performance improves.
 
-You may notice that some NIC IRQs are assigned to the same CPU cores by default, creating duplicate assignments.
+You might notice that some NIC IRQs are assigned to the same CPU cores by default, creating duplicate assignments.
 
 For example:
 
@@ -95,13 +84,13 @@ IRQ 106 -> CPUs 10 -> Device ens34-Tx-Rx-7
 
 ## Understanding IRQ performance impact
 
-When network IRQs are assigned to the same CPU cores (as shown in the example above where IRQ 101 and 104 both use CPU 12), this can potentially hurt performance as multiple interrupts compete for the same CPU core's attention, while other cores remain underutilized.
+When network IRQs are assigned to the same CPU cores (as shown in the example above where IRQ 101 and 104 both use CPU 12), this can potentially degrade performance as multiple interrupts compete for the same resources, while other cores remain underutilized.
 
 By optimizing IRQ distribution, you can achieve more balanced processing and improved throughput. This optimization is especially important for high-traffic servers where network performance is critical.
 
-Suggested experiments are covered in the next section.
+{{% notice Note %}} There are suggestions for experiments in the next section. {{% /notice %}}
 
-### How can I reset my IRQs if I make performance worse?
+## How can I reset my IRQs if I worsen performance?
 
 If your experiments reduce performance, you can return the IRQs back to default using the following commands:
 

@@ -110,12 +99,14 @@ sudo systemctl unmask irqbalance
 sudo systemctl enable --now irqbalance
 ```
 
-If needed, install `irqbalance` on your system. For Debian based systems run:
+If needed, install `irqbalance` on your system.
+
+For Debian-based systems, run:
 
 ```bash
 sudo apt install irqbalance
 ```
 
-### Saving these changes
+## Saving the changes
 
-Any changes you make to IRQs will be reset at reboot. You will need to change your system's settings to make your changes permanent.
+Any changes you make to IRQs are reset at reboot. You will need to change your system's settings to make your changes permanent.
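One way to make the changes permanent, sketched here as an assumption rather than a method prescribed by this Learning Path, is a small systemd unit that reruns your affinity commands at boot. The unit name and script path are placeholders; point `ExecStart` at a script containing whichever assignments you settled on:

```bash
# Hypothetical unit name and script path; adjust both to your setup
sudo tee /etc/systemd/system/irq-affinity.service > /dev/null <<'EOF'
[Unit]
Description=Reapply custom network IRQ affinity at boot
After=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/set-irq-affinity.sh

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable irq-affinity.service
```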

content/learning-paths/servers-and-cloud-computing/irq-tuning-guide/conclusion.md

Lines changed: 23 additions & 26 deletions
@@ -8,44 +8,41 @@ layout: learningpathall
 
 ## Optimal IRQ Management Strategies
 
-Testing across multiple cloud platforms reveals that IRQ management effectiveness varies significantly based on system size and workload characteristics. No single pattern works optimally for all scenarios, but clear patterns emerged during performance testing under heavy network loads.
+Performance testing across multiple cloud platforms shows that IRQ management effectiveness depends heavily on system size and workload characteristics. While no single approach works optimally in all scenarios, clear patterns emerged during testing under heavy network loads.
 
-## Recommendations by system size
+## Recommendations for systems with 16 vCPUs or fewer
 
-### Systems with 16 vCPUs or less
+For smaller systems with 16 or fewer vCPUs, the following strategies prove most effective:
 
-For smaller systems with 16 or less vCPUs, concentrated IRQ assignment may provide measurable performance improvements.
+- Concentrate network IRQs on just one or two CPU cores rather than spreading them across all available cores.
+- Use the `smp_affinity` range assignment pattern with a limited core range (example: `0-1`).
+- This approach works best when the number of NIC IRQs exceeds the number of available vCPUs.
+- Focus on high-throughput network workloads where concentrated IRQ handling delivers the most significant performance improvements.
 
-- Assign all network IRQs to just one or two CPU cores
-- This approach showed the most significant performance gains
-- Most effective when the number of NIC IRQs exceeds the number of vCPUs
-- Use the `smp_affinity` range assignment pattern from the previous section with a very limited core range, for example `0-1`
+Performance improves significantly when network IRQs are concentrated rather than dispersed across all available cores on smaller systems. This concentration reduces context switching overhead and improves cache locality for interrupt handling.
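As an illustration of the concentrated assignment described above, a minimal sketch that pins every IRQ belonging to one NIC onto CPUs 0-1; the interface name `ens5` is a placeholder, so substitute the name reported by `ip link show`:

```bash
IFACE=ens5      # placeholder interface name
CORES="0-1"     # concentrate all NIC IRQs on the first two cores
for irq in $(awk -v dev="$IFACE" '$0 ~ dev {sub(":","",$1); print $1}' /proc/interrupts); do
  echo "$CORES" | sudo tee /proc/irq/$irq/smp_affinity_list > /dev/null
done
```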
 
-Performance improves significantly when network IRQs are concentrated rather than dispersed across all available cores on smaller systems.
+## Recommendations for systems with more than 16 vCPUs
 
-### Systems with more than 16 vCPUs
+For larger systems with more than 16 vCPUs, different strategies prove more effective:
 
-For larger systems with more than 16 vCPUs, the findings are different:
+- Default IRQ distribution typically delivers good performance.
+- Focus on preventing multiple network IRQs from sharing the same CPU core.
+- Use the diagnostic scripts from the previous section to identify and resolve overlapping IRQ assignments.
+- Apply the paired core pattern to ensure balanced distribution across the system.
 
-- Default IRQ distribution generally performs well
-- The primary concern is avoiding duplicate core assignments for network IRQs
-- Use the scripts from the previous section to check and correct any overlapping IRQ assignments
-- The paired core pattern can help ensure optimal distribution on these larger systems
+On larger systems, interrupt handling overhead becomes less significant relative to total processing capacity. The primary performance issue occurs when high-frequency network interrupts compete for the same core, creating bottlenecks.
 
-On larger systems, the overhead of interrupt handling is proportionally smaller compared to the available processing power. The main performance bottleneck occurs when multiple high-frequency network interrupts compete for the same core.
+## Implementation considerations
 
-## Implementation Considerations
+When implementing these IRQ management strategies, several factors influence your success:
 
-When implementing these IRQ management strategies, there are some important points to keep in mind.
+- Consider your workload type first, as CPU-bound applications can benefit from different IRQ patterns than I/O-bound applications. Always benchmark your specific workload with different IRQ patterns rather than assuming one approach works universally.
+- For real-time monitoring, use `watch -n1 'grep . /proc/interrupts'` to observe IRQ distribution as it happens. This helps you verify your changes are working as expected.
+- On multi-socket systems, NUMA effects become important. Keep IRQs on cores close to the PCIe devices generating them to minimize cross-node memory access latency. Additionally, ensure your IRQ affinity settings persist across reboots by adding them to `/etc/rc.local` or creating a systemd service file.
 
-Pay attention to the workload type. CPU-bound applications may benefit from different IRQ patterns than I/O-bound applications.
+As workloads and hardware evolve, revisiting and adjusting IRQ management strategies might be necessary to maintain optimal performance. What works well today might need refinement as your application scales or changes.
 
-Always benchmark your specific workload with different IRQ patterns.
+## Next Steps
 
-Monitor IRQ counts in real-time using `watch -n1 'grep . /proc/interrupts'` to observe IRQ distribution in real-time.
+You have successfully learned how to optimize network interrupt handling on Arm servers. You can now analyze IRQ distributions, implement different management patterns, and configure persistent solutions for your workloads.
 
-Also consider NUMA effects on multi-socket systems. Keep IRQs on cores close to the PCIe devices generating them to minimize cross-node memory access.
-
-Make sure to set up IRQ affinity settings in `/etc/rc.local` or a systemd service file to ensure they persist across reboots.
-
-Remember that as workloads and hardware evolve, revisiting and adjusting IRQ management strategies may be necessary to maintain optimal performance.
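To act on the NUMA advice in the implementation considerations above, you can ask sysfs which node and CPUs sit closest to the NIC. A short sketch, assuming a PCI-attached interface; the name `ens5` is a placeholder, and virtual NICs might not expose these files:

```bash
IFACE=ens5   # placeholder interface name
# NUMA node the NIC is attached to (-1 means no NUMA information is available)
cat /sys/class/net/$IFACE/device/numa_node
# CPUs local to that device, in the same list format accepted by smp_affinity_list
cat /sys/class/net/$IFACE/device/local_cpulist
```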

content/learning-paths/servers-and-cloud-computing/irq-tuning-guide/patterns.md

Lines changed: 19 additions & 25 deletions
@@ -12,28 +12,26 @@ Different IRQ management patterns can significantly impact network performance a
 
 Network interrupt requests (IRQs) can be distributed across CPU cores in various ways, each with potential benefits depending on your workload characteristics and system configuration. By strategically assigning network IRQs to specific cores, you can improve cache locality, reduce contention, and potentially boost overall system performance.
 
-The following patterns have been tested on various systems and can be implemented using the provided scripts. An optimal pattern is suggested at the conclusion of this Learning Path, but your specific workload may benefit from a different approach.
+The following patterns have been tested on various systems and can be implemented using the provided scripts. An optimal pattern is suggested at the conclusion of this Learning Path, but your specific workload might benefit from a different approach.
 
-### Patterns
+## Common IRQ distribution patterns
 
-1. Default: IRQ pattern provided at boot.
-2. Random: All IRQs are assigned a core and do not overlap with network IRQs.
-3. Housekeeping: All IRQs outside of network IRQs are assigned to specific core(s).
-4. NIC IRQs are assigned to single or multiple ranges of cores, including pairs.
+Four main distribution strategies offer different performance characteristics:
 
-### Scripts to change IRQ
+- Default: uses the IRQ pattern provided at boot time by the Linux kernel
+- Random: assigns all IRQs to cores without overlap with network IRQs
+- Housekeeping: assigns all non-network IRQs to specific dedicated cores
+- NIC-focused: assigns network IRQs to single or multiple ranges of cores, including pairs
 
-The scripts below demonstrate how to implement different IRQ management patterns on your system. Each script targets a specific distribution strategy:
+## Scripts to implement IRQ management patterns
 
-Before running these scripts, identify your network interface name using `ip link show` and determine your system's CPU topology with `lscpu`. Always test these changes in a non-production environment first, as improper IRQ assignment can impact system stability.
+The scripts below demonstrate how to implement different IRQ management patterns on your system. Each script targets a specific distribution strategy. Before running these scripts, identify your network interface name using `ip link show` and determine your system's CPU topology with `lscpu`. Always test these changes in a non-production environment first, as improper IRQ assignment can impact system stability.
 
-To change the NIC IRQs or IRQs in general you can use the following scripts.
+## Housekeeping pattern
 
-### Housekeeping
+The housekeeping pattern isolates non-network IRQs to dedicated cores, reducing interference with your primary workloads.
 
-The housekeeping pattern isolates non-network IRQs to dedicated cores.
-
-You need to add more to account for other IRQs on your system.
+Replace `#core range here` with your desired CPU range (for example: "0,3"):
 
 ```bash
 HOUSEKEEP=#core range here (example: "0,3")

@@ -43,13 +41,11 @@ for irq in $(awk '/ACPI:Ged/ {sub(":","",$1); print $1}' /proc/interrupts); do
 done
 ```
 
-### Paired core
-
-The paired core assignment pattern distributes network IRQs across CPU core pairs for better cache coherency.
+## Paired core pattern
 
-This is for pairs on a 16 vCPU machine.
+The paired core assignment pattern distributes network IRQs across CPU core pairs for better cache coherency.
 
-You need to add the interface name.
+This example works for a 16 vCPU machine. Replace `#interface name` with your network interface (for example: "ens5"):
 
 ```bash
 IFACE=#interface name (example: "ens5")

@@ -68,13 +64,11 @@ for irq in "${irqs[@]}"; do
 done
 ```
 
-### Range assignment
-
-The range assignment pattern assigns network IRQs to a specific range of cores.
+## Range assignment pattern
 
-This will assign a specific core(s) to NIC IRQs only.
+The range assignment pattern assigns network IRQs to a specific range of cores, providing dedicated network processing capacity.
 
-You need to add the interface name.
+Replace `#interface name` with your network interface (for example: "ens5"):
 
 ```bash
 IFACE=#interface name (example: "ens5")

@@ -84,6 +78,6 @@ for irq in $(awk '/'$IFACE'/ {sub(":","",$1); print $1}' /proc/interrupts); do
 done
 ```
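Whichever pattern you try, a quick way to confirm the result is to re-read the affinity of the interface's IRQs. A minimal sketch that reuses the awk idiom from the scripts above; the interface name `ens5` is a placeholder:

```bash
IFACE=ens5   # placeholder interface name
for irq in $(awk '/'$IFACE'/ {sub(":","",$1); print $1}' /proc/interrupts); do
  echo "IRQ $irq -> CPUs $(cat /proc/irq/$irq/smp_affinity_list)"
done
```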
 
-Each pattern offers different performance characteristics depending on your workload. The housekeeping pattern reduces system noise, paired cores optimize cache usage, and range assignment provides dedicated network processing capacity. Test these patterns in your environment to determine which provides the best performance for your specific use case.
+Each pattern offers different performance characteristics depending on your workload. The housekeeping pattern reduces system noise, paired cores optimize cache usage, and range assignment provides dedicated network processing capacity. Improper configuration can degrade performance or stability, so always test these patterns in a non-production environment to determine which provides the best results for your specific use case.
 
 Continue to the next section for additional guidance.
