Commit ff8607f

Merge pull request #2388 from jasonrandrews/review
tech review of IRQ performance Learning Path
2 parents 7d75d41 + 2663331 commit ff8607f

4 files changed

+164
-52
lines changed

content/learning-paths/servers-and-cloud-computing/irq-tuning-guide/_index.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,30 +1,30 @@
11
---
2-
title: Learn about the impact of NIC IRQs and patterns on cloud
2+
title: Learn about the impact of network interrupts on cloud workloads
33

44
draft: true
55
cascade:
66
draft: true
77

88
minutes_to_complete: 20
99

10-
who_is_this_for: This is anyone interested in understanding how IRQ patterns can enhance networking workload performance on cloud.
11-
10+
who_is_this_for: This is a specialized topic for developers and performance engineers who are interested in understanding how network interrupt patterns can impact performance on cloud servers.
1211

1312
learning_objectives:
14-
- Analyze the current IRQ layout on the machine.
15-
- Test different options and patterns to improve performance.
13+
- Analyze the current interrupt request (IRQ) layout on an Arm Linux system
14+
- Experiment with different interrupt options and patterns to improve performance
1615

1716
prerequisites:
18-
- An Arm computer running Linux installed.
19-
- Some familiarity with running Linux command line commands.
17+
- An Arm computer running Linux
18+
- Some familiarity with the Linux command line
2019

2120
author: Kiel Friedt
2221

2322
### Tags
2423
skilllevels: Introductory
2524
subjects: Performance and Architecture
2625
armips:
27-
- AArch64
26+
- Neoverse
27+
- Cortex-A
2828
tools_software_languages:
2929

3030
operatingsystems:
Lines changed: 69 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -1,25 +1,39 @@
11
---
2-
title: checking IRQs
2+
title: Understand and analyze network IRQ configuration
33
weight: 2
44

55
### FIXED, DO NOT MODIFY
66
layout: learningpathall
77
---
88

9-
First you should run the following command to identify all IRQs on the system. Identify the NIC IRQs and adjust the system by experimenting and seeing how performance improves.
9+
## Why IRQ management matters for performance
1010

11-
```
11+
In modern cloud environments, network performance is critical to overall system efficiency. Network interface cards (NICs) generate interrupt requests (IRQs) to notify the CPU when data packets arrive or need to be sent. These interrupts temporarily pause normal processing, allowing the system to handle network traffic.
12+
13+
By default, Linux distributes these network interrupts across available CPU cores. However, this distribution is not always optimal for performance:
14+
15+
- High interrupt rates: In busy servers, network cards can generate thousands of interrupts per second
16+
- CPU cache locality: Processing related network operations on the same CPU core improves cache efficiency
17+
- Resource contention: When network IRQs compete with application workloads for the same CPU resources, both can suffer
18+
- Power efficiency: IRQ management can help reduce unnecessary CPU wake-ups, improving energy efficiency
19+
20+
Understanding and optimizing IRQ assignment allows you to balance network processing loads, reduce latency, and maximize throughput for your specific workloads.
21+
22+
## Identifying IRQs on your system
23+
24+
To get started, run this command to display all IRQs on your system and their CPU assignments:
25+
26+
```bash
1227
grep '' /proc/irq/*/smp_affinity_list | while IFS=: read path cpus; do
1328
irq=$(basename $(dirname $path))
1429
device=$(grep -E "^ *$irq:" /proc/interrupts | awk '{print $NF}')
1530
printf "IRQ %s -> CPUs %s -> Device %s\n" "$irq" "$cpus" "$device"
1631
done
1732
```
1833

34+
The output is very long and looks similar to:
1935

20-
{{% notice Note %}}
21-
output should look similar to this:
22-
```
36+
```output
2337
IRQ 104 -> CPUs 12 -> Device ens34-Tx-Rx-5
2438
IRQ 105 -> CPUs 5 -> Device ens34-Tx-Rx-6
2539
IRQ 106 -> CPUs 10 -> Device ens34-Tx-Rx-7
@@ -33,12 +47,43 @@ IRQ 21 -> CPUs 0-15 -> Device ACPI:Ged
3347
...
3448
IRQ 26 -> CPUs 0-15 -> Device ACPI:Ged
3549
```
36-
{{% /notice %}}
3750

38-
Now, you may notice that the NIC IRQs are assigned to a duplicate CPU by default.
51+
## How to identify network IRQs
52+
53+
Network-related IRQs can be identified by looking at the "Device" column in the output.
3954

40-
like this example:
55+
You can identify network interfaces using the command:
56+
57+
```bash
58+
ip link show
4159
```
60+
61+
Here are some common patterns to look for:
62+
63+
Common interface naming patterns include `eth0` for traditional Ethernet, `enP3p3s0f0` and `ens5` for the Linux predictable naming scheme, and `wlan0` for wireless. In the IRQ listing, per-queue interrupts carry the interface name plus a suffix, such as `ens5-Tx-Rx-0`.
64+
65+
Using `enP3p3s0f0` as an example, the predictable naming scheme breaks down into:
66+
67+
- en = ethernet
68+
- P3 = PCI domain 3
69+
- p3 = PCI bus 3
70+
- s0 = PCI slot 0
71+
- f0 = function 0
72+
73+
This naming convention helps ensure network interfaces have consistent names across reboots by encoding their physical
74+
location in the system.
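
The breakdown above can be sketched as a small Bash helper. This is an illustration only: `decode_ifname` is a hypothetical function name, it handles just the `en[P<domain>]p<bus>s<slot>[f<function>]` layout shown in the list, and it assumes each field is written in decimal digits. Names in other forms, such as `ens5` or `eth0`, are reported as unrecognized.

```bash
#!/usr/bin/env bash
# Hypothetical helper: split a predictable interface name such as
# enP3p3s0f0 into the fields listed above. Optional fields that are
# absent from the name are printed as "none".
decode_ifname() {
  local name=$1
  local re='^en(P([0-9]+))?p([0-9]+)s([0-9]+)(f([0-9]+))?$'
  if [[ $name =~ $re ]]; then
    printf 'domain=%s bus=%s slot=%s function=%s\n' \
      "${BASH_REMATCH[2]:-none}" "${BASH_REMATCH[3]}" \
      "${BASH_REMATCH[4]}" "${BASH_REMATCH[6]:-none}"
  else
    printf 'unrecognized: %s\n' "$name"
    return 1
  fi
}
```

For example, `decode_ifname enP3p3s0f0` prints `domain=3 bus=3 slot=0 function=0`.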
75+
76+
## Improve performance
77+
78+
Once you've identified the network IRQs, you can adjust their CPU assignments to try to improve performance.
79+
80+
Experiment with different assignments and measure whether performance improves for your workload.
81+
82+
You may notice that some NIC IRQs are assigned to the same CPU cores by default, creating duplicate assignments.
83+
84+
For example:
85+
86+
```output
4287
IRQ 100 -> CPUs 2 -> Device ens34-Tx-Rx-1
4388
IRQ 101 -> CPUs 12 -> Device ens34-Tx-Rx-2
4489
IRQ 102 -> CPUs 14 -> Device ens34-Tx-Rx-3
@@ -47,26 +92,30 @@ IRQ 104 -> CPUs 12 -> Device ens34-Tx-Rx-5
4792
IRQ 105 -> CPUs 5 -> Device ens34-Tx-Rx-6
4893
IRQ 106 -> CPUs 10 -> Device ens34-Tx-Rx-7
4994
```
50-
This can potential hurt performance. Suggestions and patterns to experiment with will be on the next step.
5195

52-
### reset
96+
## Understanding IRQ performance impact
5397

54-
If performance reduces, you can return the IRQs back to default using the following commands.
98+
When multiple network IRQs are assigned to the same CPU core (in the example above, IRQs 101 and 104 both use CPU 12), performance can suffer: the interrupts compete for that core's attention while other cores remain underutilized.
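
One way to spot such overlaps without reading the full listing is sketched below. The function name `find_shared_cpus` and its optional second argument (an alternative root in place of `/proc`, useful for dry runs) are assumptions of this example, not part of the Learning Path scripts. The check compares affinity lists as plain strings, so it reports two IRQs that both list `12`, but not a partial overlap such as `0-1` and `1`.

```bash
#!/usr/bin/env bash
# Sketch: report affinity lists that are used by more than one IRQ of
# a given interface. Arguments: interface pattern, optional proc root.
find_shared_cpus() {
  local iface=$1 proc=${2:-/proc}
  local irq cpus
  # Column 1 of /proc/interrupts is "<irq>:"; the last column is the device.
  awk -v dev="$iface" '$NF ~ dev { sub(":", "", $1); print $1 }' \
      "$proc/interrupts" |
  while read -r irq; do
    cpus=$(cat "$proc/irq/$irq/smp_affinity_list") || continue
    printf '%s %s\n' "$cpus" "$irq"
  done |
  # Group IRQs by affinity list and print only lists seen more than once.
  awk '{ list[$1] = list[$1] " " $2; n[$1]++ }
       END { for (c in n) if (n[c] > 1)
               printf "CPUs %s shared by IRQs:%s\n", c, list[c] }'
}
```

Run it as `find_shared_cpus ens34`, replacing `ens34` with your interface name. No output means no two IRQs of that interface have identical affinity lists.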
5599

56-
```
100+
By optimizing IRQ distribution, you can achieve more balanced processing and improved throughput. This optimization is especially important for high-traffic servers where network performance is critical.
101+
102+
Suggested experiments are covered in the next section.
103+
104+
### How can I reset my IRQs if I make performance worse?
105+
106+
If your experiments reduce performance, you can return the IRQs back to default using the following commands:
107+
108+
```bash
57109
sudo systemctl unmask irqbalance
58110
sudo systemctl enable --now irqbalance
59111
```
60112

61-
or you can run the following
113+
If needed, install `irqbalance` on your system. For Debian-based systems, run:
62114

63-
```
64-
DEF=$(cat /proc/irq/default_smp_affinity)
65-
for f in /proc/irq/*/smp_affinity; do
66-
echo "$DEF" | sudo tee "$f" >/dev/null || true
67-
done
115+
```bash
116+
sudo apt install irqbalance
68117
```
69118

70119
### Saving these changes
71120

72-
Any changes you make to IRQs will be reset at reboot. You will need to change your systems settings to make your changes permanent.
121+
Any changes you make to IRQs will be reset at reboot. You will need to change your system's settings to make your changes permanent.
Lines changed: 41 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1,17 +1,51 @@
11
---
2-
title: conclusion
2+
title: Conclusion and recommendations
33
weight: 4
44

55
### FIXED, DO NOT MODIFY
66
layout: learningpathall
77
---
88

9-
While a single pattern does not work for all workloads. Our testing found that under heavy network workloads, different patterns performed better based on sizing.
9+
## Optimal IRQ management strategies
1010

11-
### upto and under 16 vCPUs
12-
For best performance, reduce NIC IRQs to either one or two cores. Otherwise random or default performed second best.
11+
Testing across multiple cloud platforms reveals that IRQ management effectiveness varies significantly based on system size and workload characteristics. No single pattern works optimally for all scenarios, but clear patterns emerged during performance testing under heavy network loads.
1312

14-
*If the number of NIC IRQS are more then the number of vCPUs, concentrating them over less cores improved performance significantly.
13+
## Recommendations by system size
1514

16-
### over 16 vCPUs
17-
No pattern showed significant improvement over default as long as all NIC IRQs were not on duplicate cores.
15+
### Systems with 16 vCPUs or fewer
16+
17+
For smaller systems with 16 or fewer vCPUs, concentrated IRQ assignment may provide measurable performance improvements.
18+
19+
- Assign all network IRQs to just one or two CPU cores
20+
- This approach showed the most significant performance gains
21+
- Most effective when the number of NIC IRQs exceeds the number of vCPUs
22+
- Use the `smp_affinity` range assignment pattern from the previous section with a very limited core range, for example `0-1`
23+
24+
Performance improves significantly when network IRQs are concentrated rather than dispersed across all available cores on smaller systems.
25+
26+
### Systems with more than 16 vCPUs
27+
28+
For larger systems with more than 16 vCPUs, the findings are different:
29+
30+
- Default IRQ distribution generally performs well
31+
- The primary concern is avoiding duplicate core assignments for network IRQs
32+
- Use the scripts from the previous section to check and correct any overlapping IRQ assignments
33+
- The paired core pattern can help ensure optimal distribution on these larger systems
34+
35+
On larger systems, the overhead of interrupt handling is proportionally smaller compared to the available processing power. The main performance bottleneck occurs when multiple high-frequency network interrupts compete for the same core.
36+
37+
## Implementation considerations
38+
39+
When implementing these IRQ management strategies, there are some important points to keep in mind.
40+
41+
Pay attention to the workload type. CPU-bound applications may benefit from different IRQ patterns than I/O-bound applications.
42+
43+
Always benchmark your specific workload with different IRQ patterns.
44+
45+
Monitor IRQ counts using `watch -n1 'grep . /proc/interrupts'` to observe the IRQ distribution in real time.
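
The full `/proc/interrupts` table is wide on large systems, so a per-CPU summary for one interface can be easier to read. The sketch below assumes the usual layout: a header row of CPU names, then one row per IRQ with one counter per CPU after the `<irq>:` field and the device name in the last column. The function name `irq_totals_per_cpu` and its optional file argument are assumptions of this example.

```bash
#!/usr/bin/env bash
# Sketch: sum the per-CPU interrupt counters for one interface.
# Arguments: interface pattern, optional path to an interrupts table.
irq_totals_per_cpu() {
  local iface=$1 file=${2:-/proc/interrupts}
  awk -v dev="$iface" '
    NR == 1 {                       # header row: CPU0 CPU1 ...
      ncpu = NF
      for (i = 1; i <= NF; i++) name[i] = $i
      next
    }
    $NF ~ dev {                     # rows whose device matches the interface
      for (i = 1; i <= ncpu; i++) total[i] += $(i + 1)
    }
    END {
      for (i = 1; i <= ncpu; i++) printf "%s %d\n", name[i], total[i]
    }' "$file"
}
```

The counters are cumulative since boot, so run `irq_totals_per_cpu ens5` (with your interface name) twice under load and compare the two samples to see which cores are currently handling network interrupts.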
46+
47+
Also consider NUMA effects on multi-socket systems. Keep IRQs on cores close to the PCIe devices generating them to minimize cross-node memory access.
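
To find the cores local to a NIC, you can read the NUMA node that sysfs reports for the device and then list that node's CPUs. This is a minimal sketch: `nic_local_cpus` is a hypothetical function name, and the optional second argument (an alternative root in place of `/sys`) exists only for testing. Many cloud instances expose a single node or report `-1`, in which case the function prints a message and returns a non-zero status.

```bash
#!/usr/bin/env bash
# Sketch: print the CPU list of the NUMA node a network device sits on.
# Arguments: interface name, optional sysfs root (defaults to /sys).
nic_local_cpus() {
  local iface=$1 sysfs=${2:-/sys}
  local node_file="$sysfs/class/net/$iface/device/numa_node"
  local node
  if [[ ! -r $node_file ]]; then
    echo "no NUMA information for $iface" >&2
    return 1
  fi
  node=$(<"$node_file")
  # The kernel reports -1 when no NUMA node is associated with the device.
  if (( node < 0 )); then
    echo "no NUMA node reported for $iface" >&2
    return 1
  fi
  cat "$sysfs/devices/system/node/node$node/cpulist"
}
```

The printed list, for example `16-31`, can be used as the core range in the range assignment script.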
48+
49+
Make sure to set up IRQ affinity settings in `/etc/rc.local` or a systemd service file to ensure they persist across reboots.
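
A boot-time script might look like the sketch below. The function name `apply_nic_affinity`, the example values `ens5` and `0-1`, and the optional third argument (an alternative root in place of `/proc`, for dry runs) are all assumptions of this example. The function writes to `smp_affinity_list` directly, so it is intended to run as root.

```bash
#!/usr/bin/env bash
# Sketch: pin every IRQ of one interface to a chosen core list.
# Arguments: interface pattern, core list (for example 0-1),
# optional proc root (defaults to /proc).
apply_nic_affinity() {
  local iface=$1 cores=$2 proc=${3:-/proc}
  local irq
  for irq in $(awk -v dev="$iface" \
      '$NF ~ dev { sub(":", "", $1); print $1 }' "$proc/interrupts"); do
    echo "$cores" > "$proc/irq/$irq/smp_affinity_list" ||
      echo "could not set affinity for IRQ $irq" >&2
  done
}
```

Call `apply_nic_affinity ens5 0-1` from `/etc/rc.local` or from the command run by a oneshot systemd service. If `irqbalance` is running, it may rewrite these assignments, so check that your settings are still in place after boot.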
50+
51+
Remember that as workloads and hardware evolve, revisiting and adjusting IRQ management strategies may be necessary to maintain optimal performance.
Lines changed: 46 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -1,39 +1,58 @@
11
---
2-
title: patterns
2+
title: IRQ management patterns for performance optimization
33
weight: 3
44

55
### FIXED, DO NOT MODIFY
66
layout: learningpathall
77
---
88

9-
The following patterns were ran on multiple cloud and on a variety of sizes. A recommended IRQ pattern will be suggested at the end. Based on your workload, a different pattern may result in higher performance.
9+
## Optimizing network performance with IRQ management
10+
11+
Different IRQ management patterns can significantly impact network performance across multiple cloud platforms and virtual machine sizes. This Learning Path presents various IRQ distribution strategies, along with scripts to implement them on your systems.
12+
13+
Network interrupt requests (IRQs) can be distributed across CPU cores in various ways, each with potential benefits depending on your workload characteristics and system configuration. By strategically assigning network IRQs to specific cores, you can improve cache locality, reduce contention, and potentially boost overall system performance.
14+
15+
The following patterns have been tested on various systems and can be implemented using the provided scripts. An optimal pattern is suggested at the conclusion of this Learning Path, but your specific workload may benefit from a different approach.
1016

1117
### Patterns
18+
1219
1. Default: IRQ pattern provided at boot.
1320
2. Random: All IRQs are assigned a core and do not overlap with network IRQs.
14-
3. Housekeeping: All IRQs outside of network IRQs are assign to specific core(s).
15-
4. NIC IRQs are set to single or multiple ranges of cores and including pairs. EX. 1, 1-2, 0-3, 0-7, [0-1, 2-3..], etc.
16-
21+
3. Housekeeping: All IRQs outside of network IRQs are assigned to specific core(s).
22+
4. Range or paired core: NIC IRQs are assigned to a single range or multiple ranges of cores, including pairs.
1723

1824
### Scripts to change IRQ
1925

26+
The scripts below demonstrate how to implement different IRQ management patterns on your system. Each script targets a specific distribution strategy.
27+
28+
Before running these scripts, identify your network interface name using `ip link show` and determine your system's CPU topology with `lscpu`. Always test these changes in a non-production environment first, as improper IRQ assignment can impact system stability.
29+
2030
To change the NIC IRQs or IRQs in general you can use the following scripts.
2131

22-
Housekeeping pattern example, you will need to add more to account for other IRQs on your system
32+
### Housekeeping
2333

24-
```
25-
HOUSEKEEP=#core range here
34+
The housekeeping pattern isolates non-network IRQs to dedicated cores.
35+
36+
The example below handles only the `ACPI:Ged` interrupts. Add similar loops for the other non-network IRQs on your system.
37+
38+
```bash
39+
HOUSEKEEP="0,3"  # core list or range for non-network IRQs (example value, adjust for your system)
2640

27-
# ACPI:Ged
2841
for irq in $(awk '/ACPI:Ged/ {sub(":","",$1); print $1}' /proc/interrupts); do
2942
echo $HOUSEKEEP | sudo tee /proc/irq/$irq/smp_affinity_list >/dev/null
3043
done
3144
```
3245

33-
This is for pairs on a 16 vCPU machine, you will need the interface name.
46+
### Paired core
3447

35-
```
36-
IFACE=#interface name
48+
The paired core assignment pattern distributes network IRQs across CPU core pairs for better cache locality.
49+
50+
This is for pairs on a 16 vCPU machine.
51+
52+
You need to add the interface name.
53+
54+
```bash
55+
IFACE="ens5"  # interface name (example value, replace with yours)
3756

3857
PAIRS=("0,1" "2,3" "4,5" "6,7" "8,9" "10,11" "12,13" "14,15")
3958

@@ -49,12 +68,22 @@ for irq in "${irqs[@]}"; do
4968
done
5069
```
5170

52-
This will assign a specific core(s) to NIC IRQs only
71+
### Range assignment
5372

54-
```
55-
IFACE=#interface name
73+
The range assignment pattern assigns network IRQs to a specific range of cores.
74+
75+
The script assigns a specific core or range of cores to NIC IRQs only.
5676

57-
for irq in $(awk '/$IFACE/ {sub(":","",$1); print $1}' /proc/interrupts); do
77+
You need to add the interface name.
78+
79+
```bash
80+
IFACE="ens5"  # interface name (example value, replace with yours)
81+
82+
for irq in $(awk -v dev="$IFACE" '$0 ~ dev {sub(":","",$1); print $1}' /proc/interrupts); do
5883
echo 0-15 | sudo tee /proc/irq/$irq/smp_affinity_list > /dev/null
5984
done
60-
```
85+
```
86+
87+
Each pattern offers different performance characteristics depending on your workload. The housekeeping pattern reduces system noise, paired cores optimize cache usage, and range assignment provides dedicated network processing capacity. Test these patterns in your environment to determine which provides the best performance for your specific use case.
88+
89+
Continue to the next section for additional guidance.
