(feat)linux: Add XDP zero copy support documentation for ICSSG

Update the existing XDP documentation and add documentation
for AF_XDP and zero-copy support in the PRU_ICSSG driver.

Signed-off-by: Meghana Malladi <[email protected]>

XDP stands for eXpress Data Path and provides a framework for BPF that enables high-performance programmable packet processing in the Linux kernel. It runs the BPF program at the earliest possible point in software, namely at the moment the network driver receives the packet.

XDP allows running a BPF program just before the skbs are allocated in the driver. The program returns one of the following actions:

- XDP_DROP :- Drop the packet at the driver level.
- XDP_ABORTED :- Similar to drop, but an exception is also generated.
- XDP_PASS :- Pass the packet to the kernel stack, i.e. the skbs are allocated and processing continues as normal.
- XDP_TX :- Send the packet back out the same NIC, with any modifications made by the program.
- XDP_REDIRECT :- Send the packet to another NIC or to user space through an AF_XDP socket (discussed below).

.. Image:: /images/XDP-packet-processing.png

As explained above, XDP_REDIRECT can be used to send a packet directly to user space.
This works by using the AF_XDP socket type, which was introduced specifically for this use case.

In this process, the packet is sent directly to user space without going through the kernel network stack.

.. Image:: /images/xdp-packet.png

Use Cases for XDP
=================

XDP is particularly useful for these common networking scenarios:

1. **DDoS Mitigation**: High-speed filtering and dropping of malicious traffic
2. **Load Balancing**: Efficient traffic distribution across multiple servers
3. **Packet Capture**: High-performance network monitoring without performance penalties
4. **Firewalls**: Wire-speed packet filtering based on flexible rule sets
5. **Network Analytics**: Real-time traffic analysis and monitoring
6. **Custom Network Functions**: Specialized packet handling for unique requirements

How to run XDP with PRU_ICSSG
=============================

The kernel configuration requires the following changes to use XDP with PRU_ICSSG:

.. code-block:: console

   CONFIG_DEBUG_INFO_BTF=y
   CONFIG_BPF_PRELOAD=y
   CONFIG_BPF_PRELOAD_UMD=y
   CONFIG_BPF_EVENTS=y
   CONFIG_BPF_LSM=y
   CONFIG_DEBUG_INFO_REDUCED=n
   CONFIG_FTRACE=y
   CONFIG_XDP_SOCKETS=y

Tools for debugging XDP Applications
====================================

Debugging tools for XDP development:

- bpftool - For loading and managing BPF programs
- xdpdump - For capturing XDP packet data
- perf - For performance monitoring and analysis
- bpftrace - For tracing BPF program execution

AF_XDP Sockets
**************

What are AF_XDP Sockets?
========================

AF_XDP is a socket address family specifically designed to work with the XDP framework.
These sockets provide a high-performance interface for user space applications to receive
and transmit network packets directly from the XDP layer, bypassing the traditional kernel networking stack.

Key characteristics of AF_XDP sockets include:

- Direct path from network driver to user space applications
- Shared memory rings for efficient packet transfer
- Minimal overhead compared to traditional socket interfaces
- Optimized for high-throughput, low-latency applications
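
The sketch below shows bare-bones socket setup using the raw kernel API from ``linux/if_xdp.h`` (most applications use libxdp's ``xsk_socket__create()`` instead). The interface name, frame count, and ring sizes are illustrative assumptions, and error handling and the ring ``mmap()`` calls are omitted:

.. code-block:: c

   #include <linux/if_xdp.h>
   #include <net/if.h>
   #include <sys/socket.h>
   #include <stdlib.h>
   #include <unistd.h>

   int main(void)
   {
           int fd = socket(AF_XDP, SOCK_RAW, 0);

           /* UMEM: one contiguous region holding 4096 fixed-size frames. */
           struct xdp_umem_reg umem = {
                   .len = 4096 * 2048,
                   .chunk_size = 2048,
           };
           void *buf;
           posix_memalign(&buf, sysconf(_SC_PAGESIZE), umem.len);
           umem.addr = (unsigned long long)buf;
           setsockopt(fd, SOL_XDP, XDP_UMEM_REG, &umem, sizeof(umem));

           /* Size the four rings; the kernel then exposes them via mmap(). */
           int entries = 2048;
           setsockopt(fd, SOL_XDP, XDP_RX_RING, &entries, sizeof(entries));
           setsockopt(fd, SOL_XDP, XDP_TX_RING, &entries, sizeof(entries));
           setsockopt(fd, SOL_XDP, XDP_UMEM_FILL_RING, &entries, sizeof(entries));
           setsockopt(fd, SOL_XDP, XDP_UMEM_COMPLETION_RING, &entries, sizeof(entries));

           /* Bind to one RX queue of the interface ("eth0" is assumed). */
           struct sockaddr_xdp sxdp = {
                   .sxdp_family = AF_XDP,
                   .sxdp_ifindex = if_nametoindex("eth0"),
                   .sxdp_queue_id = 0,
           };
           bind(fd, (struct sockaddr *)&sxdp, sizeof(sxdp));
           return 0;
   }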

How AF_XDP Works
================

AF_XDP sockets operate through a shared memory mechanism:

1. The XDP program intercepts packets at the driver level
2. The XDP_REDIRECT action sends packets to the socket
3. Shared memory rings (RX/TX/FILL/COMPLETION) manage packet data
4. The user space application directly accesses the packet data
5. Zero or minimal copying occurs, depending on the mode used

The AF_XDP architecture uses several ring buffers:

- **RX Ring**: Received packets ready for consumption
- **TX Ring**: Packets to be transmitted
- **FILL Ring**: Pre-allocated buffers for incoming packets
- **COMPLETION Ring**: Transmitted buffers returned to user space for reuse

For more details on AF_XDP please refer to the official documentation: `AF_XDP Sockets <https://www.kernel.org/doc/html/latest/networking/af_xdp.html>`_.
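
On the kernel side, packets reach an AF_XDP socket when an XDP program redirects them through a map of type ``BPF_MAP_TYPE_XSKMAP``. A minimal sketch (the map and program names are illustrative):

.. code-block:: c

   #include <linux/bpf.h>
   #include <bpf/bpf_helpers.h>

   struct {
           __uint(type, BPF_MAP_TYPE_XSKMAP);
           __uint(max_entries, 64);
           __type(key, __u32);
           __type(value, __u32);
   } xsks_map SEC(".maps");

   SEC("xdp")
   int xdp_sock_prog(struct xdp_md *ctx)
   {
           /* Index the map by RX queue; fall back to XDP_PASS
            * if no socket is registered for this queue. */
           return bpf_redirect_map(&xsks_map, ctx->rx_queue_index, XDP_PASS);
   }

   char _license[] SEC("license") = "GPL";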

Current Support Status in PRU_ICSSG
===================================

The PRU_ICSSG Ethernet driver currently supports:

- Native XDP mode
- Generic XDP mode (SKB-based)
- Zero-copy mode

XDP Zero-Copy in PRU_ICSSG
**************************

Introduction to Zero-Copy Mode
==============================

Zero-copy mode is an optimization in AF_XDP that eliminates packet data copying between the kernel and user space. This results in significantly improved performance for high-throughput network applications.

How Zero-Copy Works
===================

In standard XDP operation (copy mode), packet data is copied from kernel memory to user space memory when processed. Zero-copy mode eliminates this copy operation by:

1. Using memory-mapped regions shared between the kernel and user space
2. Allowing direct DMA from network hardware into memory accessible by user space applications
3. Managing memory ownership through descriptor rings rather than data movement

This approach provides several benefits:

- Reduced CPU utilization
- Lower memory bandwidth consumption
- Decreased latency for packet processing
- Improved overall throughput

Requirements for Zero-Copy
==========================

For zero-copy to function properly with PRU_ICSSG, ensure:

1. **Driver Support**: Verify the PRU_ICSSG driver is loaded with zero-copy support enabled
2. **Memory Alignment**: Buffer addresses must be properly aligned to page boundaries
3. **UMEM Configuration**: The UMEM area must be correctly configured:

   - Properly aligned memory allocation
   - Sufficient number of packet buffers
   - Appropriate buffer sizes

4. **Hugepages**: Using hugepages for UMEM allocation is recommended for optimal performance
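
User space opts in to zero-copy when binding the AF_XDP socket. A minimal fragment extending the socket-setup sketch above (``fd`` is the AF_XDP socket, ``eth0`` an assumed interface name); the bind fails if the driver cannot provide zero-copy, so applications commonly retry with ``XDP_COPY``:

.. code-block:: c

   #include <linux/if_xdp.h>
   #include <net/if.h>
   #include <sys/socket.h>

   /* Request zero-copy explicitly; with no flags the kernel tries
    * zero-copy first and silently falls back to copy mode. */
   static int bind_zerocopy(int fd)
   {
           struct sockaddr_xdp sxdp = {
                   .sxdp_family = AF_XDP,
                   .sxdp_ifindex = if_nametoindex("eth0"),
                   .sxdp_queue_id = 0,
                   .sxdp_flags = XDP_ZEROCOPY,
           };
           return bind(fd, (struct sockaddr *)&sxdp, sizeof(sxdp));
   }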

Performance Comparison
======================

Performance testing shows that zero-copy mode can provide substantial throughput improvements compared to copy mode.

The open-source `xdpsock <https://github.com/xdp-project/bpf-examples/tree/main/AF_XDP-example>`_ tool was used for testing XDP zero-copy.
AF_XDP performance with 64-byte packets, in kpps:

.. list-table::
   :header-rows: 1

   * - Benchmark
     - XDP-SKB
     - XDP-Native
     - XDP-Native (Zero-Copy)
   * - rxdrop
     - 253
     - 473
     - 656
   * - txonly
     - 350
     - 354
     - 855

Performance Considerations
==========================

When implementing XDP applications, consider these performance factors (a poll-driven receive loop illustrating factor 3 is sketched after the list):

1. **Memory Alignment**: Buffers should be aligned to page boundaries for optimal performance
2. **Batch Processing**: Process multiple packets in batches when possible
3. **Poll Mode**: Use poll() or similar mechanisms to avoid blocking on socket operations
4. **Core Affinity**: Bind application threads to specific CPU cores to reduce cache contention
5. **NUMA Awareness**: Consider NUMA topology when allocating memory for packet buffers
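
A minimal sketch of the poll-driven approach, assuming ``xsk_fd`` is a bound AF_XDP socket and ``drain_rx_ring()`` is a hypothetical helper that consumes a batch of RX descriptors:

.. code-block:: c

   #include <poll.h>

   void drain_rx_ring(void);   /* hypothetical: process one batch of packets */

   void rx_loop(int xsk_fd)
   {
           struct pollfd pfd = { .fd = xsk_fd, .events = POLLIN };

           for (;;) {
                   /* Sleep until descriptors land on the RX ring
                    * instead of busy-polling and burning a core. */
                   if (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLIN))
                           drain_rx_ring();
           }
   }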