Commit 9d31d23

Merge tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
 "Core:

   - bpf:
        - allow bpf programs calling kernel functions (initially to
          reuse TCP congestion control implementations)
        - enable task local storage for tracing programs - remove the
          need to store per-task state in hash maps, and allow tracing
          programs access to task local storage previously added for
          BPF_LSM
        - add bpf_for_each_map_elem() helper, allowing programs to walk
          all map elements in a more robust and easier to verify fashion
        - sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT
          redirection
        - lpm: add support for batched ops in LPM trie
        - add BTF_KIND_FLOAT support - mostly to allow use of BTF on
          s390 which has floats in its header files
        - improve BPF syscall documentation and extend the use of kdoc
          parsing scripts we already employ for bpf-helpers
        - libbpf, bpftool: support static linking of BPF ELF files
        - improve support for encapsulation of L2 packets

   - xdp: restructure redirect actions to avoid a runtime lookup,
     improving performance by 4-8% in microbenchmarks

   - xsk: build skb by page (aka generic zerocopy xmit) - improve
     performance of software AF_XDP path by 33% for devices which don't
     need headers in the linear skb part (e.g. virtio)

   - nexthop: resilient next-hop groups - improve path stability on
     next-hops group changes (incl. offload for mlxsw)

   - ipv6: segment routing: add support for IPv4 decapsulation

   - icmp: add support for RFC 8335 extended PROBE messages

   - inet: use bigger hash table for IP ID generation

   - tcp: deal better with delayed TX completions - make sure we don't
     give up on fast TCP retransmissions only because driver is slow in
     reporting that it completed transmitting the original

   - tcp: reorder tcp_congestion_ops for better cache locality

   - mptcp:
        - add sockopt support for common TCP options
        - add support for common TCP msg flags
        - include multiple address ids in RM_ADDR
        - add reset option support for resetting one subflow

   - udp: GRO L4 improvements - improve 'forward' / 'frag_list'
     co-existence with UDP tunnel GRO, allowing the first to take place
     correctly even for encapsulated UDP traffic

   - micro-optimize dev_gro_receive() and flow dissection, avoid
     retpoline overhead on VLAN and TEB GRO

   - use less memory for sysctls, add a new sysctl type, to allow using
     u8 instead of "int" and "long" and shrink networking sysctls

   - veth: allow GRO without XDP - this allows aggregating UDP packets
     before handing them off to routing, bridge, OvS, etc.

   - allow specifying ifindex when device is moved to another namespace

   - netfilter:
        - nft_socket: add support for cgroupsv2
        - nftables: add catch-all set element - special element used to
          define a default action in case normal lookup missed
        - use net_generic infra in many modules to avoid allocating
          per-ns memory unnecessarily

   - xps: improve the xps handling to avoid potential out-of-bound
     accesses and use-after-free when XPS change race with other
     re-configuration under traffic

   - add a config knob to turn off per-cpu netdev refcnt to catch
     underflows in testing

  Device APIs:

   - add WWAN subsystem to organize the WWAN interfaces better and
     hopefully start driving towards more unified and vendor-independent
     APIs

   - ethtool:
        - add interface for reading IEEE MIB stats (incl. mlx5 and bnxt
          support)
        - allow network drivers to dump arbitrary SFP EEPROM data,
          current offset+length API was a poor fit for modern SFP which
          define EEPROM in terms of pages (incl. mlx5 support)

   - act_police, flow_offload: add support for packet-per-second
     policing (incl. offload for nfp)

   - psample: add additional metadata attributes like transit delay for
     packets sampled from switch HW (and corresponding egress and
     policy-based sampling in the mlxsw driver)

   - dsa: improve support for sandwiched LAGs with bridge and DSA

   - netfilter:
        - flowtable: use direct xmit in topologies with IP forwarding,
          bridging, vlans etc.
        - nftables: counter hardware offload support

   - Bluetooth:
        - improvements for firmware download w/ Intel devices
        - add support for reading AOSP vendor capabilities
        - add support for virtio transport driver

   - mac80211:
        - allow concurrent monitor iface and ethernet rx decap
        - set priority and queue mapping for injected frames

   - phy: add support for Clause-45 PHY Loopback

   - pci/iov: add sysfs MSI-X vector assignment interface to distribute
     MSI-X resources to VFs (incl. mlx5 support)

  New hardware/drivers:

   - dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port
     Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit
     interfaces.

   - dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and
     BCM63xx switches

   - Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches

   - ath11k: support for QCN9074, an 802.11ax device

   - Bluetooth: Broadcom BCM4330 and BCM4334

   - phy: Marvell 88X2222 transceiver support

   - mdio: add BCM6368 MDIO mux bus controller

   - r8152: support RTL8153 and RTL8156 (USB Ethernet) chips

   - mana: driver for Microsoft Azure Network Adapter (MANA)

   - Actions Semi Owl Ethernet MAC

   - can: driver for ETAS ES58X CAN/USB interfaces

  Pure driver changes:

   - add XDP support to: enetc, igc, stmmac

   - add AF_XDP support to: stmmac

   - virtio:
        - page_to_skb() use build_skb when there's sufficient tailroom
          (21% improvement for 1000B UDP frames)
        - support XDP even without dedicated Tx queues - share the Tx
          queues with the stack when necessary

   - mlx5:
        - flow rules: add support for mirroring with conntrack, matching
          on ICMP, GTP, flex filters and more
        - support packet sampling with flow offloads
        - persist uplink representor netdev across eswitch mode changes
        - allow coexistence of CQE compression and HW time-stamping
        - add ethtool extended link error state reporting

   - ice, iavf: support flow filters, UDP Segmentation Offload

   - dpaa2-switch:
        - move the driver out of staging
        - add spanning tree (STP) support
        - add rx copybreak support
        - add tc flower hardware offload on ingress traffic

   - ionic:
        - implement Rx page reuse
        - support HW PTP time-stamping

   - octeon: support TC hardware offloads - flower matching on ingress
     and egress ratelimiting.

   - stmmac:
        - add RX frame steering based on VLAN priority in tc flower
        - support frame preemption (FPE)
        - intel: add cross time-stamping freq difference adjustment

   - ocelot:
        - support forwarding of MRP frames in HW
        - support multiple bridges
        - support PTP Sync one-step timestamping

   - dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like
     learning, flooding etc.

   - ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350,
     SC7280 SoCs)

   - mt7601u: enable TDLS support

   - mt76:
        - add support for 802.3 rx frames (mt7915/mt7615)
        - mt7915 flash pre-calibration support
        - mt7921/mt7663 runtime power management fixes"

* tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2451 commits)
  net: selftest: fix build issue if INET is disabled
  net: netrom: nr_in: Remove redundant assignment to ns
  net: tun: Remove redundant assignment to ret
  net: phy: marvell: add downshift support for M88E1240
  net: dsa: ksz: Make reg_mib_cnt a u8 as it never exceeds 255
  net/sched: act_ct: Remove redundant ct get and check
  icmp: standardize naming of RFC 8335 PROBE constants
  bpf, selftests: Update array map tests for per-cpu batched ops
  bpf: Add batched ops support for percpu array
  bpf: Implement formatted output helpers with bstr_printf
  seq_file: Add a seq_bprintf function
  sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues
  net:nfc:digital: Fix a double free in digital_tg_recv_dep_req
  net: fix a concurrency bug in l2tp_tunnel_register()
  net/smc: Remove redundant assignment to rc
  mpls: Remove redundant assignment to err
  llc2: Remove redundant assignment to rc
  net/tls: Remove redundant initialization of record
  rds: Remove redundant assignment to nr_sig
  dt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0
  ...
2 parents 635de95 + 4a52dd8 commit 9d31d23

1,895 files changed

+121451
-35419
lines changed

Documentation/ABI/testing/sysfs-bus-pci

Lines changed: 29 additions & 0 deletions
@@ -378,3 +378,32 @@ Description:
                 The value comes from the PCI kernel device state and can be one
                 of: "unknown", "error", "D0", D1", "D2", "D3hot", "D3cold".
                 The file is read only.
+
+What:           /sys/bus/pci/devices/.../sriov_vf_total_msix
+Date:           January 2021
+Contact:        Leon Romanovsky <[email protected]>
+Description:
+                This file is associated with a SR-IOV physical function (PF).
+                It contains the total number of MSI-X vectors available for
+                assignment to all virtual functions (VFs) associated with PF.
+                The value will be zero if the device doesn't support this
+                functionality. For supported devices, the value will be
+                constant and won't be changed after MSI-X vectors assignment.
+
+What:           /sys/bus/pci/devices/.../sriov_vf_msix_count
+Date:           January 2021
+Contact:        Leon Romanovsky <[email protected]>
+Description:
+                This file is associated with a SR-IOV virtual function (VF).
+                It allows configuration of the number of MSI-X vectors for
+                the VF. This allows devices that have a global pool of MSI-X
+                vectors to optimally divide them between VFs based on VF usage.
+
+                The values accepted are:
+                 * > 0 - this number will be reported as the Table Size in the
+                         VF's MSI-X capability
+                 * < 0 - not valid
+                 * = 0 - will reset to the device default value
+
+                The file is writable if the PF is bound to a driver that
+                implements ->sriov_set_msix_vec_count().
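
The two attributes described in this hunk can be exercised from user space. The following is a minimal sketch in Python, not part of the commit: the helper names and the directory arguments are hypothetical, and the check that a request does not exceed `sriov_vf_total_msix` is an assumption made here for illustration (the ABI text only lists which values are accepted).

```python
from pathlib import Path


def read_total_msix(pf_dir: Path) -> int:
    """Return the MSI-X pool size the PF exposes for its VFs (0 if unsupported)."""
    return int((pf_dir / "sriov_vf_total_msix").read_text().strip())


def set_vf_msix_count(pf_dir: Path, vf_dir: Path, count: int) -> None:
    """Write a new MSI-X vector count for one VF after basic sanity checks."""
    if count < 0:
        # The ABI description lists negative values as not valid.
        raise ValueError("negative counts are not valid")
    total = read_total_msix(pf_dir)
    if total == 0:
        raise OSError("device does not support MSI-X vector assignment")
    if count > total:
        # Assumption for this sketch: never ask for more than the whole pool.
        raise ValueError(f"requested {count} vectors, pool has only {total}")
    # Writing 0 asks the device to reset to its default value.
    (vf_dir / "sriov_vf_msix_count").write_text(f"{count}\n")
```

On a real system `pf_dir` and `vf_dir` would be the PF and VF directories under `/sys/bus/pci/devices/`, and the write only succeeds when the PF driver implements `->sriov_set_msix_vec_count()`.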

Documentation/ABI/testing/sysfs-class-net-phydev

Lines changed: 12 additions & 0 deletions
@@ -51,3 +51,15 @@ Description:
                 Boolean value indicating whether the PHY device is used in
                 standalone mode, without a net_device associated, by PHYLINK.
                 Attribute created only when this is the case.
+
+What:           /sys/class/mdio_bus/<bus>/<device>/phy_dev_flags
+Date:           March 2021
+KernelVersion:  5.13
+
+Description:
+                32-bit hexadecimal number representing a bit mask of the
+                configuration bits passed from the consumer of the PHY
+                (Ethernet MAC, switch, etc.) to the PHY driver. The flags are
+                only used internally by the kernel and their placement are
+                not meant to be stable across kernel versions. This is intended
+                for facilitating the debugging of PHY drivers.
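
Since `phy_dev_flags` is described as a 32-bit hexadecimal bit mask, a debugging script might split it into bit positions. A minimal Python sketch, assuming the attribute text is a single hexadecimal number (the function name is hypothetical; the meaning of each bit is driver-internal and deliberately not interpreted here):

```python
from typing import List


def decode_phy_dev_flags(text: str) -> List[int]:
    """Return the positions of the bits set in a phy_dev_flags value."""
    value = int(text.strip(), 16)  # accepts input with or without a 0x prefix
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("phy_dev_flags is expected to fit in 32 bits")
    return [bit for bit in range(32) if value & (1 << bit)]
```

Because the flag placement is not stable across kernel versions, the resulting bit positions only make sense next to the source of the exact kernel being debugged.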

Documentation/admin-guide/sysctl/net.rst

Lines changed: 11 additions & 0 deletions
@@ -311,6 +311,17 @@ permit to distribute the load on several cpus.
 If set to 1 (default), timestamps are sampled as soon as possible, before
 queueing.
 
+netdev_unregister_timeout_secs
+------------------------------
+
+Unregister network device timeout in seconds.
+This option controls the timeout (in seconds) used to issue a warning while
+waiting for a network device refcount to drop to 0 during device
+unregistration. A lower value may be useful during bisection to detect
+a leaked reference faster. A larger value may be useful to prevent false
+warnings on slow/loaded systems.
+Default value is 10, minimum 1, maximum 3600.
+
 optmem_max
 ----------
 
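
The documented bounds for `netdev_unregister_timeout_secs` (default 10, minimum 1, maximum 3600) can be checked before a value is handed to the kernel. A minimal Python sketch; the helper name is hypothetical, and the `net.core.` prefix assumes the knob lives with the other network core options this part of net.rst documents:

```python
DEFAULT_SECS = 10
MIN_SECS = 1
MAX_SECS = 3600
# Assumed sysctl name: the option is documented among the net core options.
SYSCTL_NAME = "net.core.netdev_unregister_timeout_secs"


def sysctl_line(secs: int = DEFAULT_SECS) -> str:
    """Return a sysctl.conf style line after range-checking the timeout."""
    if not isinstance(secs, int) or not MIN_SECS <= secs <= MAX_SECS:
        raise ValueError(f"timeout must be an integer in [{MIN_SECS}, {MAX_SECS}]")
    return f"{SYSCTL_NAME} = {secs}"
```

For example, `sysctl_line(2)` yields a line suited to a bisection run where a leaked reference should be reported quickly, while a value near the maximum suits slow or heavily loaded test machines.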

Documentation/bpf/bpf_design_QA.rst

Lines changed: 15 additions & 0 deletions
@@ -258,3 +258,18 @@ Q: Can BPF functionality such as new program or map types, new
 helpers, etc be added out of kernel module code?
 
 A: NO.
+
+Q: Directly calling kernel function is an ABI?
+----------------------------------------------
+Q: Some kernel functions (e.g. tcp_slow_start) can be called
+by BPF programs. Do these kernel functions become an ABI?
+
+A: NO.
+
+The kernel function protos will change and the bpf programs will be
+rejected by the verifier. Also, for example, some of the bpf-callable
+kernel functions have already been used by other kernel tcp
+cc (congestion-control) implementations. If any of these kernel
+functions has changed, both the in-tree and out-of-tree kernel tcp cc
+implementations have to be changed. The same goes for the bpf
+programs and they have to be adjusted accordingly.

Documentation/bpf/bpf_devel_QA.rst

Lines changed: 21 additions & 9 deletions
@@ -29,7 +29,7 @@ list:
 This may also include issues related to XDP, BPF tracing, etc.
 
 Given netdev has a high volume of traffic, please also add the BPF
-maintainers to Cc (from kernel MAINTAINERS_ file):
+maintainers to Cc (from kernel ``MAINTAINERS`` file):
 
 * Alexei Starovoitov <[email protected]>
 * Daniel Borkmann <[email protected]>
@@ -234,21 +234,21 @@ be subject to change.
 
 Q: samples/bpf preference vs selftests?
 ---------------------------------------
-Q: When should I add code to `samples/bpf/`_ and when to BPF kernel
-selftests_ ?
+Q: When should I add code to ``samples/bpf/`` and when to BPF kernel
+selftests_?
 
 A: In general, we prefer additions to BPF kernel selftests_ rather than
-`samples/bpf/`_. The rationale is very simple: kernel selftests are
+``samples/bpf/``. The rationale is very simple: kernel selftests are
 regularly run by various bots to test for kernel regressions.
 
 The more test cases we add to BPF selftests, the better the coverage
 and the less likely it is that those could accidentally break. It is
 not that BPF kernel selftests cannot demo how a specific feature can
 be used.
 
-That said, `samples/bpf/`_ may be a good place for people to get started,
+That said, ``samples/bpf/`` may be a good place for people to get started,
 so it might be advisable that simple demos of features could go into
-`samples/bpf/`_, but advanced functional and corner-case testing rather
+``samples/bpf/``, but advanced functional and corner-case testing rather
 into kernel selftests.
 
 If your sample looks like a test case, then go for BPF kernel selftests
@@ -449,6 +449,19 @@ from source at
 
 https://github.com/acmel/dwarves
 
+pahole starts to use libbpf definitions and APIs since v1.13 after the
+commit 21507cd3e97b ("pahole: add libbpf as submodule under lib/bpf").
+It works well with the git repository because the libbpf submodule will
+use "git submodule update --init --recursive" to update.
+
+Unfortunately, the default github release source code does not contain
+libbpf submodule source code and this will cause build issues, the tarball
+from https://git.kernel.org/pub/scm/devel/pahole/pahole.git/ is same with
+github, you can get the source tarball with corresponding libbpf submodule
+codes from
+
+https://fedorapeople.org/~acme/dwarves
+
 Some distros have pahole version 1.16 packaged already, e.g.
 Fedora, Gentoo.
 
@@ -645,10 +658,9 @@ when:
 
 .. Links
 .. _Documentation/process/: https://www.kernel.org/doc/html/latest/process/
-.. _MAINTAINERS: ../../MAINTAINERS
 .. _netdev-FAQ: ../networking/netdev-FAQ.rst
-.. _samples/bpf/: ../../samples/bpf/
-.. _selftests: ../../tools/testing/selftests/bpf/
+.. _selftests:
+   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/
 .. _Documentation/dev-tools/kselftest.rst:
    https://www.kernel.org/doc/html/latest/dev-tools/kselftest.html
 .. _Documentation/bpf/btf.rst: btf.rst

Documentation/bpf/btf.rst

Lines changed: 15 additions & 2 deletions
@@ -84,6 +84,7 @@ sequentially and type id is assigned to each recognized type starting from id
     #define BTF_KIND_FUNC_PROTO     13      /* Function Proto       */
     #define BTF_KIND_VAR            14      /* Variable     */
     #define BTF_KIND_DATASEC        15      /* Section      */
+    #define BTF_KIND_FLOAT          16      /* Floating point       */
 
 Note that the type section encodes debug info, not just pure types.
 ``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram.
@@ -95,8 +96,8 @@ Each type contains the following common data::
         /* "info" bits arrangement
          * bits  0-15: vlen (e.g. # of struct's members)
          * bits 16-23: unused
-         * bits 24-27: kind (e.g. int, ptr, array...etc)
-         * bits 28-30: unused
+         * bits 24-28: kind (e.g. int, ptr, array...etc)
+         * bits 29-30: unused
          * bit     31: kind_flag, currently used by
          *             struct, union and fwd
          */
@@ -452,6 +453,18 @@ map definition.
   * ``offset``: the in-section offset of the variable
   * ``size``: the size of the variable in bytes
 
+2.2.16 BTF_KIND_FLOAT
+~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+ * ``name_off``: any valid offset
+ * ``info.kind_flag``: 0
+ * ``info.kind``: BTF_KIND_FLOAT
+ * ``info.vlen``: 0
+ * ``size``: the size of the float type in bytes: 2, 4, 8, 12 or 16.
+
+No additional type data follow ``btf_type``.
+
 3. BTF Kernel API
 *****************
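
The btf.rst hunks above widen the `kind` field to bits 24-28 and add `BTF_KIND_FLOAT` with value 16, which would not fit in the previous four-bit field. A minimal Python sketch of the bit layout and of the float encoding requirements, assuming a little-endian `struct btf_type` of three 32-bit words (`name_off`, `info`, `size`); the function names are hypothetical:

```python
import struct

BTF_KIND_FLOAT = 16
VALID_FLOAT_SIZES = (2, 4, 8, 12, 16)


def decode_info(info: int) -> dict:
    """Split the 32-bit "info" word according to the documented bit layout."""
    return {
        "vlen": info & 0xFFFF,          # bits 0-15
        "kind": (info >> 24) & 0x1F,    # bits 24-28 after this change
        "kind_flag": (info >> 31) & 1,  # bit 31
    }


def encode_float_type(name_off: int, size: int) -> bytes:
    """Pack a BTF_KIND_FLOAT btf_type; no extra type data follows it."""
    if size not in VALID_FLOAT_SIZES:
        raise ValueError("float size must be 2, 4, 8, 12 or 16 bytes")
    info = BTF_KIND_FLOAT << 24  # kind_flag = 0 and vlen = 0
    # Assumption: little-endian layout, chosen here only for illustration.
    return struct.pack("<III", name_off, info, size)
```

Decoding the `info` word of an encoded float with a four-bit mask (`0x0F`) gives 0, which is why the kind field had to grow by one bit before kind 16 could be represented.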

Documentation/bpf/index.rst

Lines changed: 6 additions & 3 deletions
@@ -12,9 +12,6 @@ BPF instruction-set.
 The Cilium project also maintains a `BPF and XDP Reference Guide`_
 that goes into great technical depth about the BPF Architecture.
 
-The primary info for the bpf syscall is available in the `man-pages`_
-for `bpf(2)`_.
-
 BPF Type Format (BTF)
 =====================
 
@@ -35,6 +32,12 @@ Two sets of Questions and Answers (Q&A) are maintained.
    bpf_design_QA
    bpf_devel_QA
 
+Syscall API
+===========
+
+The primary info for the bpf syscall is available in the `man-pages`_
+for `bpf(2)`_. For more information about the userspace API, see
+Documentation/userspace-api/ebpf/index.rst.
 
 Helper functions
 ================
Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/actions,owl-emac.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Actions Semi Owl SoCs Ethernet MAC Controller
+
+maintainers:
+  - Cristian Ciocaltea <[email protected]>
+
+description: |
+  This Ethernet MAC is used on the Owl family of SoCs from Actions Semi.
+  It provides the RMII and SMII interfaces and is compliant with the
+  IEEE 802.3 CSMA/CD standard, supporting both half-duplex and full-duplex
+  operation modes at 10/100 Mb/s data transfer rates.
+
+allOf:
+  - $ref: "ethernet-controller.yaml#"
+
+properties:
+  compatible:
+    oneOf:
+      - const: actions,owl-emac
+      - items:
+          - enum:
+              - actions,s500-emac
+          - const: actions,owl-emac
+
+  reg:
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1
+
+  clocks:
+    minItems: 2
+    maxItems: 2
+
+  clock-names:
+    additionalItems: false
+    items:
+      - const: eth
+      - const: rmii
+
+  resets:
+    maxItems: 1
+
+  actions,ethcfg:
+    $ref: /schemas/types.yaml#/definitions/phandle
+    description:
+      Phandle to the device containing custom config.
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - clocks
+  - clock-names
+  - resets
+  - phy-mode
+  - phy-handle
+
+unevaluatedProperties: false
+
+examples:
+  - |
+    #include <dt-bindings/clock/actions,s500-cmu.h>
+    #include <dt-bindings/interrupt-controller/arm-gic.h>
+    #include <dt-bindings/reset/actions,s500-reset.h>
+
+    ethernet@b0310000 {
+        compatible = "actions,s500-emac", "actions,owl-emac";
+        reg = <0xb0310000 0x10000>;
+        interrupts = <GIC_SPI 0 IRQ_TYPE_LEVEL_HIGH>;
+        clocks = <&cmu 59 /*CLK_ETHERNET*/>, <&cmu CLK_RMII_REF>;
+        clock-names = "eth", "rmii";
+        resets = <&cmu RESET_ETHERNET>;
+        phy-mode = "rmii";
+        phy-handle = <&eth_phy>;
+
+        mdio {
+            #address-cells = <1>;
+            #size-cells = <0>;
+
+            eth_phy: ethernet-phy@3 {
+                reg = <0x3>;
+                interrupt-parent = <&sirq>;
+                interrupts = <0 IRQ_TYPE_LEVEL_LOW>;
+            };
+        };
+    };

Documentation/devicetree/bindings/net/brcm,bcm4908-enet.yaml

Lines changed: 13 additions & 4 deletions
@@ -22,10 +22,18 @@ properties:
     maxItems: 1
 
   interrupts:
-    description: RX interrupt
+    minItems: 1
+    maxItems: 2
+    items:
+      - description: RX interrupt
+      - description: TX interrupt
 
   interrupt-names:
-    const: rx
+    minItems: 1
+    maxItems: 2
+    items:
+      - const: rx
+      - const: tx
 
 required:
   - reg
@@ -43,6 +51,7 @@ examples:
         compatible = "brcm,bcm4908-enet";
         reg = <0x80002000 0x1000>;
 
-        interrupts = <GIC_SPI 86 IRQ_TYPE_LEVEL_HIGH>;
-        interrupt-names = "rx";
+        interrupts = <GIC_SPI 86 IRQ_TYPE_LEVEL_HIGH>,
+                     <GIC_SPI 87 IRQ_TYPE_LEVEL_HIGH>;
+        interrupt-names = "rx", "tx";
     };
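
The updated `interrupt-names` constraint is positional: one or two entries, the first being "rx" and the optional second being "tx". A minimal Python sketch of that rule, written by hand for illustration rather than taken from the devicetree schema tooling (the function name is hypothetical):

```python
from typing import Sequence

# Expected names by position, as listed under "items" in the binding.
EXPECTED_NAMES = ("rx", "tx")


def check_interrupt_names(names: Sequence[str]) -> bool:
    """Return True if names satisfies minItems 1, maxItems 2 and the item order."""
    if not 1 <= len(names) <= 2:
        return False
    # Each provided entry must match the constant at the same position.
    return tuple(names) == EXPECTED_NAMES[: len(names)]
```

Under this rule a node that only lists "rx" stays valid, so device trees written against the earlier single-interrupt binding still pass, while a node that lists "tx" alone does not.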
