-
Notifications
You must be signed in to change notification settings - Fork 10
[LTS 8.6 RT] net: sched: sch_qfq: Fix UAF in qfq_dequeue() #117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LTS 8.6 RT] net: sched: sch_qfq: Fix UAF in qfq_dequeue() #117
Conversation
jira VULN-6729 cve CVE-2023-4921 commit-author valis <[email protected]> commit 8fc134f When the plug qdisc is used as a class of the qfq qdisc it could trigger a UAF. This issue can be reproduced with following commands: tc qdisc add dev lo root handle 1: qfq tc class add dev lo parent 1: classid 1:1 qfq weight 1 maxpkt 512 tc qdisc add dev lo parent 1:1 handle 2: plug tc filter add dev lo parent 1: basic classid 1:1 ping -c1 127.0.0.1 and boom: [ 285.353793] BUG: KASAN: slab-use-after-free in qfq_dequeue+0xa7/0x7f0 [ 285.354910] Read of size 4 at addr ffff8880bad312a8 by task ping/144 [ 285.355903] [ 285.356165] CPU: 1 PID: 144 Comm: ping Not tainted 6.5.0-rc3+ ctrliq#4 [ 285.357112] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 [ 285.358376] Call Trace: [ 285.358773] <IRQ> [ 285.359109] dump_stack_lvl+0x44/0x60 [ 285.359708] print_address_description.constprop.0+0x2c/0x3c0 [ 285.360611] kasan_report+0x10c/0x120 [ 285.361195] ? qfq_dequeue+0xa7/0x7f0 [ 285.361780] qfq_dequeue+0xa7/0x7f0 [ 285.362342] __qdisc_run+0xf1/0x970 [ 285.362903] net_tx_action+0x28e/0x460 [ 285.363502] __do_softirq+0x11b/0x3de [ 285.364097] do_softirq.part.0+0x72/0x90 [ 285.364721] </IRQ> [ 285.365072] <TASK> [ 285.365422] __local_bh_enable_ip+0x77/0x90 [ 285.366079] __dev_queue_xmit+0x95f/0x1550 [ 285.366732] ? __pfx_csum_and_copy_from_iter+0x10/0x10 [ 285.367526] ? __pfx___dev_queue_xmit+0x10/0x10 [ 285.368259] ? __build_skb_around+0x129/0x190 [ 285.368960] ? ip_generic_getfrag+0x12c/0x170 [ 285.369653] ? __pfx_ip_generic_getfrag+0x10/0x10 [ 285.370390] ? csum_partial+0x8/0x20 [ 285.370961] ? raw_getfrag+0xe5/0x140 [ 285.371559] ip_finish_output2+0x539/0xa40 [ 285.372222] ? __pfx_ip_finish_output2+0x10/0x10 [ 285.372954] ip_output+0x113/0x1e0 [ 285.373512] ? __pfx_ip_output+0x10/0x10 [ 285.374130] ? icmp_out_count+0x49/0x60 [ 285.374739] ? __pfx_ip_finish_output+0x10/0x10 [ 285.375457] ip_push_pending_frames+0xf3/0x100 [ 285.376173] raw_sendmsg+0xef5/0x12d0 [ 285.376760] ? do_syscall_64+0x40/0x90 [ 285.377359] ? __static_call_text_end+0x136578/0x136578 [ 285.378173] ? do_syscall_64+0x40/0x90 [ 285.378772] ? kasan_enable_current+0x11/0x20 [ 285.379469] ? __pfx_raw_sendmsg+0x10/0x10 [ 285.380137] ? __sock_create+0x13e/0x270 [ 285.380673] ? __sys_socket+0xf3/0x180 [ 285.381174] ? __x64_sys_socket+0x3d/0x50 [ 285.381725] ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 285.382425] ? __rcu_read_unlock+0x48/0x70 [ 285.382975] ? ip4_datagram_release_cb+0xd8/0x380 [ 285.383608] ? __pfx_ip4_datagram_release_cb+0x10/0x10 [ 285.384295] ? preempt_count_sub+0x14/0xc0 [ 285.384844] ? __list_del_entry_valid+0x76/0x140 [ 285.385467] ? _raw_spin_lock_bh+0x87/0xe0 [ 285.386014] ? __pfx__raw_spin_lock_bh+0x10/0x10 [ 285.386645] ? release_sock+0xa0/0xd0 [ 285.387148] ? preempt_count_sub+0x14/0xc0 [ 285.387712] ? freeze_secondary_cpus+0x348/0x3c0 [ 285.388341] ? aa_sk_perm+0x177/0x390 [ 285.388856] ? __pfx_aa_sk_perm+0x10/0x10 [ 285.389441] ? check_stack_object+0x22/0x70 [ 285.390032] ? inet_send_prepare+0x2f/0x120 [ 285.390603] ? __pfx_inet_sendmsg+0x10/0x10 [ 285.391172] sock_sendmsg+0xcc/0xe0 [ 285.391667] __sys_sendto+0x190/0x230 [ 285.392168] ? __pfx___sys_sendto+0x10/0x10 [ 285.392727] ? kvm_clock_get_cycles+0x14/0x30 [ 285.393328] ? set_normalized_timespec64+0x57/0x70 [ 285.393980] ? _raw_spin_unlock_irq+0x1b/0x40 [ 285.394578] ? __x64_sys_clock_gettime+0x11c/0x160 [ 285.395225] ? __pfx___x64_sys_clock_gettime+0x10/0x10 [ 285.395908] ? _copy_to_user+0x3e/0x60 [ 285.396432] ? exit_to_user_mode_prepare+0x1a/0x120 [ 285.397086] ? syscall_exit_to_user_mode+0x22/0x50 [ 285.397734] ? do_syscall_64+0x71/0x90 [ 285.398258] __x64_sys_sendto+0x74/0x90 [ 285.398786] do_syscall_64+0x64/0x90 [ 285.399273] ? exit_to_user_mode_prepare+0x1a/0x120 [ 285.399949] ? syscall_exit_to_user_mode+0x22/0x50 [ 285.400605] ? do_syscall_64+0x71/0x90 [ 285.401124] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 285.401807] RIP: 0033:0x495726 [ 285.402233] Code: ff ff ff f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 2c 00 00 00 0f 09 [ 285.404683] RSP: 002b:00007ffcc25fb618 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 285.405677] RAX: ffffffffffffffda RBX: 0000000000000040 RCX: 0000000000495726 [ 285.406628] RDX: 0000000000000040 RSI: 0000000002518750 RDI: 0000000000000000 [ 285.407565] RBP: 00000000005205ef R08: 00000000005f8838 R09: 000000000000001c [ 285.408523] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000002517634 [ 285.409460] R13: 00007ffcc25fb6f0 R14: 0000000000000003 R15: 0000000000000000 [ 285.410403] </TASK> [ 285.410704] [ 285.410929] Allocated by task 144: [ 285.411402] kasan_save_stack+0x1e/0x40 [ 285.411926] kasan_set_track+0x21/0x30 [ 285.412442] __kasan_slab_alloc+0x55/0x70 [ 285.412973] kmem_cache_alloc_node+0x187/0x3d0 [ 285.413567] __alloc_skb+0x1b4/0x230 [ 285.414060] __ip_append_data+0x17f7/0x1b60 [ 285.414633] ip_append_data+0x97/0xf0 [ 285.415144] raw_sendmsg+0x5a8/0x12d0 [ 285.415640] sock_sendmsg+0xcc/0xe0 [ 285.416117] __sys_sendto+0x190/0x230 [ 285.416626] __x64_sys_sendto+0x74/0x90 [ 285.417145] do_syscall_64+0x64/0x90 [ 285.417624] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 285.418306] [ 285.418531] Freed by task 144: [ 285.418960] kasan_save_stack+0x1e/0x40 [ 285.419469] kasan_set_track+0x21/0x30 [ 285.419988] kasan_save_free_info+0x27/0x40 [ 285.420556] ____kasan_slab_free+0x109/0x1a0 [ 285.421146] kmem_cache_free+0x1c2/0x450 [ 285.421680] __netif_receive_skb_core+0x2ce/0x1870 [ 285.422333] __netif_receive_skb_one_core+0x97/0x140 [ 285.423003] process_backlog+0x100/0x2f0 [ 285.423537] __napi_poll+0x5c/0x2d0 [ 285.424023] net_rx_action+0x2be/0x560 [ 285.424510] __do_softirq+0x11b/0x3de [ 285.425034] [ 285.425254] The buggy address belongs to the object at ffff8880bad31280 [ 285.425254] which belongs to the cache skbuff_head_cache of size 224 [ 285.426993] The buggy address is located 40 bytes inside of [ 285.426993] freed 224-byte region [ffff8880bad31280, ffff8880bad31360) [ 285.428572] [ 285.428798] The buggy address belongs to the physical page: [ 285.429540] page:00000000f4b77674 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xbad31 [ 285.430758] flags: 0x100000000000200(slab|node=0|zone=1) [ 285.431447] page_type: 0xffffffff() [ 285.431934] raw: 0100000000000200 ffff88810094a8c0 dead000000000122 0000000000000000 [ 285.432757] raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000 [ 285.433562] page dumped because: kasan: bad access detected [ 285.434144] [ 285.434320] Memory state around the buggy address: [ 285.434828] ffff8880bad31180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 285.435580] ffff8880bad31200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 285.436264] >ffff8880bad31280: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 285.436777] ^ [ 285.437106] ffff8880bad31300: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ 285.437616] ffff8880bad31380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 285.438126] ================================================================== [ 285.438662] Disabling lock debugging due to kernel taint Fix this by: 1. Changing sch_plug's .peek handler to qdisc_peek_dequeued(), a function compatible with non-work-conserving qdiscs 2. Checking the return value of qdisc_dequeue_peeked() in sch_qfq. Fixes: 462dbc9 ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost") Reported-by: valis <[email protected]> Signed-off-by: valis <[email protected]> Signed-off-by: Jamal Hadi Salim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]> (cherry picked from commit 8fc134f) Signed-off-by: Marcin Wcisło <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Starting github runners too
EDIT: I didn't notice this was in draft ahead of time.
Need to add the selftests, they're on the way |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In line with previous commits in the CVE series
The reproduction steps: 1. create a tun interface 2. enable l2 bearer 3. TIPC_NL_UDP_GET_REMOTEIP with media name set to tun tipc: Started in network mode tipc: Node identity 8af312d38a21, cluster identity 4711 tipc: Enabled bearer <eth:syz_tun>, priority 1 Oops: general protection fault KASAN: null-ptr-deref in range CPU: 1 UID: 1000 PID: 559 Comm: poc Not tainted 6.16.0-rc1+ #117 PREEMPT Hardware name: QEMU Ubuntu 24.04 PC RIP: 0010:tipc_udp_nl_dump_remoteip+0x4a4/0x8f0 the ub was in fact a struct dev. when bid != 0 && skip_cnt != 0, bearer_list[bid] may be NULL or other media when other thread changes it. fix this by checking media_id. Fixes: 832629c ("tipc: add UDP remoteip dump to netlink API") Signed-off-by: Haixia Qu <[email protected]> Reviewed-by: Tung Nguyen <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
[ Upstream commit f82727a ] The reproduction steps: 1. create a tun interface 2. enable l2 bearer 3. TIPC_NL_UDP_GET_REMOTEIP with media name set to tun tipc: Started in network mode tipc: Node identity 8af312d38a21, cluster identity 4711 tipc: Enabled bearer <eth:syz_tun>, priority 1 Oops: general protection fault KASAN: null-ptr-deref in range CPU: 1 UID: 1000 PID: 559 Comm: poc Not tainted 6.16.0-rc1+ #117 PREEMPT Hardware name: QEMU Ubuntu 24.04 PC RIP: 0010:tipc_udp_nl_dump_remoteip+0x4a4/0x8f0 the ub was in fact a struct dev. when bid != 0 && skip_cnt != 0, bearer_list[bid] may be NULL or other media when other thread changes it. fix this by checking media_id. Fixes: 832629c ("tipc: add UDP remoteip dump to netlink API") Signed-off-by: Haixia Qu <[email protected]> Reviewed-by: Tung Nguyen <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
JIRA: https://issues.redhat.com/browse/RHEL-96627 Tested: compile only commit f82727a Author: Haixia Qu <[email protected]> Date: Tue Jun 17 05:56:24 2025 +0000 tipc: fix null-ptr-deref when acquiring remote ip of ethernet bearer The reproduction steps: 1. create a tun interface 2. enable l2 bearer 3. TIPC_NL_UDP_GET_REMOTEIP with media name set to tun tipc: Started in network mode tipc: Node identity 8af312d38a21, cluster identity 4711 tipc: Enabled bearer <eth:syz_tun>, priority 1 Oops: general protection fault KASAN: null-ptr-deref in range CPU: 1 UID: 1000 PID: 559 Comm: poc Not tainted 6.16.0-rc1+ #117 PREEMPT Hardware name: QEMU Ubuntu 24.04 PC RIP: 0010:tipc_udp_nl_dump_remoteip+0x4a4/0x8f0 the ub was in fact a struct dev. when bid != 0 && skip_cnt != 0, bearer_list[bid] may be NULL or other media when other thread changes it. fix this by checking media_id. Fixes: 832629c ("tipc: add UDP remoteip dump to netlink API") Signed-off-by: Haixia Qu <[email protected]> Reviewed-by: Tung Nguyen <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Xin Long <[email protected]>
JIRA: https://issues.redhat.com/browse/RHEL-115652 Tested: compile only commit f82727a Author: Haixia Qu <[email protected]> Date: Tue Jun 17 05:56:24 2025 +0000 tipc: fix null-ptr-deref when acquiring remote ip of ethernet bearer The reproduction steps: 1. create a tun interface 2. enable l2 bearer 3. TIPC_NL_UDP_GET_REMOTEIP with media name set to tun tipc: Started in network mode tipc: Node identity 8af312d38a21, cluster identity 4711 tipc: Enabled bearer <eth:syz_tun>, priority 1 Oops: general protection fault KASAN: null-ptr-deref in range CPU: 1 UID: 1000 PID: 559 Comm: poc Not tainted 6.16.0-rc1+ #117 PREEMPT Hardware name: QEMU Ubuntu 24.04 PC RIP: 0010:tipc_udp_nl_dump_remoteip+0x4a4/0x8f0 the ub was in fact a struct dev. when bid != 0 && skip_cnt != 0, bearer_list[bid] may be NULL or other media when other thread changes it. fix this by checking media_id. Fixes: 832629c ("tipc: add UDP remoteip dump to netlink API") Signed-off-by: Haixia Qu <[email protected]> Reviewed-by: Tung Nguyen <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Xin Long <[email protected]>
CVE-2023-4921
VULN-6729
Problem
https://www.cve.org/CVERecord?id=CVE-2023-4921
Solution
A single commit was identified as a fix for this issue: 8fc134fee27f2263988ae38920bc03da416b03d8
kABI check: omitted (unstable ABI of RT kernels)
Boot test: passed
See Specific tests for implied boot test passing.
Kselftests: passed relative
Methodology
Source-compiled (
e1a9851e6068a6ef800ec6b2b48a7c243882ed06
) kselftests suite was used.Coverage (including tests skipped during execution)
android
,breakpoints
,capabilities
,core
,cpu-hotplug
,cpufreq
,efivarfs
,exec
,filesystems
,firmware
,fpu
,ftrace
,futex
,gpio
,intel_pstate
,ipc
,kcmp
,kvm
,lib
,livepatch
,membarrier
,memfd
,memory-hotplug
,mount
,net
,net/forwarding
,net/mptcp
,netfilter
,nsfs
,proc
,pstore
,ptrace
,rseq
,rtc
,sgx
,sigaltstack
,size
,splice
,static_keys
,sync
,sysctl
,tc-testing
,timens
,timers
,tpm2
,user
,vm
,x86
,zram
Reference
ciqlts8_6-rt
(e71882f570f197b48b8bc1083d12926abbbac9c1
)Three test runs were conducted on the reference kernel.
kselftests–src–ciqlts8_6-rt–run1.log
kselftests–src–ciqlts8_6-rt–run2.log
kselftests–src–ciqlts8_6-rt–run3.log
Patch
Two test runs were conducted on the patched kernel.
kselftests–src–ciqlts8_6-rt-CVE-2023-4921–run1.log
kselftests–src–ciqlts8_6-rt-CVE-2023-4921–run2.log
Comparison
Overview of the results, reduced to the differences:
Tests
kvm:hardware_disable_test
,net/mptcp:mptcp_join.sh
show inconsistent behavior within the reference kernel testing itself. Thenet:gro.sh
test was known to give inconsistent results before, like for many other Rocky versions, even if it consistently failed in this testing batch.Additionally, the
netfilter:nft_trans_stress.sh
test was identified to be problematic: it may hang the machine, potentially indefinitely. It was interrupted after 10 minutes and was responsible for reduced scope ofkselftests--src--ciqlts8_6-rt--run3.log
test.Specific tests: passed
Bug replication
The bug can be replicated with the following commands, as mention in the commit's message:
The tests were performed on the referential and patched kernel.
Prerequisites
The
tc
commands above require the following kernel options to be enabled:CONFIG_NET_SCHED
,CONFIG_NET_SCH_INGRESS
,CONFIG_NET_SCH_QFQ
,CONFIG_NET_SCH_PLUG
,CONFIG_NET_CLS_BASIC
.All of them are enabled by default in the
configs/kernel-rt-4.18.0-x86_64.config
configuration file for the testedx86_64
platform.Reference
ciqlts8_6-rt
(e71882f570f197b48b8bc1083d12926abbbac9c1
)Bug replicated successfully. Kernel crashed and machine automatically rebooted, although no crash report was observed as in the non-RT LTS 8.6 version.
Full log:
bug-replication–ciqlts8_6-rt.log
Patch
Patch efficacy verified successfully. Repeating the steps doesn't result in a crash and the machine doesn't reboot.
Full log:
bug-replication–ciqlts8_6-rt-CVE-2023-4921.log
A warning can be observed
issued at sch_qfq.c. The work-conserving queue discipline is a qdisc which never leaves the outbound interface in idle unless the queue is empty, while non-work-conserving qdiscs may delay packets, for example to shape the traffic1. The
plug
qdisc, being the leaf in the hierarchical qdiscs configuration used, is non-work-conserving, as it is able to suspend the packet flow. Multiple types of qdiscs are apparently not designed to work with non-work-conserving child qdiscs and issue similar warnings onskb == NULL
condition: ets, hfsc, htb, drr. See also the message in b00355db3f88d96810a60011a30cfb2c3469409dand in 6d25d1dc76bf5943a5c1f4bb74d66d5eac58eb77, which includes
qfq
to the group aboveThe point is: the warning is related to the way the qdisc hierarchy has been defined in the bug replication script and not to the introduced changes.
Footnotes
1 https://tldp.org/HOWTO/Adv-Routing-HOWTO/lartc.qdisc.terminology.html