Commit e637731

Zheng Qixing authored and kawasaki committed
blk-cgroup: skip dying blkg in blkcg_activate_policy()
When switching IO schedulers on a block device, blkcg_activate_policy()
can race with concurrent blkcg deletion, leading to a use-after-free of
the blkg.

T1:                                          T2:
elv_iosched_store                            blkg_destroy
elevator_switch                              kill(&blkg->refcnt) // blkg->refcnt=0
...                                          blkg_release // call_rcu
blkcg_activate_policy                        __blkg_release
list for blkg                                blkg_free
                                             blkg_free_workfn
                                             ->pd_free_fn(pd)
blkg_get(blkg) // blkg->refcnt=0->1
                                             list_del_init(&blkg->q_node)
                                             kfree(blkg)
blkg_put(pinned_blkg) // blkg->refcnt=1->0
blkg_release // call_rcu again
call_rcu(..., __blkg_release)

Fix this by replacing blkg_get() with blkg_tryget(), which fails if
the blkg's refcount has already reached zero. If blkg_tryget() fails,
skip processing this blkg since it's already being destroyed.

The uaf call trace is as follows:

==================================================================
BUG: KASAN: slab-use-after-free in rcu_accelerate_cbs+0x114/0x120
Read of size 8 at addr ffff88815a20b5d8 by task bash/1068
CPU: 0 PID: 1068 Comm: bash Not tainted 6.6.0-g6918ead378dc-dirty #31
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
Call Trace:
 <IRQ>
 rcu_accelerate_cbs+0x114/0x120
 rcu_report_qs_rdp+0x1fb/0x3e0
 rcu_core+0x4d7/0x6f0
 handle_softirqs+0x198/0x550
 irq_exit_rcu+0x130/0x190
 sysvec_apic_timer_interrupt+0x6e/0x90
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x16/0x20

Allocated by task 1031:
 kasan_save_stack+0x1c/0x40
 kasan_set_track+0x21/0x30
 __kasan_kmalloc+0x8b/0x90
 blkg_alloc+0xb6/0x9c0
 blkg_create+0x8c6/0x1010
 blkg_lookup_create+0x2ca/0x660
 bio_associate_blkg_from_css+0xfb/0x4e0
 bio_associate_blkg+0x62/0xf0
 bio_init+0x272/0x8d0
 bio_alloc_bioset+0x45a/0x760
 ext4_bio_write_folio+0x68e/0x10d0
 mpage_submit_folio+0x14a/0x2b0
 mpage_process_page_bufs+0x1b1/0x390
 mpage_prepare_extent_to_map+0xa91/0x1060
 ext4_do_writepages+0x948/0x1c50
 ext4_writepages+0x23f/0x4a0
 do_writepages+0x162/0x5e0
 filemap_fdatawrite_wbc+0x11a/0x180
 __filemap_fdatawrite_range+0x9d/0xd0
 file_write_and_wait_range+0x91/0x110
 ext4_sync_file+0x1c1/0xaa0
 __x64_sys_fsync+0x55/0x90
 do_syscall_64+0x55/0x100
 entry_SYSCALL_64_after_hwframe+0x78/0xe2

Freed by task 24:
 kasan_save_stack+0x1c/0x40
 kasan_set_track+0x21/0x30
 kasan_save_free_info+0x27/0x40
 __kasan_slab_free+0x106/0x180
 __kmem_cache_free+0x162/0x350
 process_one_work+0x573/0xd30
 worker_thread+0x67f/0xc30
 kthread+0x28b/0x350
 ret_from_fork+0x30/0x70
 ret_from_fork_asm+0x1b/0x30

Fixes: f1c006f ("blk-cgroup: synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()")
Signed-off-by: Zheng Qixing <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
1 parent 4cdbcc5 commit e637731

1 file changed: +2 -1 lines changed

block/blk-cgroup.c

Lines changed: 2 additions & 1 deletion
@@ -1645,9 +1645,10 @@ int blkcg_activate_policy(struct gendisk *disk, const struct blkcg_policy *pol)
 			 * GFP_NOWAIT failed.  Free the existing one and
 			 * prealloc for @blkg w/ GFP_KERNEL.
 			 */
+			if (!blkg_tryget(blkg))
+				continue;
 			if (pinned_blkg)
 				blkg_put(pinned_blkg);
-			blkg_get(blkg);
 			pinned_blkg = blkg;
 
 			spin_unlock_irq(&q->queue_lock);
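
The difference between an unconditional get and a tryget can be illustrated with a small user-space sketch. This is not the kernel's refcount implementation: struct toy_ref and the toy_* and replay_* helpers are hypothetical names invented for this illustration, built on C11 atomics, and the two replay functions run the interleaving from the commit message sequentially in a single thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object such as a blkg. */
struct toy_ref {
	atomic_int count;   /* number of live references */
	int release_calls;  /* how many times the release path ran */
};

static void toy_ref_init(struct toy_ref *r)
{
	atomic_init(&r->count, 1);  /* the creator holds the initial reference */
	r->release_calls = 0;
}

/* Unconditional get: increments even if the count already dropped to zero. */
static void toy_get(struct toy_ref *r)
{
	atomic_fetch_add(&r->count, 1);
}

/* Conditional get: succeeds only while the count is still non-zero. */
static bool toy_tryget(struct toy_ref *r)
{
	int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop one reference; the release path runs when the count reaches zero. */
static void toy_put(struct toy_ref *r)
{
	int old = atomic_fetch_sub(&r->count, 1);

	assert(old > 0);  /* a put without a matching reference is a bug */
	if (old == 1)
		r->release_calls++;
}

/* Replays the buggy ordering; returns how many times release ran. */
static int replay_with_get(void)
{
	struct toy_ref r;

	toy_ref_init(&r);
	toy_put(&r);  /* T2: destroy path, count 1 -> 0, first release */
	toy_get(&r);  /* T1: count 0 -> 1, the dying object is "revived" */
	toy_put(&r);  /* T1: count 1 -> 0, release runs a second time */
	return r.release_calls;
}

/* Replays the fixed ordering; the dying object is skipped. */
static int replay_with_tryget(void)
{
	struct toy_ref r;

	toy_ref_init(&r);
	toy_put(&r);             /* T2: destroy path, count 1 -> 0 */
	if (toy_tryget(&r))      /* T1: fails because the count is zero */
		toy_put(&r);
	return r.release_calls;
}
```

In this sketch, replay_with_get() reports two release calls, which corresponds to the second call_rcu() on an object that is already being freed, while replay_with_tryget() reports one because the dying object is skipped.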
