Skip to content

HIGHMEM violation by kmap in zfs_uiomove_bvec_impl #15668

@sempervictus

Description

@sempervictus

System information

Type Version/Name
Distribution Name arch
Distribution Version current
Kernel Version 6.6 (grsec)
Architecture x64
OpenZFS Version 2.2.2

Describe the problem you're observing

Upstream Linux doesn't guard memory nearly as well as Grsecurity, "allowing" kmaping bv_offset + skip > PAGE_SIZE under x86_64; this will likely crash under x86 however. The "hacked together in a few minutes fix" (by much better hackers) currently keeping my system alive changes the kmap and memcpys to use PAGE_MASK magic to ensure we don't violate the boundary; but there should probably be a zfs-correct solution to this situation as we're occasionally performing very low level memory operations "illegally" in kernels without enforcement for the constraint which will eventually occur (and may have impacts on other architectures).

Given that this bug was found in UIO memory management/movement code, i'm curious as to whether it poses any risk to data integrity if actually tripping a fault (and not crashing but losing an IO or segment thereof) under other runtime conditions since the read-back could similarly fail to kmap incorrectly.

Describe how to reproduce the problem

  1. Use a kernel which enforces kmap constraint for PAGE_SIZEd map targets (or write an assertion into the zfs_kmap* macro to do so).
  2. Perform a large send/recv on the same system with heavily fragmented ZVOLs (seems to be what tripped mine) or run a bunch of import/export passes for a pool containing such ZVOLs.
  3. Observe violation

Include any warning/errors/backtraces from the system logs

root@svl-arch00 ~]# zpool import dpool -t tpool -R /mnt/dst -N
[  155.381372][ T2338] BUG: unable to handle page fault for address: ffffffffff53d000 (fixmap[slot 706 local idx #31 cpu 1]+0x0/0x1000)
[  155.382255][ T2338] #PF: supervisor write access in kernel mode
[  155.382668][ T2338] #PF: error_code(0x0002) - not-present page
[  155.383092][ T2338] PGD 0xffff888002898ff8 0000000002ad1063 
[  155.383092][ T2338] P4D 0xffff888002898ff8 0000000002ad1063 
[  155.383092][ T2338] PUD 0xffff888002ad1ff8 0000000002bd5063 
[  155.383092][ T2338] PMD 0xffff888002bd5fd0 8000000002de7063 
[  155.383092][ T2338] PTE 0xffff888002de79e8 0000000000000000 0xffff888000000000
[  155.384988][ T2338] Oops: 0002 [#1] PREEMPT SMP
[  155.385281][ T2338] CPU: 2 PID: 2338 Comm: zvol Tainted: G                TN  6.6.6 #1 49413849afb8da965cdd4351c9c93e35cb1207f4
[  155.386001][ T2338] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[  155.386636][ T2338] RIP: 0010:[<ffffffff824ebae9>] memcpy+0x9/0x20
[  155.387046][ T2338] Code: 00 c3 cc 0f 1f 40 00 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 48 b8 00 00 00 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 d1 <f3> a4 48 0f ba 2c 24 3f c3 cc 0f 1f 40 00 66 0f 1f 84 00 00 00 00
[  155.388274][ T2338] RSP: 0000:ffffc90002843c68 EFLAGS: 00010246
[  155.388659][ T2338] RAX: ffffffffff53c000 RBX: ffffc90002843dd8 RCX: 0000000000001000
[  155.389161][ T2338] RDX: 0000000000002000 RSI: ffff888160833000 RDI: ffffffffff53d000
[  155.389667][ T2338] RBP: ffffc90002843d40 R08: 0000000000000000 R09: ffff888160832000
[  155.390166][ T2338] R10: ffffc90002843d58 R11: ffff888162152728 R12: c0fa228c307599a0
[  155.390693][ T2338] R13: 0000000000002000 R14: 0000000000000000 R15: ffff8881591ca040
[  155.391194][ T2338] RAX: fixmap[slot 707 local idx #0 cpu 2]+0x0/0x1000
[  155.391626][ T2338] RBX: vmalloc[copy_process]+0x683/0x3880
[  155.391999][ T2338] RSI: zio_buf_comb_8192+0x1000/0x2000 [slab object]
[  155.392418][ T2338] RDI: fixmap[slot 706 local idx #31 cpu 1]+0x0/0x1000
[  155.392844][ T2338] RBP: vmalloc[copy_process]+0x683/0x3880
[  155.393200][ T2338] RSP: vmalloc[copy_process]+0x683/0x3880
[  155.393547][ T2338] R09: zio_buf_comb_8192+0x0/0x2000 [slab object]
[  155.393955][ T2338] R10: vmalloc[copy_process]+0x683/0x3880
[  155.394328][ T2338] R11: zio_cache+0x0/0x500 [slab object]
[  155.394687][ T2338] R15: autoslab_size_M_fops_P_block_fops_62_10_62_10_S_512_A_8_n_11+0x40/0x200 [slab object]
[  155.395331][ T2338] FS:  0000000000000000(0000) GS:ffff88823d400000(0000) knlGS:0000000000000000
[  155.395893][ T2338] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  155.396289][ T2338] CR2: ffffffffff53d000 CR3: 0000000002898004 CR4: 0000000000760ef0 shadow CR4: 0000000000760ef0
[  155.396948][ T2338] CR2: fixmap[slot 706 local idx #31 cpu 1]+0x0/0x1000
[  155.397365][ T2338] PKRU: 55555554
[  155.397583][ T2338] ASID: 0003
[  155.397786][ T2338] Stack:
[  155.397963][ T2338]  ffffffff81942409 3f05dd73b10c3321 3f05dd73b10c257c 0000000000000000
[  155.398476][ T2338]  0000000000000010 0000000000002000 0000000000000000 ffff888120eb91c0
[  155.398982][ T2338]  ffff888120eb9288 0000000000002000 0000000000000000 0000000000002000
[  155.399504][ T2338] Call Trace:
[  155.399711][ T2338]  <TASK>
[  155.400045][ T2338]  [<ffffffff81942409>] zfs_uiomove_bvec_impl+0x159/0x430 ffffc90002843c68
[  155.400704][ T2338]  [<ffffffff8179bcdc>] ? dmu_read_uio_dnode+0xdc/0x170 ffffc90002843d48
[  155.401340][ T2338]  [<ffffffff8179bcdc>] dmu_read_uio_dnode+0xdc/0x170 ffffc90002843d50
[  155.401950][ T2338]  [<ffffffff8197147d>] zvol_read+0x14d/0x550 ffffc90002843da8
[  155.402514][ T2338]  [<ffffffff819718a1>] zvol_read_task+0x21/0x80 ffffc90002843e40
[  155.403099][ T2338]  [<ffffffff81668e90>] taskq_thread+0x3c0/0x780 ffffc90002843e58
[  155.403689][ T2338]  [<ffffffff8121da30>] ? try_to_wake_up+0x580/0x580 ffffc90002843eb8
[  155.404295][ T2338]  [<ffffffff81971880>] ? zvol_read+0x550/0x550 ffffc90002843f30
[  155.404891][ T2338]  [<ffffffff812037cd>] kthread+0x17d/0x200 ffffc90002843f80
[  155.405479][ T2338]  [<ffffffff81668ad0>] ? taskq_thread_spawn+0xa0/0xa0 ffffc90002843f88
[  155.406094][ T2338]  [<ffffffff81203650>] ? kthread_complete_and_exit+0x40/0x40 ffffc90002843f90
[  155.406740][ T2338]  [<ffffffff8114b238>] ret_from_fork+0x98/0xc0 ffffc90002843fb8
[  155.407313][ T2338]  [<ffffffff81203650>] ? kthread_complete_and_exit+0x40/0x40 ffffc90002843fc0
[  155.407965][ T2338]  [<ffffffff8110737f>] ret_from_fork_asm+0x3d/0xbe ffffc90002843fe0
[  155.408864][ T2338]  </TASK>

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions