Intermittent kernel panic in CapSet.InheritableBounds: sleep/reclaim path triggered from atomic context #1803

@fslongjin

Summary

Running DragonOS c_unitest for capability syscalls can intermittently panic during:

  • CapSet.InheritableBounds

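The panic is an assertion failure in schedule(): preempt_count is 1 where 0 is expected. The backtrace shows schedule() being entered from a blocking wait in PageManager::get, reached through the InnerAddressSpace drop path inside __schedule, i.e. from a memory/page-fault/reclaim path while in an atomic context.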

Reproduced at commit: 2f6e86f1

Test Context

c_unitest output (passes most of the time; the panic is intermittent):

[----------] 7 tests from CapSet
[ RUN      ] CapSet.EffectiveMustBeSubsetOfPermitted
[       OK ] CapSet.EffectiveMustBeSubsetOfPermitted (0 ms)
[ RUN      ] CapSet.VersionPaths
[       OK ] CapSet.VersionPaths (0 ms)
[ RUN      ] CapSet.InvalidVersionWithData
[       OK ] CapSet.InvalidVersionWithData (0 ms)
[ RUN      ] CapSet.NegativePid
[       OK ] CapSet.NegativePid (0 ms)
[ RUN      ] CapSet.NonCurrentPid
[       OK ] CapSet.NonCurrentPid (0 ms)
[ RUN      ] CapSet.PermittedNotIncrease
[       OK ] CapSet.PermittedNotIncrease (77 ms)
[ RUN      ] CapSet.InheritableBounds

Panic Details

[ ERROR ] (src/debug/panic/mod.rs:43)   Kernel Panic Occurred. raw_pid: 0
Location:
    File: src/sched/mod.rs
    Line: 850, Column: 5
Message:
    assertion `left == right` failed
      left: 1
     right: 0
Rust Panic Backtrace:
[1]  _Unwind_Backtrace
[2]  dragonos_kernel::debug::panic::hook::print_stack_trace
[3]  __rustc::rust_begin_unwind
[4]  core::panicking::panic_fmt
[5]  core::panicking::assert_failed_inner
[6]  core::panicking::assert_failed
[7]  dragonos_kernel::sched::schedule
[8]  dragonos_kernel::libs::wait_queue::block_current_impl
[9]  dragonos_kernel::libs::wait_queue::WaitQueue::wait_until_impl
[10] dragonos_kernel::mm::page::PageManager::get
[11] dragonos_kernel::mm::ucontext::LockedVMA::unmap
[12] <dragonos_kernel::mm::ucontext::InnerAddressSpace as core::ops::drop::Drop>::drop
[13] alloc::sync::Arc<T,A>::drop_slow
[14] dragonos_kernel::sched::__schedule
[15] x86_64_do_irq
[16] Restore_all

Expected Behavior

  • capset test cases should pass consistently.
  • No scheduler assertion (preempt_count mismatch).

Actual Behavior

  • Intermittent panic in/around CapSet.InheritableBounds.
  • schedule() is reached while preempt_count == 1.

Suspected Root Cause

The issue is likely not in the capset semantics themselves, but in context safety:

  1. In an atomic/irq-off or lock-held path, code drops the last reference of an Arc, which starts a drop chain.
  2. The drop chain enters a memory reclaim/fault path (AddressSpace::drop, then VMA::unmap, then PageManager::get).
  3. The reclaim/fault path attempts to block and calls schedule().
  4. schedule() asserts because the current context is non-preemptible (preempt_count != 0).

In short: a potentially sleeping release path is reached from atomic context.
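The chain above can be modelled in user space. The following is a minimal sketch, not DragonOS code: `PREEMPT_COUNT`, `PreemptGuard`, and the unit struct `AddressSpace` are hypothetical stand-ins, and the "sleep" is replaced by a counter that records when a potentially blocking drop runs with a non-zero preempt count.

```rust
use std::cell::Cell;
use std::sync::Arc;

thread_local! {
    // Hypothetical stand-in for the kernel's per-CPU preempt_count.
    static PREEMPT_COUNT: Cell<u32> = Cell::new(0);
    // Number of times a potentially sleeping drop ran in atomic context.
    static ATOMIC_SLEEP_HITS: Cell<u32> = Cell::new(0);
}

/// RAII guard modelling a preempt-disabled (atomic) section.
struct PreemptGuard;

impl PreemptGuard {
    fn new() -> Self {
        PREEMPT_COUNT.with(|c| c.set(c.get() + 1));
        PreemptGuard
    }
}

impl Drop for PreemptGuard {
    fn drop(&mut self) {
        PREEMPT_COUNT.with(|c| c.set(c.get() - 1));
    }
}

/// Stand-in for an address space whose teardown may block.
struct AddressSpace;

impl Drop for AddressSpace {
    fn drop(&mut self) {
        // In the kernel, this is where unmap -> PageManager::get -> wait
        // -> schedule() would run. Here we only record the violation.
        if PREEMPT_COUNT.with(|c| c.get()) != 0 {
            ATOMIC_SLEEP_HITS.with(|h| h.set(h.get() + 1));
        }
    }
}

/// Hazard: the last reference is released inside the atomic section.
fn drop_inside_atomic(last_ref: Arc<AddressSpace>) {
    let _guard = PreemptGuard::new();
    drop(last_ref); // AddressSpace::drop runs with PREEMPT_COUNT == 1
}

/// Safer ordering: the atomic section ends before the reference is released.
fn drop_after_atomic(last_ref: Arc<AddressSpace>) {
    {
        let _guard = PreemptGuard::new();
        // Atomic work that does not release `last_ref` goes here.
    }
    drop(last_ref); // AddressSpace::drop runs with PREEMPT_COUNT == 0
}

/// Returns how many atomic-context drops were observed on this thread.
fn atomic_sleep_hits() -> u32 {
    ATOMIC_SLEEP_HITS.with(|h| h.get())
}
```

Calling `drop_inside_atomic(Arc::new(AddressSpace))` increments the hit counter, while `drop_after_atomic` does not. The only difference between the two is where the final `Arc` reference is released relative to the guard's lifetime, which is the ordering problem the backtrace suggests.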

Why this is tricky

  • cap/cred objects are hot-path data and may be observed in trap/fault/scheduler-related paths.
  • Replacing the cap lock with sleeping primitives (Mutex/RwSem) may violate non-sleepable context constraints in some call chains.
  • The fix needs Linux-compatible semantics while preserving atomic-context safety.

Scope / Constraints

  • Must align behavior with Linux 6.6 semantics.
  • Must avoid workaround-style masking of panic; fix should remove atomic-context sleep/reclaim hazard at source.
  • Must not introduce regressions in scheduling/context-switch fast path.
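One possible direction that fits these constraints is to defer the final release: the atomic path only hands the reference over, and a sleepable context performs the actual teardown. The sketch below is an illustration under assumptions, not a proposed patch. `DeferredDrops` is a hypothetical type, `std::sync::Mutex` stands in for a kernel spinlock, and a real implementation would likely need a non-allocating list, since `Vec::push` may itself allocate.

```rust
use std::sync::{Arc, Mutex};

/// Stand-in for an object whose Drop may block (e.g. an address space).
struct AddressSpace;

/// Hypothetical deferred-release list shared between the atomic path
/// and a worker that runs in a sleepable context.
struct DeferredDrops {
    // A std Mutex stands in for a spinlock in this user-space sketch.
    pending: Mutex<Vec<Arc<AddressSpace>>>,
}

impl DeferredDrops {
    fn new() -> Self {
        DeferredDrops {
            pending: Mutex::new(Vec::new()),
        }
    }

    /// Called from the atomic path: moves the reference into the list.
    /// The object's Drop does not run here, even for the last reference.
    fn defer(&self, space: Arc<AddressSpace>) {
        self.pending.lock().unwrap().push(space);
    }

    /// Called from a sleepable context: takes the whole batch under the
    /// lock, then drops it after the lock has been released.
    fn drain(&self) -> usize {
        let batch = std::mem::take(&mut *self.pending.lock().unwrap());
        let released = batch.len();
        drop(batch); // potentially blocking teardown happens here
        released
    }
}
```

The design choice is that `defer` is cheap and never runs a destructor, so the scheduling fast path only pays for a list push, while `drain` can run wherever blocking is allowed.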

Reproduction Notes

  • Trigger by repeatedly running capability-related c_unitest suite, especially CapSet.InheritableBounds.
  • Panic is intermittent; stress/repeat loops increase hit probability.
