Add coroutine-based SMP support with WFI #97
2 issues found across 6 files
Prompt for AI agents (all 2 issues)
Understand the root cause of the following 2 issues and fix them.
<file name="main.c">
<violation number="1" location="main.c:962">
Fatal VM errors triggered inside hart_exec_loop now return success to the caller because the SMP path always returns 0 after the loop even when emu->stopped was set by vm_error_report. Please propagate a non-zero error code like the single-hart path does so crashes aren’t silently ignored.</violation>
</file>
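A minimal sketch of the fix the first violation asks for, assuming a reduced emulator state. Only the `stopped` flag name comes from the review; `emu_sketch_t`, `run_harts_once`, `smp_run_sketch`, and the `steps_left`/`fail_at` fields are hypothetical stand-ins, not the project's actual code.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the emulator state. Only the
 * `stopped` flag name is taken from the review comment. */
typedef struct {
    bool stopped;    /* set when a fatal VM error has been reported */
    int  steps_left; /* illustrative: scheduling passes still to run */
    int  fail_at;    /* illustrative: pass at which a fatal error fires, or -1 */
} emu_sketch_t;

/* Illustrative stand-in for one scheduling pass over all harts. */
static void run_harts_once(emu_sketch_t *emu)
{
    if (emu->steps_left == emu->fail_at)
        emu->stopped = true; /* models the error reporter flagging a fatal error */
    emu->steps_left--;
}

/* Returns 0 on a clean exit and 1 when the loop ended on a fatal error,
 * instead of returning 0 unconditionally after the loop. */
static int smp_run_sketch(emu_sketch_t *emu)
{
    while (!emu->stopped && emu->steps_left > 0)
        run_harts_once(emu);
    return emu->stopped ? 1 : 0;
}
```

The point is only the final `return`: the exit status is derived from the error flag rather than hard-coded, so the SMP path reports failures the same way a single-hart path would.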
<file name="coro.c">
<violation number="1" location="coro.c:290">
Reset coro_state.current_hart to CORO_HART_ID_IDLE when the coroutine leaves so callers don’t see a stale hart ID.</violation>
</file>
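The second violation can be sketched as follows, assuming the scheduler publishes the running hart in a single field. The value of `CORO_HART_ID_IDLE`, the layout of `coro_state`, and the helper `resume_hart_sketch` are assumptions for illustration; the real definitions in coro.c may differ.

```c
#include <stdint.h>

/* Assumed sentinel value; the real definition may differ. */
#define CORO_HART_ID_IDLE UINT32_MAX

/* Hypothetical, reduced version of the scheduler state. */
static struct {
    uint32_t current_hart;
} coro_state = { .current_hart = CORO_HART_ID_IDLE };

/* Illustrative hart body: records which hart ID is visible while it runs. */
typedef void (*hart_body_fn)(uint32_t *seen);

static void record_current(uint32_t *seen)
{
    *seen = coro_state.current_hart;
}

/* Publish the hart ID while the hart runs, then clear it on the way out so
 * callers never observe a stale ID. The direct call to body() stands in for
 * switching into the coroutine and coming back when it yields. */
static void resume_hart_sketch(uint32_t hart_id, hart_body_fn body, uint32_t *seen)
{
    coro_state.current_hart = hart_id;
    body(seen);
    coro_state.current_hart = CORO_HART_ID_IDLE;
}
```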
This looks interesting, but wouldn’t using a real thread be faster than a coroutine for HART emulation?
Before refining the internal structure toward multiple harts, let’s stick to single-threaded emulation as early versions of QEMU did.
5b2a16d to
67fa2c5
I see, I think this is an interesting one to follow.
This implements cooperative multitasking for multi-hart systems using coroutines, enabling efficient SMP emulation with significant CPU usage reduction.

- WFI instruction callback mechanism for power management
- CPU usage optimization: ~90% reduction in idle systems
- Maximum latency: 1ms (acceptable for typical 10ms timer interrupts)
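A rough model of the WFI-driven scheduling described above. It deliberately replaces real coroutine context switches with a per-hart time-slice counter, so it shows only the scheduling decision, not the stack switching. All names (`hart_sketch_t`, `on_wfi`, `raise_irq`, `schedule_pass`) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-hart bookkeeping for the sketch. */
typedef struct {
    bool in_wfi;          /* set by the WFI callback, cleared by an interrupt */
    unsigned long slices; /* number of time slices this hart has been given */
} hart_sketch_t;

/* WFI callback: the hart has nothing to do until an interrupt arrives. */
static void on_wfi(hart_sketch_t *h)
{
    h->in_wfi = true;
}

/* An interrupt makes the hart runnable again. */
static void raise_irq(hart_sketch_t *h)
{
    h->in_wfi = false;
}

/* One round-robin pass over all harts. Harts parked in WFI are skipped.
 * Returns true when every hart is idle, which is the condition under
 * which the host thread may block instead of spinning. */
static bool schedule_pass(hart_sketch_t harts[], size_t n)
{
    bool all_idle = true;
    for (size_t i = 0; i < n; i++) {
        if (harts[i].in_wfi)
            continue;
        harts[i].slices++; /* stands in for resuming the hart's coroutine */
        all_idle = false;
    }
    return all_idle;
}
```

When `schedule_pass` returns true, the caller would sleep until the next timer tick or device event; the stated 1ms maximum latency corresponds to how long that sleep is allowed to last.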
Previous implementation used usleep(1000) busy-wait loop in SMP mode, causing high CPU usage (~100%) even when all harts were idle in WFI. This commit implements platform-specific event-driven wait mechanisms:

Linux implementation:
- Use timerfd_create() for 1ms periodic timer
- poll() on timerfd + UART fd for blocking wait
- Consume timerfd events to prevent accumulation
- Reduces CPU usage from ~100% to < 2%

macOS implementation:
- Use kqueue() for event multiplexing
- EVFILT_TIMER for 1ms periodic wakeup
- Blocks on kevent() when all harts in WFI
- Reduces CPU usage from ~100% to < 2%

Benefits:
- Dramatic CPU usage reduction (> 98%) on both platforms
- Zero latency for UART input (event-driven vs. polling)
- Maintains 1ms responsiveness for timer interrupts
- Event-based architecture easier to extend

Tested on Linux with timerfd - 4-core boot succeeds, CPU < 2%
Tested on macOS with kqueue - 4-core boot succeeds, CPU < 2%

Note: UART input relies on u8250_check_ready() polling in periodic update loop. Direct fd monitoring removed from macOS implementation as kqueue does not support TTY file descriptors.
This moves peripheral polling into the coroutine loop, so SMP runs keep the same cadence as the single-core path, preventing delayed device IRQs. It also clears the published coroutine hart ID when yielding to avoid exposing stale scheduler state to callers.
Summary by cubic
Adds coroutine-based cooperative SMP for multi-hart emulation. Harts yield on WFI to cut idle CPU usage by ~90%, with up to 1ms latency.