Single-Hart O(1) Enhancement #35
Previously, the scheduler performed a linear search through the global task list (kcb->tasks) to find the next TASK_READY task. This approach limited scalability as the search iterations increased with the number of tasks, resulting in higher scheduling latency.

To support an O(1) scheduler and improve extensibility, a sched_t structure is introduced and integrated into kcb. The new structure contains:
- ready_queues: Holds all runnable tasks, including TASK_RUNNING and TASK_READY. The scheduler selects tasks directly from these queues.
- ready_bitmap: Records the state of each ready queue. Using the bitmap, the scheduler can locate the highest-priority runnable task in O(1) time complexity.
- rr_cursors: Round-robin cursors that track the next task node in each ready queue. Each priority level maintains its own RR cursor. The top-priority cursor is assigned to kcb->task_current, which is advanced circularly after each scheduling cycle.
- hart_id: Identifies the scheduler instance per hart (0 for single-hart configurations).
- task_idle: The system idle task, executed when no runnable tasks exist.

In the current design, kcb binds only one sched_t instance (hart 0) for single-hart systems, but this structure can be extended for multi-hart scheduling in the future.
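A minimal sketch of what such a structure might look like, assuming placeholder list types, eight priority levels, and field names taken from this description (plus the queue_counts array referenced by the dequeue commit below); the actual types in the patch may differ:

```c
#include <stdint.h>

/* Placeholder list types; the kernel provides its own list implementation. */
typedef struct list_node { struct list_node *next; void *data; } list_node_t;
typedef struct { list_node_t *head, *tail; uint32_t length; } list_t;

#define TASK_PRIO_LEVELS 8 /* assumed number of priority levels */

typedef struct {
    list_t      *ready_queues[TASK_PRIO_LEVELS]; /* runnable tasks per priority */
    uint32_t     ready_bitmap;                   /* bit i set => level i has runnable tasks */
    list_node_t *rr_cursors[TASK_PRIO_LEVELS];   /* next node to run per priority */
    uint32_t     queue_counts[TASK_PRIO_LEVELS]; /* tasks held by each ready queue */
    uint32_t     hart_id;                        /* 0 for single-hart configurations */
    list_node_t *task_idle;                      /* idle task sentinel, never enqueued */
} sched_t;
```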
Previously, the list operation for removal was limited to list_remove(), which immediately freed the list node during the function call. When removing a running task (TASK_RUNNING), the list node in the ready queue must not be freed because kcb->task_current shares the same node. This change introduces list_unlink(), which detaches the node from the list without freeing it. The unlinked node is returned to the caller, allowing safe reuse and improving flexibility in dequeue operations. This API will be applied in sched_dequeue_task() for safely removing tasks from ready queues.
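A sketch of how list_unlink() could be implemented for a singly linked list, reusing the placeholder list types from the sketch above; the kernel's actual list layout and API may differ:

```c
/* Detach target from the list without freeing it; return the node to the
 * caller (or NULL if it is not on this list). */
static list_node_t *list_unlink(list_t *list, list_node_t *target)
{
    list_node_t *prev = NULL, *cur = list->head;

    while (cur && cur != target) {
        prev = cur;
        cur = cur->next;
    }
    if (!cur)
        return NULL;

    if (prev)
        prev->next = cur->next;   /* splice the node out of the chain */
    else
        list->head = cur->next;   /* target was the first node */
    if (list->tail == cur)
        list->tail = prev;

    cur->next = NULL;
    list->length--;
    return cur;                   /* caller owns the node; nothing is freed */
}
```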
When a task is enqueued into or dequeued from the ready queue, the bitmap that indicates the ready queue state must be updated. These three macros can be used in the mo_task_dequeue() and mo_task_enqueue() APIs to improve readability and maintain consistency.
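The commit does not show the macro bodies, but they plausibly reduce to single bit operations on ready_bitmap; the names below are assumptions:

```c
/* One bit per priority level: set on enqueue, clear when a queue drains,
 * test to see whether a level currently has runnable tasks. */
#define SCHED_BITMAP_SET(sched, prio)   ((sched)->ready_bitmap |=  (1U << (prio)))
#define SCHED_BITMAP_CLEAR(sched, prio) ((sched)->ready_bitmap &= ~(1U << (prio)))
#define SCHED_BITMAP_TEST(sched, prio)  (((sched)->ready_bitmap >> (prio)) & 1U)
```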
Previously, sched_enqueue_task() only changed the task state without inserting the task into the ready queue. As a result, the scheduler could not select the enqueued task for execution. This change pushes the task into the appropriate ready queue using list_pushback() and initializes related attributes such as the ready bitmap and RR cursor. The ready queue for the corresponding task priority is initialized on this enqueue path and never released afterward. With this updated API, tasks can be enqueued into the ready queue and selected by the cursor-based O(1) scheduler.
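A hedged sketch of that enqueue path, building on the earlier sketches; the tcb_t fields, list_create(), and list_pushback() returning the new node are assumptions, not the kernel's actual API:

```c
/* Placeholder TCB with only the fields used in these sketches. */
typedef struct { uint16_t id; uint8_t prio_level; } tcb_t;

static void sched_enqueue_task(sched_t *sched, tcb_t *task)
{
    uint8_t prio = task->prio_level;

    /* Lazily create the ready queue for this priority; it is never released. */
    if (!sched->ready_queues[prio])
        sched->ready_queues[prio] = list_create();

    list_node_t *node = list_pushback(sched->ready_queues[prio], task);

    /* First runnable task at this level: the RR cursor starts here. */
    if (!sched->rr_cursors[prio])
        sched->rr_cursors[prio] = node;

    sched->queue_counts[prio]++;
    SCHED_BITMAP_SET(sched, prio);
}
```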
Previously, mo_task_dequeue() was only a stub and returned immediately without performing any operation. As a result, tasks remained in the ready queue after being dequeued, leading to potential scheduler inconsistencies.

This change implements the full dequeue process:
- Searches for the task node in the ready queue by task ID.
- Maintains RR cursor consistency: the RR cursor must always point to a valid task node in the ready queue. When removing a task node, the cursor is advanced circularly to the next node.
- Unlinks the task node using list_unlink(), which removes the node from the ready queue without freeing it. list_unlink() is used instead of list_remove() to avoid accidentally freeing kcb->task_current when the currently running task is dequeued.
- Updates and checks queue_counts: if the ready queue becomes empty, the RR cursor is set to NULL and the bitmap bit is cleared until a new task is enqueued.
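A sketch of that dequeue flow; list_find_by_id() and list_next_circular() are hypothetical helpers shown only to make the steps concrete:

```c
/* Hypothetical helpers: look a node up by task ID, and step to the next
 * node, wrapping around to the head of the queue. */
static list_node_t *list_find_by_id(list_t *q, uint16_t id)
{
    for (list_node_t *n = q->head; n; n = n->next)
        if (((tcb_t *) n->data)->id == id)
            return n;
    return NULL;
}

static list_node_t *list_next_circular(list_t *q, list_node_t *n)
{
    return n->next ? n->next : q->head;
}

static list_node_t *sched_dequeue_task(sched_t *sched, tcb_t *task)
{
    uint8_t prio = task->prio_level;
    list_t *queue = sched->ready_queues[prio];
    list_node_t *node = queue ? list_find_by_id(queue, task->id) : NULL;
    if (!node)
        return NULL;

    /* Keep the cursor valid: advance it circularly past the victim node. */
    if (sched->rr_cursors[prio] == node)
        sched->rr_cursors[prio] = list_next_circular(queue, node);

    list_unlink(queue, node);            /* detach without freeing */

    if (--sched->queue_counts[prio] == 0) {
        sched->rr_cursors[prio] = NULL;  /* queue drained */
        SCHED_BITMAP_CLEAR(sched, prio);
    }
    return node;
}
```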
Previously, mo_task_spawn() only created a task and appended it to the global task list (kcb->tasks), assigning the first task directly from the global list node. This change adds a call to sched_enqueue_task() within the critical section to enqueue the task into the ready queue and safely initialize its scheduling attributes. The first task assignment is now aligned with the RR cursor mechanism to ensure consistency with the O(1) scheduler.
Previously, the scheduler iterated through the global task list (kcb->tasks) to find the next TASK_READY task, resulting in O(N) selection time. This approach limited scalability and caused inconsistent task rotation under heavy load.

The new scheduling process:
1. Check the ready bitmap and find the highest priority level.
2. Select the RR cursor node from the corresponding ready queue.
3. Advance the selected cursor node circularly.

Why an RR cursor instead of pop/enqueue rotation:
- Fewer operations on the ready queue: compared to the pop/enqueue approach, which requires two function calls per switch, the RR cursor method only advances one pointer per scheduling cycle.
- Cache friendly: always accesses the same cursor node, improving cache locality on hot paths.
- Cycle deterministic: the RR cursor design allows deterministic task rotation and enables potential future extensions such as cycle accounting or fairness-based algorithms.

This change introduces a fully O(1) scheduler design based on per-priority ready queues and round-robin (RR) cursors. Each ready queue maintains its own cursor, allowing the scheduler to select the next runnable task in constant time.
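A sketch of that three-step selection path; bitmap_lsb_index() stands in for the De Bruijn helper added later in this series, and the other names follow the earlier sketches:

```c
/* Pick the next task in O(1): bitmap lookup, read the RR cursor, advance it. */
static tcb_t *sched_select_next(sched_t *sched)
{
    /* No runnable task at any priority: fall back to the idle task. */
    if (sched->ready_bitmap == 0)
        return (tcb_t *) sched->task_idle->data;

    /* 1. Highest runnable priority = lowest set bit (bit 0 is most urgent). */
    uint8_t prio = bitmap_lsb_index(sched->ready_bitmap);

    /* 2. The RR cursor already points at the next task of that level. */
    list_node_t *node = sched->rr_cursors[prio];

    /* 3. Advance the cursor circularly for the following scheduling cycle. */
    sched->rr_cursors[prio] = list_next_circular(sched->ready_queues[prio], node);

    return (tcb_t *) node->data;
}
```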
Previously, mo_task_suspend() only changed the task state to TASK_SUSPENDED without removing the task from the ready queue. As a result, suspended tasks could still be selected by the scheduler, leading to incorrect task switching and inconsistent queue states.

This change adds a dequeue operation to remove the corresponding task node from its ready queue before marking it as suspended. Additionally, the condition used to detect the currently running task has been updated: the scheduler now compares the TCB pointer (kcb->task_current->data == task) instead of the list node (kcb->task_current == node), since kcb->task_current now stores a ready queue node rather than a global task list node.

If the suspended task is currently running, the CPU yields after the task is suspended to allow the scheduler to select the next runnable task. This ensures that suspended tasks are no longer visible to the scheduler until they are resumed.
Previously, mo_task_cancel() only removed the task node from the global task list (kcb->tasks) but did not remove it from the ready queue. As a result, the scheduler could still select a canceled task that remained in the ready queue. In addition, the node could be freed a second time after list_remove() had already freed it, leading to a double-free issue.

This change adds a call to sched_dequeue_task() to remove the task from the ready queue, ensuring that once a task is canceled, it no longer appears in the scheduler's selection path. This also prevents memory corruption caused by double-freeing list nodes.
Previously, mo_task_resume() only changed the resumed task's state to TASK_READY but did not enqueue it into the ready queue. As a result, the scheduler could not select the resumed task for execution. This change adds sched_enqueue_task() to insert the resumed task into the appropriate ready queue and update the ready bitmap, ensuring the resumed task becomes schedulable again.
Previously, mo_task_wakeup() only changed the task state to TASK_READY without enqueuing the task back into the ready queue. As a result, a woken-up task could remain invisible to the scheduler and never be selected for execution. This change adds a call to sched_enqueue_task() to insert the task into the appropriate ready queue based on its priority level. The ready bitmap, task counts of each ready queue, and RR cursor are updated accordingly to maintain scheduler consistency. With this update, tasks transitioned from a blocked or suspended state can be properly scheduled for execution once they are woken up.
This commit introduces a new API, sched_migrate_task(), which enables migration of a task between ready queues of different priority levels. The function safely removes the task from its current ready queue and enqueues it into the target queue, updating the corresponding RR cursor and ready bitmap to maintain scheduler consistency. This helper will be used in mo_task_priority() and other task management routines that adjust task priority dynamically.

Future improvement: the current enqueue path allocates a new list node for each insertion based on the task's TCB pointer. In the future, this can be optimized by directly transferring or reusing the existing list node between ready queues, eliminating the need for additional malloc() and free() operations during priority migration.
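A sketch of the migration helper in terms of the enqueue/dequeue sketches above; note the commit explicitly leaves node reuse as a future optimization, so this version re-allocates on enqueue:

```c
static void sched_migrate_task(sched_t *sched, tcb_t *task, uint8_t new_prio)
{
    /* Detach from the old-priority queue. The unlinked node may still be
     * referenced (e.g., as the current task), so it is not freed here. */
    sched_dequeue_task(sched, task);

    task->prio_level = new_prio;

    /* Enqueue allocates a fresh node in the new-priority queue and updates
     * the bitmap and RR cursor for that level. */
    sched_enqueue_task(sched, task);
}
```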
This change refactors the priority update process in mo_task_priority() to include early-return checks and proper task migration handling.

- Early-return conditions:
  * Prevent modification of the idle task.
  * Disallow assigning TASK_PRIO_IDLE to non-idle tasks. The idle task is created by idle_task_init() during system startup and must retain its fixed priority.
- Task migration: If the priority-changed task resides in a ready queue (TASK_READY or TASK_RUNNING), sched_migrate_task() is called to move it to the queue corresponding to the new priority.
- Running task behavior: When the current running task changes its own priority, it yields the CPU so the scheduler can dispatch the next highest-priority task.
This commit introduces the system idle task and its initialization API (idle_task_init()). The idle task serves as the default execution context when no other runnable tasks exist in the system. The sched_idle() function supports both preemptive and cooperative modes. In sched_t, a list node named task_idle is added to record the idle task sentinel. The idle task never enters any ready queue and its priority level cannot be changed. When idle_task_init() is called, the idle task is initialized as the first execution context. This eliminates the need for additional APIs in main() to set up the initial high-priority task during system launch. This design allows task priorities to be adjusted safely during app_main(), while keeping the scheduler’s entry point consistent.
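A minimal sketch of what such an idle loop could look like; CONFIG_PREEMPT and the yield call are assumptions for illustration, not the actual names used in the kernel:

```c
extern void task_yield(void); /* hypothetical cooperative yield */

/* Idle context: never enqueued, runs only when every ready queue is empty. */
static void sched_idle(void)
{
    while (1) {
#ifdef CONFIG_PREEMPT
        /* Preemptive mode: sleep until the next interrupt (RISC-V). */
        __asm__ volatile("wfi");
#else
        /* Cooperative mode: keep handing the CPU back to the scheduler. */
        task_yield();
#endif
    }
}
```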
When all ready queues are empty, the scheduler should switch to idle mode and wait for incoming interrupts. This commit introduces a dedicated helper to handle that transition, centralizing the logic and improving readability of the scheduler's path to idle.
Previously, when all ready queues were empty, the scheduler would trigger a kernel panic. This condition should instead transition into the idle task rather than panic. The new sched_switch_to_idle() helper centralizes this logic, making the path to idle clearer and more readable.
The idle task is now initialized in main() during system startup. This ensures that the scheduler always has a valid execution context before any user or application tasks are created. Initializing the idle task early guarantees a safe fallback path when no runnable tasks exist and keeps the scheduler entry point consistent.
This change sets up the scheduler state during system startup by assigning kcb->task_current to kcb->harts->task_idle and dispatching to the idle task as the first execution context. This commit also keeps the scheduling entry path consistent between startup and runtime.
Previously, both mo_task_spawn() and idle_task_init() implicitly bound their created tasks to kcb->task_current as the first execution context. This behavior caused ambiguity with the scheduler, which is now responsible for determining the active task during system startup. This change removes the initial binding logic from both functions, allowing the startup process (main()) to explicitly assign kcb->task_current (typically to the idle task) during launch. This ensures a single, centralized initialization flow and improves the separation between task creation and scheduling control.
Prepare for O(1) bitmap index lookup by adding a 32-entry De Bruijn sequence table. The table will be used in later commits to replace iterative bit scanning. No functional change in this patch.
Implement the helper function that uses a De Bruijn multiply-and-LUT approach to compute the index of the least-significant set bit in O(1) time complexity. This helper is not yet wired into the scheduler logic; integration will follow in a later commit. No functional change in this patch.
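For reference, the widely used 32-bit De Bruijn LSB lookup has the following shape; the constant and table below are the standard 0x077CB531 variant, which may differ in ordering from the table added in this series:

```c
#include <stdint.h>

static const uint8_t debruijn_lsb_lut[32] = {
    0,  1,  28, 2,  29, 14, 24, 3,
    30, 22, 20, 15, 25, 17, 4,  8,
    31, 27, 13, 23, 21, 19, 16, 7,
    26, 12, 18, 6,  11, 5,  10, 9,
};

/* Index of the least-significant set bit of v; the caller must ensure v != 0. */
static inline uint8_t bitmap_lsb_index(uint32_t v)
{
    /* Isolate the lowest set bit, multiply by the De Bruijn constant, and
     * use the top 5 bits of the product as the table index. */
    return debruijn_lsb_lut[((v & -v) * 0x077CB531U) >> 27];
}
```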
Replace the iterative bitmap scanning with the De Bruijn multiply+LUT method via the new helper. This change makes top-priority selection constant-time and deterministic.
Previously, _sched_block() only enqueued the task into the wait queue and set its state to TASK_BLOCKED. In the new scheduler design (ready-queue–based), a blocked task must also be removed from its priority's ready queue to prevent it from being selected by the scheduler. This change adds the missing dequeue path for the corresponding ready queue, ensuring behavior consistency.
Previously, sched_wakeup_task() was limited to internal use within the scheduler module. This change makes it globally visible so that it can be reused in semaphore.c for task wake-up operations.
Previously, mo_sem_signal() only changed the awakened task state to TASK_READY when a semaphore signal was triggered. In the new scheduler design, which selects runnable tasks from ready queues, the awakened task must also be enqueued for scheduling. This change invokes sched_wakeup_task() to perform the enqueue operation, ensuring the awakened task is properly inserted into the ready queue.
Previously, mo_task_delay() only set TASK_BLOCKED and updated delayed ticks. In the new ready-queue-based scheduler, delayed tasks must also be removed from the ready queue. This change calls sched_dequeue_task() in mo_task_delay() so that the task is properly dequeued from its priority ready queue when it is delayed.
New O(1) time complexity scheduler
This PR introduces a new O(1) priority-based scheduler that replaces the original O(n) round-robin scheduler. The previous design scanned the global task list linearly to select the next runnable task (TASK_READY), which became a bottleneck as the number of tasks increased and did not support priority-based task selection.
Changes
The following diagrams illustrate the differences between the original and new schedulers in this PR.
Original scheduler
As shown in the figure, the original scheduler selected a new task based on its state. Once a runnable task was found, it was assigned to task_current and context switched.

This linear search introduces a significant performance issue in the scheduler, especially when the number of runnable tasks increases. The original scheduler iterates over the task list circularly, but because it cannot guarantee that all tasks are visited safely, the iteration count is capped with an artificial limit (IMAX = 500).

New scheduler design in this PR
The new scheduler introduces a sched_t structure that provides constant-time (O(1)) tracking and selection of runnable tasks. The main components are:

Bitmap (bitmap)
A compact bitmask where each bit (0–7) represents one priority level (from bit 0, critical, to bit 7, idle). A bit is set when at least one task of the corresponding priority is runnable. This enables O(1) identification of the highest runnable priority via a De Bruijn–based least-significant-bit (LSB) helper.

Ready queues (ready_queue[])
An array of per-priority linked lists. Each ready queue contains only the runnable tasks of its priority level. Blocked, suspended, or delayed tasks are removed from this list; waking up or resuming a task re-enqueues it.

Round-robin cursor (rr_cursor[])
For each priority level, an RR cursor tracks the next task in the corresponding ready queue for round-robin scheduling among tasks of the same priority.
New task selection logic in this PR
The new scheduler selects the next task in three main steps:
Find the highest runnable priority from the bitmap
The scheduler uses a De Bruijn–based helper on the bitmap to obtain the index of the highest-priority runnable level in O(1) time.

Pick the next task via the round-robin cursor
For that priority level, the scheduler reads the corresponding rr_cursor, which points to the next runnable task in the ready queue, and assigns it to task_current.

Advance the round-robin cursor
After selecting the task, the rr_cursor is advanced to the next node in the ready queue (wrapping around when reaching the end), preserving round-robin scheduling among tasks of the same priority.

With this design, the scheduler no longer scans the entire task list. Instead, it uses the bitmap plus per-priority cursors to achieve deterministic O(1) task selection while still providing fairness within each priority level.
Features
The new scheduler includes the following features:
O(1) priority-based scheduler
Strict priority scheduling policy
Default idle task (sysmem)
Implementation detail
This PR introduces an O(1) scheduler by refactoring the internal scheduling logic and reorganizing task management around a new data structure, sched_t. The sched_t instance contains three key components: a bitmap that tracks which priority levels contain runnable tasks, an array of per-priority ready queues, and round-robin cursors used to determine the next task within each priority level. All enqueue and dequeue operations now funnel through unified helpers (sched_enqueue_task() and sched_dequeue_task()), ensuring consistent updates to both the bitmap and ready queues. Task state transitions were updated accordingly: when a task becomes blocked, suspended, delayed, or cancelled, it is removed from its ready queue; when it becomes runnable again, it is reinserted into the appropriate queue. The scheduler's main selection function now uses the bitmap and a De Bruijn–based LSB helper to perform constant-time priority lookup, then reads and advances the per-priority round-robin cursor to select the next task. The idle task is handled specially: it is not placed in any ready queue and is selected only when all bitmap bits are clear. No changes were made to the global task list structure or the task state model; only the scheduling backend has been redesigned to provide deterministic behavior and strict priority semantics.

Task state transition
The task state transitions are the same as in the original scheduler; only the ready-queue dequeue/enqueue path is added in the new scheduler.

All tasks in the TASK_READY and TASK_RUNNING states reside in the ready queue of the corresponding task priority. Leaving or entering these states dequeues the task from, or enqueues it into, the ready queue.

Validation
1. Backward compatible
All applications under the app/ directory have been executed and verified to run correctly without modification. No functional regressions were observed when switching from the original scheduler to the new O(1) scheduler.

2. Unit test
This unit test focuses on verifying the consistency of the bitmap and the per-priority task count tracking maintained in sched_t during task state transitions and priority changes.

Approach
- A controller task runs at TASK_PRIO_CRIT to orchestrate the entire test process and ensure deterministic sequencing.

Task types
- Test tasks that transition into TASK_BLOCKED through mo_task_delay(), allowing verification of dequeue behavior and ready-queue updates.

Verified state points
The bitmap and task count will be verified after the following actions.
Normal task state transitions
- Task creation (TASK_READY).
- Task suspension (TASK_READY → TASK_SUSPEND).
- Task resumption (TASK_SUSPEND → TASK_READY).
- Task cancellation (TASK_READY → TASK_CANCELLED), ensuring the task is removed from ready queues and no bitmap bits remain set.

Blocked task behavior
- The test task is created and becomes runnable (TASK_READY).
- The task then calls mo_task_delay(), transitions into TASK_BLOCKED, and the controller resumes execution. The test verifies that the blocked task is fully removed from the ready queue and its priority bit is cleared in the bitmap.

Expected results
All state transitions maintain consistent bitmap states, correct ready-queue membership, and accurate per-priority task count tracking. No unexpected runnable tasks appear, and no ready-queue entries persist after a task transitions to BLOCKED, SUSPENDED, or CANCELLED.
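As an illustration only, the kind of invariant checked after each transition might look like the following; the accessors mirror the earlier sched_t sketch rather than the real test code:

```c
#include <assert.h>

/* Bitmap bits, queue counts, and RR cursors must always agree. */
static void check_sched_consistency(const sched_t *sched)
{
    for (int prio = 0; prio < TASK_PRIO_LEVELS; prio++) {
        int runnable = sched->queue_counts[prio] > 0;

        /* A set bitmap bit means a non-empty ready queue, and vice versa. */
        assert(((sched->ready_bitmap >> prio) & 1U) == (unsigned) runnable);

        /* A non-empty queue has a valid cursor; an empty one has none. */
        assert(runnable == (sched->rr_cursors[prio] != NULL));
    }
}
```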
Test result
Note
- TASK_CANCELLED in this document is used only for explanation. It is not an actual state in the task state machine, but represents the condition where a task has been removed from all scheduling structures and no longer exists in the system.
- State annotations such as (TASK_READY) refer to the state of the test tasks being created or manipulated, not the state of the controller task.

Implementation reference
3. Benchmark
The benchmark compares the original O(n) scheduler with the new O(1) scheduler under multiple task-load scenarios. Each scenario measures the average scheduling latency observed in QEMU using the existing benchmarking framework.
Test suites
Benchmark methodology
Scenarios
The benchmark covers the following scenarios:
Test results
Future work
Notes
The draft PR #23 has been closed.