Commit cff2907
Replace CFS/EEVDF with compact O(1) tiny scheduler
Linux 7.0's kernel/sched/fair.c carries no #ifdef CONFIG_SMP guard.
On a UP NOMMU image the SMP load-balancer (select_task_rq_fair 1,484,
sched_balance_rq 1,460, update_sd_lb_stats 912,
sched_balance_find_*_group 1,262, _nohz_idle_balance 424,
can_migrate_task 324, active_load_balance_cpu_stop 316, ~7.8KB total)
gets pinned by the sched_class callback table: the table references
every callback, so --gc-sections cannot discard any of them. Add the
same kind of out-of-tree gate that patches 0012/0013 used for debug.c
and deadline.c, but for the whole class.
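The pinning effect can be illustrated with a miniature callback table. This is a userspace sketch, not kernel code: `mini_sched_class`, `mini_select_cpu`, and `mini_pick` are hypothetical names standing in for sched_class and its hooks. The point is only that storing a function's address in a live table counts as a reference, so section garbage collection has to keep the function even if nothing ever calls it on a uniprocessor build.

```c
#include <stddef.h>

/* Hypothetical miniature of a scheduling-class callback table. */
struct mini_sched_class {
    int (*select_cpu)(int prev_cpu); /* stands in for an SMP-only hook */
    int (*pick)(void);               /* stands in for a hook every build needs */
};

/* On a uniprocessor this hook has nothing to decide... */
static int mini_select_cpu(int prev_cpu)
{
    return prev_cpu;
}

static int mini_pick(void)
{
    return 42;
}

/*
 * ...but its address is stored in the table below. As long as the table
 * itself is reachable, the linker sees a reference to mini_select_cpu and
 * to everything mini_select_cpu calls, so none of it can be discarded.
 * Removing the code requires removing the reference at compile time.
 */
const struct mini_sched_class mini_fair_class = {
    .select_cpu = mini_select_cpu,
    .pick       = mini_pick,
};
```

A compile-time gate sidesteps this by never emitting the table entries, or the functions behind them, in the first place.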
CONFIG_SCHED_FAIR_TINY (default n) wraps the body of fair.c in #ifndef
and provides a three-priority O(1) class in the #else branch:
- per-CPU bitmap + per-priority FIFO (HIGH/NORMAL/LOW)
- O(1) pick: find_first_bit(active) + list_first_entry
- O(1) enqueue: list_add_tail + __set_bit
- O(1) dequeue: list_del_init + __clear_bit when queue empties
- cross-priority preemption at wakeup; round-robin within a
  priority via a fixed jiffies time-slice reset on set_next_task
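The bitmap-plus-FIFO structure in the list above can be modeled in userspace as follows. This is a minimal sketch, not the patch's code: every `tiny_*` name and the hand-rolled list helpers are assumptions made for illustration, `__builtin_ctzl` (a GCC/Clang builtin) stands in for find_first_bit, and the time-slice round-robin is not modeled.

```c
#include <stddef.h>

/* Illustrative names only; lower index means higher priority. */
enum tiny_prio { TINY_HIGH, TINY_NORMAL, TINY_LOW, TINY_NR_PRIO };

struct list_node { struct list_node *prev, *next; };

struct tiny_task {
    struct list_node node; /* plays the role of p->se.group_node */
    int id;
};

struct tiny_rq {
    unsigned long active;                 /* bit n set => queue[n] is non-empty */
    struct list_node queue[TINY_NR_PRIO]; /* one FIFO per priority */
};

static void list_init(struct list_node *n) { n->prev = n->next = n; }
static int list_is_empty(const struct list_node *h) { return h->next == h; }

static void tiny_rq_init(struct tiny_rq *rq)
{
    rq->active = 0;
    for (int i = 0; i < TINY_NR_PRIO; i++)
        list_init(&rq->queue[i]);
}

/* O(1) enqueue: append to the FIFO tail, then mark the priority active. */
static void tiny_enqueue(struct tiny_rq *rq, struct tiny_task *t, enum tiny_prio prio)
{
    struct list_node *head = &rq->queue[prio];

    t->node.prev = head->prev;
    t->node.next = head;
    head->prev->next = &t->node;
    head->prev = &t->node;
    rq->active |= 1UL << prio;
}

/* O(1) dequeue: unlink, and clear the bit only if the FIFO became empty. */
static void tiny_dequeue(struct tiny_rq *rq, struct tiny_task *t, enum tiny_prio prio)
{
    t->node.prev->next = t->node.next;
    t->node.next->prev = t->node.prev;
    list_init(&t->node);
    if (list_is_empty(&rq->queue[prio]))
        rq->active &= ~(1UL << prio);
}

/* O(1) pick: lowest set bit selects the queue, the FIFO head is the task. */
static struct tiny_task *tiny_pick(struct tiny_rq *rq)
{
    if (!rq->active)
        return NULL;
    int prio = __builtin_ctzl(rq->active);
    struct list_node *first = rq->queue[prio].next;
    return (struct tiny_task *)((char *)first - offsetof(struct tiny_task, node));
}
```

None of the three operations loops over tasks; the only loop is the one-time queue initialization.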
Priority is a pure function of nice value: nice<0 -> HIGH,
nice==0 -> NORMAL, nice>0 -> LOW; SCHED_IDLE collapses to LOW;
SCHED_BATCH uses nice normally. Tasks chain through the existing
&p->se.group_node (dead under !FAIR_GROUP_SCHED) so task_struct
stays unchanged. The bucket index is recomputed from p->static_prio
on every callback; core.c's dequeue-modify-enqueue protocol (verified
across all four static_prio mutation sites: sched_fork at 4650/4653,
syscalls.c set_user_nice at 84 for RT/DL and at 89 for fair via
scoped_guard(sched_change, ...)) keeps the value stable across the
removal/insertion bracket. RT preemption is unchanged: rt_sched_class
still preempts fair via the existing class chain walk in
pick_next_task_balance.
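The nice-to-bucket mapping described above can be written as a pure function of static_prio. This is a sketch under stated assumptions: `tiny_bucket_of`, the `BUCKET_*` names, and the `is_sched_idle` flag are hypothetical, and the only kernel fact relied on is that nice 0 corresponds to static_prio 120.

```c
/* Mirrors the kernel convention that nice 0 maps to static_prio 120. */
#define TINY_DEFAULT_PRIO 120

/* Illustrative bucket names; lower value means higher priority. */
enum tiny_bucket { BUCKET_HIGH = 0, BUCKET_NORMAL = 1, BUCKET_LOW = 2 };

/*
 * Recompute the bucket from static_prio on every call instead of caching
 * it in the task. is_sched_idle is a hypothetical stand-in for a
 * SCHED_IDLE policy check.
 */
static enum tiny_bucket tiny_bucket_of(int static_prio, int is_sched_idle)
{
    int nice = static_prio - TINY_DEFAULT_PRIO;

    if (is_sched_idle)
        return BUCKET_LOW;      /* SCHED_IDLE collapses to LOW */
    if (nice < 0)
        return BUCKET_HIGH;
    if (nice == 0)
        return BUCKET_NORMAL;
    return BUCKET_LOW;
}
```

Because the function keeps no state, it returns the same bucket at dequeue and at enqueue as long as static_prio only changes between the two, which is the property the dequeue-modify-enqueue protocol provides.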
Not the historical 2.6 O(1) scheduler -- no active/expired arrays,
no interactivity estimator (the gameable heuristic that motivated
CFS), no priority recalculation. Just the priority bitmap + FIFO
data structure that O(1) got right, without the policy machinery that
O(1) got wrong.
The #else branch re-exports every symbol other TUs depend on:
update_curr_common (rt.c / deadline-class stub / stop_task.c / ext.c
runtime accounting), init_cfs_rq, fair_server_init,
init_sched_fair_class, sched_init_granularity, update_max_interval,
init_entity_runnable_average, post_init_entity_util_avg,
sched_balance_trigger, nohz_balance_{enter,exit}_idle,
nohz_run_idle_balance, update_group_capacity, __setparam_fair,
arch_asym_cpu_priority, plus sysctl_sched_base_slice and
sysctl_sched_migration_cost storage. switched_to_fair and
prio_changed_fair return early when rq->donor->sched_class !=
&fair_sched_class, avoiding a spurious resched_curr when the runner is
RT (this mirrors mainline behavior; the kernel dispatches both hooks
unconditionally).
pelt.c is left untouched. With fair.c gated, its CFS-side entry
points lose their callers; rt.c keeps update_rt_rq_load_avg live.
The remaining PELT symbols are non-static so LTO largely cannot strip
them, but the cost is small (1.7KB).
Build wiring: build.sh adds the 0014 patch to the apply glob, sets
CONFIG_SCHED_FAIR_TINY=y in the inline kernel .config block, and
extends the post-olddefconfig verifier with a positive presence check.
Result: linux.axf 1,204,768 -> 1,188,352 bytes (-16,416 / -1.36%);
vmlinux .text 729,380 -> 713,412 (-15,968), .rodata -96, .init.text
-116, .bss -36, .data -32; kernel/sched/fair.c collapses from 16,782
bytes / 97 syms to ~1,160 bytes / 22 syms; pick_task_fair compiles to
24 bytes
of pure O(1) machine code (find_first_bit + list head deref +
container_of, no loops or rb-tree walks).
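The headline deltas can be rechecked with a few lines of arithmetic. The helper names below are hypothetical; the byte counts fed to them in the usage note are the ones quoted above.

```c
/* Signed size change in bytes; negative means the image shrank. */
static long size_delta(long before, long after)
{
    return after - before;
}

/* The same change as a percentage of the original size. */
static double size_delta_pct(long before, long after)
{
    return 100.0 * (double)(after - before) / (double)before;
}
```

Calling size_delta(1204768, 1188352) gives -16416 for linux.axf and size_delta(729380, 713412) gives -15968 for .text; size_delta_pct on the linux.axf pair comes out between -1.37 and -1.36, consistent with the rounded figure above.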
QEMU MPS2-AN386 boots clean to the BusyBox shell across three
back-to-back validate-qemu.sh runs against the full PGO workload
(17 fork/exec/wait sequences exercising hush spawn, cp /bin/busybox,
mv, ln, mkdir, rm, test pipelines).
1 parent 9000871 commit cff2907
2 files changed
Lines changed: 530 additions & 2 deletions