Commit 0a825f2
Add kernel syscall-prune build cycle

The existing PGO cycle bundles config-only, syscall-prune, and
layout-ordering candidates into a single rebuild loop. When all
we want is the syscall-prune verdict (e.g. tightening the prune
table after a userspace change), paying for the layout-ordering
detour is wasteful, and the QEMU trace + analyze step gets re-run
on every invocation.

build.sh grows kernel_syscall_prune_cycle alongside the existing
kernel_pgo_cycle: same primitives (build_candidate_kernel,
boot_not_regressed, restore_kernel_artifacts), but only baseline /
config-only / syscall-prune candidates, defaulting QEMU_LOG to
exec,cpu,in_asm so the R7 dumps needed for syscall extraction are
present. The stage selects the smallest linux.axf that does not
regress shell_ready_ms, mirroring the existing cycle's selection
rule.
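The selection rule can be sketched in a few lines of Python. This is an illustration only, not the build.sh code: Candidate, select_candidate, and tolerance_ms are hypothetical names, and the sketch assumes that "does not regress" means shell_ready_ms is no worse than the baseline plus an optional tolerance, with the baseline itself always eligible.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Candidate:
    name: str              # e.g. "baseline", "config-only", "syscall-prune"
    axf_bytes: int         # size of the candidate's linux.axf
    shell_ready_ms: float  # measured boot-to-shell time


def select_candidate(baseline: Candidate,
                     candidates: Iterable[Candidate],
                     tolerance_ms: float = 0.0) -> Candidate:
    """Pick the smallest image whose boot time is no worse than baseline.

    Assumption: the baseline is always eligible, so a cycle in which every
    candidate regresses falls back to the baseline image.
    """
    limit = baseline.shell_ready_ms + tolerance_ms
    eligible = [baseline]
    eligible += [c for c in candidates if c.shell_ready_ms <= limit]
    return min(eligible, key=lambda c: c.axf_bytes)


baseline = Candidate("baseline", 1_000_000, 500.0)
config_only = Candidate("config-only", 900_000, 490.0)
syscall_prune = Candidate("syscall-prune", 850_000, 520.0)  # smaller, slower
chosen = select_candidate(baseline, [config_only, syscall_prune])
```

In this made-up data the syscall-prune image is the smallest but boots slower than the baseline, so the config-only image is chosen.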
prepare_kernel_profile_analysis extracts the trace-collect+analyze
pair into a shared helper keyed on (IMAGE_FP, sha256(workload),
trace selector). Both cycles share the namespace but resolve to
distinct cache directories via the trace tag, so PGO's exec,in_asm
trace and the new cycle's exec,cpu,in_asm trace do not collide.
IMAGE_FP already covers scripts/, configs/, patches/, and build.sh,
so any change that affects the baseline binary or the analysis
tooling invalidates the cache.
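A minimal sketch of how such a key could map to a cache directory. The directory layout, the "kernel-profile" namespace name, and profile_cache_dir are assumptions for illustration; the real helper lives in build.sh and may tag traces differently.

```python
import hashlib
from pathlib import Path


def profile_cache_dir(cache_root: str,
                      image_fp: str,
                      workload: bytes,
                      trace_selector: str) -> Path:
    """Map (IMAGE_FP, sha256(workload), trace selector) to a directory.

    Hypothetical layout: one shared namespace, then one path component per
    key element. The trace selector is turned into a filesystem-friendly tag.
    """
    workload_sha = hashlib.sha256(workload).hexdigest()
    trace_tag = trace_selector.replace(",", "-")
    return Path(cache_root) / "kernel-profile" / image_fp / workload_sha / trace_tag


workload = b"boot to shell, run smoke test\n"  # stand-in workload contents
pgo_dir = profile_cache_dir("cache", "abc123", workload, "exec,in_asm")
prune_dir = profile_cache_dir("cache", "abc123", workload, "exec,cpu,in_asm")
```

With the same IMAGE_FP and workload, the two selectors differ only in the last path component, which is what keeps the PGO trace and the syscall-prune trace apart.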
materialize_cache_tree uses cp -f rather than hardlinks:
link_cached_tree exposes cache files via symlinks in working dirs,
and a hardlink would let any future writer that follows that
symlink mutate the cache through a shared inode. cp -f makes the
cache the canonical copy.
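The shared-inode hazard is easy to reproduce in isolation. The sketch below is not the build.sh code: the file names and the exact topology (a working-tree symlink pointing at a file materialized from the cache) are assumptions, and shutil.copyfile stands in for cp -f.

```python
import os
import shutil
import tempfile
from pathlib import Path


def cache_survives_write(materialize) -> bool:
    """Return True if a write through a working-tree symlink leaves the
    cache entry untouched.

    `materialize(src, dst)` creates dst from the cache entry src, either as
    a hardlink (os.link) or as an independent copy (shutil.copyfile).
    """
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        cached = root / "cache-entry.txt"
        materialized = root / "materialized.txt"
        work_link = root / "work-link.txt"

        cached.write_text("cached analysis\n")
        materialize(cached, materialized)
        os.symlink(materialized, work_link)

        # open() follows the symlink, so this appends to the materialized
        # file in place, and to every other name sharing its inode.
        with open(work_link, "a") as fh:
            fh.write("stray write\n")

        return cached.read_text() == "cached analysis\n"


hardlink_safe = cache_survives_write(os.link)
copy_safe = cache_survives_write(shutil.copyfile)
```

With os.link the stray append shows up in the cache entry because both names refer to one inode; with a copy the cache entry keeps its original contents.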
scripts/qemu-trace-to-orderfile.py rewrites the syscall pairing
state machine to match QEMU's actual -d cpu ordering: register
dumps precede the TB whose entry state they capture. The previous
code kept a backward-binding fallback that, when a TB without a
preceding regdump showed up (orphan TB), bound the next R07 to the
orphan instead of holding it for the upcoming TB. In a real trace
this silently misattributes or drops syscalls and feeds a wrong
prune table. pending_tb is dropped entirely; an R07 always binds
forward to the next TB or stays pending until one arrives.
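The forward-binding rule can be sketched on an abstract event stream. This is an illustration, not the code in qemu-trace-to-orderfile.py: the ("r7", value) / ("tb", pc) event tuples and bind_forward are hypothetical, real log parsing is omitted, and the sketch assumes that a newer R07 dump replaces an older one that is still pending.

```python
from typing import Iterable, Iterator, Optional, Tuple

Event = Tuple[str, int]  # ("r7", register value) or ("tb", TB start pc)


def bind_forward(events: Iterable[Event]) -> Iterator[Tuple[int, Optional[int]]]:
    """Yield (tb_pc, r7) pairs, binding each R07 dump to the NEXT TB only.

    A TB with no preceding dump (an orphan TB) is yielded with r7=None and
    never claims a dump that arrives after it.
    """
    pending_r7: Optional[int] = None
    for kind, value in events:
        if kind == "r7":
            pending_r7 = value          # hold until the next TB arrives
        elif kind == "tb":
            yield value, pending_r7     # entry state for this TB, if any
            pending_r7 = None
        else:
            raise ValueError(f"unknown event kind: {kind!r}")


# Orphan TB at 0x100, then a dump with R07=4 that belongs to the TB at 0x200.
trace = [("tb", 0x100), ("r7", 4), ("tb", 0x200)]
pairs = list(bind_forward(trace))
```

Here the orphan TB at 0x100 gets no R07 value and the dump is attributed to 0x200. A backward-binding fallback would have attached R07=4 to 0x100 instead.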
scripts/test_qemu_trace_to_orderfile.py covers the happy path, the
orphan-TB regression (which fails under the old backward-binding),
and the MAX_ARM_SYSCALL boundary at 511/512.

Three Python script invocations gain an explicit python3 prefix.
generate-syscall-prune-table.py is committed as 100644, so calling
it via the bare path raised Permission denied; the other two are
changed defensively, for consistency.

1 parent 1edb897 commit 0a825f2
5 files changed
Lines changed: 429 additions & 35 deletions