Commit cef72c7
Kernel: Use word-sized writes in the generic memset implementation
This decreases the boot time on my x86-64 host system from 15 s to 13 s
for AArch64 QEMU TCG and from 33 s to 30 s for RISC-V QEMU TCG!
I additionally measured the performance of this new implementation
with this simple benchmark:
https://gist.github.com/spholz/b06ea737b435ecc181069cf0d911faa4
Based on this benchmark, an unroll level of 8 seems like a good choice
for all tested systems.
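
The technique named in the commit title can be sketched as follows. This is not the code from this commit (the diff body is not reproduced here); it is a minimal hosted C++ illustration under stated assumptions: the name `sketch_memset` is hypothetical, `std::uintptr_t` stands in for the machine word, and `std::memcpy` is used for the word stores to stay within standard aliasing rules, whereas a real freestanding kernel would have to store words directly.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical illustration, not the kernel's actual memset.
void* sketch_memset(void* dest, int c, std::size_t n)
{
    auto* p = static_cast<unsigned char*>(dest);
    auto const byte = static_cast<unsigned char>(c);

    constexpr std::size_t word_size = sizeof(std::uintptr_t);
    constexpr std::size_t unroll = 8; // unroll level suggested by the benchmark

    // Head: write single bytes until the pointer is word-aligned.
    while (n > 0 && reinterpret_cast<std::uintptr_t>(p) % word_size != 0) {
        *p++ = byte;
        --n;
    }

    // Broadcast the fill byte into every byte of a word:
    // 0x0101...01 * byte == byte repeated word_size times.
    constexpr std::uintptr_t ones = ~std::uintptr_t { 0 } / 0xFF;
    std::uintptr_t const word = ones * byte;

    // Main loop: `unroll` word-sized stores per iteration. The inner loop
    // has a constant trip count, so the compiler can unroll it fully.
    while (n >= unroll * word_size) {
        for (std::size_t i = 0; i < unroll; ++i)
            std::memcpy(p + i * word_size, &word, word_size);
        p += unroll * word_size;
        n -= unroll * word_size;
    }

    // Remaining whole words, one at a time.
    while (n >= word_size) {
        std::memcpy(p, &word, word_size);
        p += word_size;
        n -= word_size;
    }

    // Tail: remaining bytes.
    while (n > 0) {
        *p++ = byte;
        --n;
    }

    return dest;
}
```

Splitting the work into an alignment head, an unrolled word loop, and a byte tail keeps every word store aligned, which matters on targets where unaligned stores are slow or trap.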
Here are the speedups for n=0x10000:
- Raspberry Pi 5: 7.6 ( 81984 ns -> 10748 ns)
- Raspberry Pi 4: 3.3 (131197 ns -> 39704 ns)
- StarFive VisionFive 2: 5.5 (279107 ns -> 50650 ns)
- AArch64 QEMU TCG: 6.8 (374287 ns -> 54847 ns)
- RISC-V QEMU TCG: 6.7 (354195 ns -> 52615 ns)
- x86-64 QEMU KVM: 3.8 ( 32443 ns -> 8542 ns)

1 parent 79f06a1 commit cef72c7
1 file changed, +30 -3 lines changed