@@ -12,26 +12,26 @@ sudo apt install linux-tools-common linux-tools-generic
sudo sysctl kernel.perf_event_paranoid=1
scala-cli --power package --assembly example01.sc --force -o example01.jar
ls -l ./example01.jar
- perf stat -r 100 ./example01.jar > /dev/null
+ perf stat -r 100 java --sun-misc-unsafe-memory-access=allow -jar ./example01.jar > /dev/null
```
Expected output:
``` text
Performance counter stats for './example01.jar' (100 runs):

- 202.75 msec task-clock # 2.286 CPUs utilized ( +- 0.27 % )
- 2,191 context-switches # 10.806 K/sec ( +- 1.53 % )
- 39 cpu-migrations # 192.356 /sec ( +- 2.49 % )
- 24,775 page-faults # 122.195 K/sec ( +- 0.27 % )
- 303,999,880 cpu_atom/cycles / # 1.499 GHz ( +- 3.51% ) (71.71% )
- 1,005,678,639 cpu_core/cycles / # 4.960 GHz ( +- 0.44 % ) (77.24 %)
- 431,508,692 cpu_atom/instructions / # 1.42 insn per cycle ( +- 3.69% ) (71.71% )
- 1,684,651,998 cpu_core/instructions / # 5.54 insn per cycle ( +- 0.47 % ) (77.24 %)
- 83,740,906 cpu_atom/branches/ # 413.027 M /sec ( +- 3.78% ) (71.71% )
- 335,230,949 cpu_core/branches/ # 1.653 G /sec ( +- 0.49 % ) (77.24 %)
- 2,017,405 cpu_atom/branch-misses/ # 2.41 % of all branches ( +- 4.20% ) (71.71% )
- 9,450,892 cpu_core/branch-misses/ # 11.29 % of all branches ( +- 0.47 % ) (77.24 %)
+ 205.66 msec task-clock # 2.215 CPUs utilized ( +- 0.30 % )
+ 1,624 context-switches # 7.897 K/sec ( +- 0.83 % )
+ 17 cpu-migrations # 82.662 /sec ( +- 1.67 % )
+ 23,158 page-faults # 112.605 K/sec ( +- 0.09 % )
+ 1,195,984,140 cpu_atom/instructions / # 1.57 insn per cycle ( +- 1.24% )
+ 906,422,673 cpu_core/instructions / # 1.77 insn per cycle ( +- 2.94 % ) (39.78 %)
+ 762,329,450 cpu_atom/cycles / # 3.707 GHz ( +- 1.34% )
+ 513,382,841 cpu_core/cycles / # 2.496 GHz ( +- 3.75 % ) (39.78 %)
+ 237,678,684 cpu_atom/branches/ # 1.156 G /sec ( +- 1.24% )
+ 174,316,667 cpu_core/branches/ # 847.610 M /sec ( +- 3.18 % ) (39.78 %)
+ 6,725,338 cpu_atom/branch-misses/ # 2.83 % of all branches ( +- 1.78% )
+ 5,919,295 cpu_core/branch-misses/ # 3.40 % of all branches ( +- 4.63 % ) (39.78 %)

- 0.088700 +- 0.000135 seconds time elapsed ( +- 0.15 % )
+ 0.092830 +- 0.000511 seconds time elapsed ( +- 0.55 % )
```

### Build Scala JS output, print its size, and measure its start up time with ` node ` (tested with node.js 22)
@@ -49,20 +49,20 @@ Expected output:
``` text
Performance counter stats for 'node ./example01.js' (100 runs):

- 14.90 msec task-clock # 1.014 CPUs utilized ( +- 0.30 % )
- 20 context-switches # 1.343 K/sec ( +- 1.60 % )
- 2 cpu-migrations # 134.254 /sec ( +- 5.08 % )
- 2,748 page-faults # 184.466 K/sec ( +- 0.00 % )
- 45,371,808 cpu_atom/cycles / # 3.046 GHz ( +- 2.61% ) (20.12% )
- 75,772,761 cpu_core/cycles / # 5.086 GHz ( +- 0.32 % ) (85.22 %)
- 97,892,289 cpu_atom/instructions / # 2.16 insn per cycle ( +- 3.38% ) (20.12% )
- 165,775,873 cpu_core/instructions / # 3.65 insn per cycle ( +- 0.29 % ) (85.22 %)
- 16,573,542 cpu_atom/branches/ # 1.113 G/sec ( +- 3.35% ) (20.12% )
- 28,245,004 cpu_core/branches/ # 1.896 G/sec ( +- 0.27 % ) (85.22 %)
- 97,278 cpu_atom/branch-misses/ # 0.59 % of all branches ( +- 11.08% ) (20.12% )
- 573,362 cpu_core/branch-misses/ # 3.46% of all branches ( +- 0.72 % ) (85.22 %)
+ 17.64 msec task-clock # 0.994 CPUs utilized ( +- 0.14 % )
+ 18 context-switches # 1.021 K/sec ( +- 1.43 % )
+ 2 cpu-migrations # 113.401 /sec ( +- 2.30 % )
+ 2,760 page-faults # 156.494 K/sec ( +- 0.01 % )
+ 170,140,863 cpu_atom/instructions / # 2.12 insn per cycle ( +- 0.01% )
+ <not counted> cpu_core/instructions / ( +- 26.12 % ) (0.00 %)
+ 80,128,642 cpu_atom/cycles / # 4.543 GHz ( +- 0.14% )
+ <not counted> cpu_core/cycles / ( +- 26.44 % ) (0.00 %)
+ 28,956,452 cpu_atom/branches/ # 1.642 G/sec ( +- 0.01% )
+ <not counted> cpu_core/branches/ ( +- 25.84 % ) (0.00 %)
+ 560,195 cpu_atom/branch-misses/ # 1.93 % of all branches ( +- 0.04% )
+ <not counted> cpu_core/branch-misses/ ( +- 24.67 % ) (0.00 %)

- 0.0146864 +- 0.0000435 seconds time elapsed ( +- 0.30 % )
+ 0.0177346 +- 0.0000278 seconds time elapsed ( +- 0.16 % )
```

### Build Scala JS Wasm output, print its size, and measure its start up time with ` node ` (tested with node.js 22)
@@ -81,20 +81,20 @@ Expected output:
``` text
Performance counter stats for 'node --experimental-wasm-exnref ./example01.js/main.js' (100 runs):

- 23.36 msec task-clock # 1.036 CPUs utilized ( +- 0.23 % )
- 61 context-switches # 2.611 K/sec ( +- 0.47 % )
- 3 cpu-migrations # 128.413 /sec ( +- 4.53 % )
- 3,156 page-faults # 135.090 K/sec ( +- 0.01% )
- 81,924,942 cpu_atom/cycles / # 3.507 GHz ( +- 0.93% ) (10.81% )
- 115,425,581 cpu_core/cycles / # 4.941 GHz ( +- 0.30% )
- 151,891,461 cpu_atom/instructions / # 1.85 insn per cycle ( +- 1.65% ) (10.81% )
- 228,699,784 cpu_core/instructions / # 2.79 insn per cycle ( +- 0.23% )
- 26,745,975 cpu_atom/branches/ # 1.145 G/sec ( +- 1.49% ) (10.81% )
- 39,592,483 cpu_core/branches/ # 1.695 G/sec ( +- 0.24% )
- 459,225 cpu_atom/branch-misses/ # 1.72 % of all branches ( +- 1.76% ) (10.81% )
- 1,062,312 cpu_core/branch-misses/ # 3.97% of all branches ( +- 0.30% )
+ 27.40 msec task-clock # 1.023 CPUs utilized ( +- 0.11 % )
+ 57 context-switches # 2.080 K/sec ( +- 0.54 % )
+ 3 cpu-migrations # 109.497 /sec ( +- 3.18 % )
+ 3,187 page-faults # 116.323 K/sec ( +- 0.01% )
+ 242,622,105 cpu_atom/instructions / # 1.95 insn per cycle ( +- 0.01% )
+ <not counted> cpu_core/instructions / ( +- 30.95% ) (0.00% )
+ 124,113,120 cpu_atom/cycles / # 4.530 GHz ( +- 0.10% )
+ <not counted> cpu_core/cycles / ( +- 30.29% ) (0.00% )
+ 42,063,607 cpu_atom/branches/ # 1.535 G/sec ( +- 0.01% )
+ <not counted> cpu_core/branches/ ( +- 31.04% ) (0.00% )
+ 1,072,114 cpu_atom/branch-misses/ # 2.55 % of all branches ( +- 0.04% )
+ <not counted> cpu_core/branch-misses/ ( +- 29.68% ) (0.00% )

- 0.0225519 +- 0.0000530 seconds time elapsed ( +- 0.23 % )
+ 0.0267775 +- 0.0000388 seconds time elapsed ( +- 0.14 % )
```

### Build GraalVM native image, print its size, and measure its start up time (tested with Oracle GraalVM 24)
@@ -110,49 +110,49 @@ Expected output:
``` text
Performance counter stats for './example01_graalvm.bin' (100 runs):

- 1.08 msec task-clock # 0.911 CPUs utilized ( +- 0.73 % )
- 1 context-switches # 927.547 /sec ( +- 4.79 % )
+ 1.01 msec task-clock # 0.893 CPUs utilized ( +- 0.46 % )
+ 1 context-switches # 988.945 /sec ( +- 4.40 % )
0 cpu-migrations # 0.000 /sec
- 477 page-faults # 442.440 K/sec ( +- 0.01% )
- 4,315,458 cpu_atom/cycles / # 4.003 GHz ( +- 1.85% ) (19.75% )
- 3,872,466 cpu_core/cycles / # 3.592 GHz ( +- 5.11% )
- 10,829,542 cpu_atom/instructions / # 2.51 insn per cycle ( +- 2.48% ) (19.75% )
- 5,351,563 cpu_core/instructions / # 1.24 insn per cycle ( +- 5.21% )
- 1,856,118 cpu_atom/branches/ # 1.722 G/sec ( +- 2.33% ) (19.75% )
- 1,000,400 cpu_core/branches/ # 927.918 M/sec ( +- 5.18% )
- 7,651 cpu_atom/branch-misses/ # 0.41 % of all branches ( +- 6.90% ) (19.75% )
- 11,792 cpu_core/branch-misses/ # 0.64% of all branches ( +- 5.28% )
+ 470 page-faults # 464.804 K/sec ( +- 0.01% )
+ 7,312,659 cpu_atom/instructions / # 1.59 insn per cycle ( +- 0.10% )
+ <not counted> cpu_core/instructions / (0.00% )
+ 4,591,123 cpu_atom/cycles / # 4.540 GHz ( +- 0.37% )
+ <not counted> cpu_core/cycles / (0.00% )
+ 1,319,275 cpu_atom/branches/ # 1.305 G/sec ( +- 0.09% )
+ <not counted> cpu_core/branches/ (0.00% )
+ 10,884 cpu_atom/branch-misses/ # 0.82 % of all branches ( +- 1.27% )
+ <not counted> cpu_core/branch-misses/ (0.00% )

- 0.00118361 +- 0.00000667 seconds time elapsed ( +- 0.56 % )
+ 0.00113238 +- 0.00000577 seconds time elapsed ( +- 0.51 % )
```

### Build Scala Native image, print its size, and measure its start up time (tested with Scala Native 0.5.7)

``` sh
sudo apt install linux-tools-common linux-tools-generic clang libstdc++-12-dev libgc-dev
sudo sysctl kernel.perf_event_paranoid=1
- scala-cli --power package --native-version 0.5.7 --native example01.sc --native-mode release-full --force -o example01_native.bin
+ scala-cli --power package --native-version 0.5.8 --native example01.sc --native-mode release-full --force -o example01_native.bin
ls -l ./example01_native.bin
perf stat -r 100 ./example01_native.bin > /dev/null
```
Expected output:
``` text
Performance counter stats for './example01_native.bin' (100 runs):

- 0.71 msec task-clock # 0.855 CPUs utilized ( +- 0.64 % )
+ 0.81 msec task-clock # 0.801 CPUs utilized ( +- 0.22 % )
0 context-switches # 0.000 /sec
0 cpu-migrations # 0.000 /sec
- 731 page-faults # 1.024 M /sec ( +- 0.01 % )
- <not counted> cpu_atom/cycles / ( +-100.00% ) (0.00% )
- 3,769,012 cpu_core/cycles / # 5.280 GHz ( +- 1.17% )
- <not counted> cpu_atom/instructions / ( +-100.00% ) (0.00% )
- 5,907,572 cpu_core/instructions / # 139.11 insn per cycle ( +- 1.02% )
- <not counted> cpu_atom/branches/ ( +-100.01% ) (0.00% )
- 1,055,882 cpu_core/branches/ # 1.479 G/sec ( +- 1.02% )
- <not counted> cpu_atom/branch-misses/ ( +-100.62% ) (0.00% )
- 9,104 cpu_core/branch-misses/ # 84.85% of all branches ( +- 1.35% )
-
- 0.00083490 +- 0.00000519 seconds time elapsed ( +- 0.62 % )
+ 732 page-faults # 900.150 K /sec ( +- 0.02 % )
+ 5,950,991 cpu_atom/instructions / # 1.60 insn per cycle ( +- 0.08% )
+ <not counted> cpu_core/instructions / (0.00% )
+ 3,728,384 cpu_atom/cycles / # 4.585 GHz ( +- 0.22% )
+ <not counted> cpu_core/cycles / (0.00% )
+ 1,062,936 cpu_atom/branches/ # 1.307 G/sec ( +- 0.08% )
+ <not counted> cpu_core/branches/ (0.00% )
+ 7,488 cpu_atom/branch-misses/ # 0.70% of all branches ( +- 1.09% )
+ <not counted> cpu_core/branch-misses/ (0.00% )
+
+ 0.00101568 +- 0.00000732 seconds time elapsed ( +- 0.72 % )
```
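
When comparing the startup measurements above across runtimes, it can help to pull out just the mean `seconds time elapsed` value from each `perf stat` report. The following is a minimal sketch, not part of the original instructions: `elapsed_seconds` is a hypothetical helper name, and it assumes the report layout shown in the expected outputs, where the mean is the first field on the `seconds time elapsed` line.

```sh
# Hypothetical helper (not from the original README): read a perf stat report
# on stdin and print the mean value from its "seconds time elapsed" line.
elapsed_seconds() {
  awk '/seconds time elapsed/ { print $1 }'
}

# perf stat writes its report to stderr, so stderr is redirected into the pipe
# first and the benchmarked program's own stdout is then discarded:
#   perf stat -r 100 ./example01_native.bin 2>&1 >/dev/null | elapsed_seconds
```

The sketch only extracts the mean; the `+-` deviation printed on the same line is ignored, so small differences between runs should still be checked against the full report.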
## RFC-8259 validation (example02)