Commit eb4f4ba

gem5.sh: simplify, and do m5 resetstats and m5 exit
This covers the most common use case of running a benchmark after restore.
1 parent 865d065 commit eb4f4ba

3 files changed, 41 additions and 41 deletions

README.adoc

Lines changed: 36 additions & 30 deletions
@@ -6876,7 +6876,7 @@ gem5 full system:
 ....
 printf 'm5 exit' > data/readfile
 ./run -a a -g -F '/gem5.sh'
-printf 'm5 resetstats;dhrystone 100000;m5 exit' > data/readfile
+printf 'dhrystone 100000' > data/readfile
 time ./run -a a -l 1 -g
 ....
 
@@ -7410,33 +7410,37 @@ OK, this is why we used gem5 in the first place, performance measurements!
 
 Let's benchmark https://en.wikipedia.org/wiki/Dhrystone[Dhrystone] which Buildroot provides.
 
-The most flexible way is to do:
+A flexible setup is:
 
 ....
 arch=aarch64
+cmd="./run -a '$arch' -g -F '/gem5.sh'"
+restore='-l 1 -- --cpu-type=HPI --restore-with-cpu=HPI --caches --l2cache --l1d_size=1024kB --l1i_size=1024kB --l2_size=1024kB --l3_size=1024kB'
 
-# Generate a checkpoint after Linux boots.
+# Generate a checkpoint after Linux boots, using the faster and less detailed CPU.
 # The boot takes a while, be patient young Padawan.
-printf 'm5 exit' > data/readfile
-./run -a "$arch" -g -F '/gem5.sh'
+printf '' > data/readfile
+eval "$cmd"
 
-# Restore the most recent checkpoint taken, and run the benchmark
-# with parameter 1.000. We skip the boot completely, saving time!
-printf 'm5 resetstats;dhrystone 1000;m5 exit' > data/readfile
-./run -a "$arch" -g -l 1
+# Restore the most recent checkpoint taken with the more detailed and slower HPI CPU,
+# and run the benchmark with parameter 1.000. We skip the boot completely, saving time!
+printf 'dhrystone 1000' > data/readfile
+eval "${cmd} ${restore}"
 ./gem5-stat -a "$arch"
 
-# Now with another parameter 10.000.
-printf 'm5 resetstats;dhrystone 10000;m5 exit' > data/readfile
-./run -a "$arch" -g -l 1
+# Now run again with another parameter 10.000.
+# This one should take more cycles!
+printf 'dhrystone 10000' > data/readfile
+eval "${cmd} ${restore}"
 ./gem5-stat -a "$arch"
 
-# Get an interactive shell at the end of the restore.
-printf '' > data/readfile
-./run -a "$arch" -g -l 1
+# Get an interactive shell at the end of the restore
+# if you need to debug something more interactively.
+printf 'sh' > data/readfile
+eval "${cmd} ${restore}"
 ....
 
-The commands output the approximate number of CPU cycles it took Dhrystone to run.
+The `gem5-stats` commands output the approximate number of CPU cycles it took Dhrystone to run.
 
 For more serious tests, you will likely want to automate logging the commands ran and results to files, a good example is: link:gem5-bench-cache[].
 
@@ -7448,20 +7452,6 @@ A more naive and simpler to understand approach would be a direct:
 
 but the problem is that this method does not allow to easily run a different script without running the boot again, see: <<gem5-restore-new-scrip>>
 
-A few imperfections of our benchmarking method are:
-
-* when we do `m5 resetstats` and `m5 exit`, there is some time passed before the `exec` system call returns and the actual benchmark starts and ends
-* the benchmark outputs to stdout, which means so extra cycles in addition to the actual computation. But TODO: how to get the output to check that it is correct without such IO cycles?
-
-Solutions to these problems include:
-
-* modify benchmark code with instrumentation directly, see <<m5ops-instructions>> for an example.
-* monitor known addresses TODO possible? Create an example.
-
-Discussion at: https://stackoverflow.com/questions/48944587/how-to-count-the-number-of-cpu-clock-cycles-between-the-start-and-end-of-a-bench/48944588#48944588
-
-Those problems should be insignificant if the benchmark runs for long enough however.
-
 Now you can play a fun little game with your friends:
 
 * pick a computational problem
@@ -7482,6 +7472,22 @@ Whenever we run `m5 dumpstats` or `m5 exit`, a section with the following format
 ---------- End Simulation Statistics ----------
 ....
 
+==== Skip extra benchmark instructions
+
+A few imperfections of our <<gem5-run-benchmark,benchmarking method>> are:
+
+* when we do `m5 resetstats` and `m5 exit`, there is some time passed before the `exec` system call returns and the actual benchmark starts and ends
+* the benchmark outputs to stdout, which means so extra cycles in addition to the actual computation. But TODO: how to get the output to check that it is correct without such IO cycles?
+
+Solutions to these problems include:
+
+* modify benchmark code with instrumentation directly, see <<m5ops-instructions>> for an example.
+* monitor known addresses TODO possible? Create an example.
+
+Discussion at: https://stackoverflow.com/questions/48944587/how-to-count-the-number-of-cpu-clock-cycles-between-the-start-and-end-of-a-bench/48944588#48944588
+
+Those problems should be insignificant if the benchmark runs for long enough however.
+
 ==== gem5 system parameters
 
 Besides optimizing a program for a given CPU setup, chip developers can also do the inverse, and optimize the chip for a given benchmark!
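
Editor's note on the README change: the new `cmd` string stores single quotes as literal characters, which is presumably why the commands are now launched through `eval` rather than by plain expansion. The sketch below illustrates that difference on a host shell. `show_args` is a hypothetical stand-in for `./run` (it only prints the arguments it receives), and the `restore` value is shortened for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for ./run: prints each received argument in brackets.
show_args() {
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

arch=aarch64
# The single quotes are stored as literal characters inside the string.
cmd="show_args -a '$arch' -F '/gem5.sh'"
# Shortened, illustrative version of the restore options from the README.
restore='-l 1 -- --cpu-type=HPI --caches'

# Plain expansion only word-splits: the literal quotes leak into the arguments.
$cmd

# eval re-parses the string as shell source, so the quotes group and vanish.
eval "${cmd} ${restore}"
```

With plain expansion the second argument arrives as `'aarch64'`, quotes included; with `eval` it arrives as `aarch64`, followed by the restore options as separate arguments.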

gem5-bench-cache

Lines changed: 1 addition & 6 deletions
@@ -83,14 +83,9 @@ fi
 
 # Restore and run benchmarks.
 rm -f "$results_file"
-printf '#!/bin/sh
-m5 resetstats
-dhrystone XXX
-m5 exit
-' >"${common_gem5_readfile_file}"
 for n in 1000 10000 100000; do
 printf "n ${n}\n" >> "$results_file"
-sed -Ei "s/^dhrystone .*/dhrystone ${n}/" "${common_gem5_readfile_file}"
+printf "dhrystone ${n}" > "${common_gem5_readfile_file}"
 bench-all
 printf "\n" >> "$results_file"
 done
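
Editor's note on the `gem5-bench-cache` change: since `gem5.sh` now supplies `m5 resetstats` and `m5 exit` itself, the payload shrinks to a single command that can be overwritten on every iteration, with no template file and no `sed` substitution. A minimal host-side sketch of that loop follows. The temporary files and the `bench_all` function are hypothetical stand-ins (the real script calls `bench-all` and writes to `${common_gem5_readfile_file}`); the stand-in only records the payload it would have run.

```shell
#!/bin/sh
# Temporary stand-ins for the script's readfile and results file.
readfile="$(mktemp)"
results_file="$(mktemp)"

# Hypothetical stand-in for bench-all: logs the payload instead of running gem5.
bench_all() {
  printf 'payload: %s\n' "$(cat "$readfile")" >> "$results_file"
}

for n in 1000 10000 100000; do
  printf 'n %s\n' "$n" >> "$results_file"
  # Overwrite the whole payload on each iteration.
  printf 'dhrystone %s' "$n" > "$readfile"
  bench_all
  printf '\n' >> "$results_file"
done

results="$(cat "$results_file")"
printf '%s\n' "$results"
rm -f "$readfile" "$results_file"
```

Passing `$n` as a `printf` argument rather than embedding it in the format string is a small hardening choice of this sketch, not something the commit does.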

rootfs_overlay/gem5.sh

Lines changed: 4 additions & 5 deletions
@@ -1,7 +1,6 @@
 #!/bin/sh
+# This covers the most common setup to run a benchmark in gem5 and exit.
 m5 checkpoint
-script=/tmp/readfile
-m5 readfile > "$script"
-if [ -s "$script" ]; then
-sh "$script"
-fi
+m5 resetstats
+m5 readfile | sh
+m5 exit
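
Editor's note on the `gem5.sh` change: the new script always resets stats, pipes the readfile payload into `sh`, and then exits, instead of testing whether the payload is non-empty. The control flow can be traced on a host machine with a stub. In the sketch below, the `m5` function is a hypothetical replacement for the real binary: `readfile` prints a local temporary file, and every other subcommand just echoes its name (the real `m5 exit` would end the simulation).

```shell
#!/bin/sh
# Hypothetical stub for the m5 binary, so the flow can run without gem5.
payload_file="$(mktemp)"
m5() {
  case "$1" in
    readfile) cat "$payload_file" ;;
    *) printf 'm5 %s\n' "$1" ;;
  esac
}

# Same command sequence as the new gem5.sh.
run_gem5_sh() {
  m5 checkpoint
  m5 resetstats
  m5 readfile | sh
  m5 exit
}

# A benchmark payload runs between resetstats and exit.
printf 'echo benchmark' > "$payload_file"
with_payload="$(run_gem5_sh)"
printf '%s\n' "$with_payload"

# An empty payload: sh reads end-of-file immediately and runs nothing.
printf '' > "$payload_file"
empty_payload="$(run_gem5_sh)"
printf '%s\n' "$empty_payload"

rm -f "$payload_file"
```

This shows why the old `[ -s "$script" ]` guard is no longer needed: piping an empty payload into `sh` is already a no-op, so the checkpoint-generating boot with `printf '' > data/readfile` simply falls through to `m5 exit`.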
