
Commit a8055cd

gem5: explain m5ops and create minimal one liner copy pastes
1 parent b19256b commit a8055cd

File tree

2 files changed

+175
-13
lines changed

README.adoc

Lines changed: 115 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -5530,8 +5530,8 @@ A few imperfections of our benchmarking method are:
55305530

55315531
Solutions to these problems include:
55325532

5533-
* modify benchmark code with instrumentation directly, as PARSEC and ARM employees have been doing: https://github.com/arm-university/arm-gem5-rsk/blob/aa3b51b175a0f3b6e75c9c856092ae0c8f2a7cdc/parsec_patches/xcompile-patch.diff#L230
5534-
* monitor known addresses
5533+
* modify benchmark code with instrumentation directly, see <<m5ops-instructions>> for an example.
5534+
* monitor known addresses TODO possible? Create an example.
55355535

55365536
Discussion at: https://stackoverflow.com/questions/48944587/how-to-count-the-number-of-cpu-clock-cycles-between-the-start-and-end-of-a-bench/48944588#48944588
55375537

@@ -6243,31 +6243,47 @@ Cycles instead of instructions:
62436243

62446244
Otherwise the simulation runs forever by default.
62456245

6246-
=== m5
6246+
=== m5ops
62476247

6248-
`m5` is a guest command line utility that is installed and run on the guest.
6248+
m5ops are magic instructions which lead gem5 to do magic things, like quitting or dumping stats.
62496249

6250-
Its source is present under the gem5 main tree.
6250+
Documentation: http://gem5.org/M5ops
62516251

6252-
It generates magic instructions, which lead gem5 to do magic things, like `dumpstats` or `exit`.
6252+
There are two main ways to use m5ops:
62536253

6254-
It is however under-documented, so let's document some of its capabilities here.
6254+
* <<m5>>
6255+
* <<m5ops-instructions>>
62556256

6256-
Part of those explanations could be deduced from the documentation of the magic instructions themselves: http://gem5.org/M5ops
6257+
`m5` is convenient if you only want to take snapshots before or after the benchmark, without altering its source code. It uses the <<m5ops-instructions>> as its backend.
62576258

6258-
==== m5 exit
6259+
`m5` cannot or should not be used, however:
6260+
6261+
* in bare metal setups
6262+
* when you want to call the instructions from inside interest points of your benchmark. Otherwise you add the syscall overhead to the benchmark, which is more intrusive and might affect results.
6263+
+
6264+
Why not just hardcode some <<m5ops-instructions>> as in our example instead, since you are going to modify the source of the benchmark anyways?
6265+
6266+
==== m5
6267+
6268+
`m5` is a command line utility that is installed and run on the guest, and serves as a CLI front-end for the <<m5ops>>.
6269+
6270+
Its source is present in the gem5 tree: https://github.com/gem5/gem5/blob/6925bf55005c118dc2580ba83e0fa10b31839ef9/util/m5/m5.c
6271+
6272+
It is possible to guess what most tools do from the corresponding <<m5ops>>, but let's at least document the less obvious ones here.
6273+
6274+
===== m5 exit
62596275

62606276
Quit gem5 with exit status 0.
62616277

6262-
==== m5 fail
6278+
===== m5 fail
62636279

62646280
Quit gem5 with the given exit status.
62656281

62666282
....
62676283
m5 fail 1
62686284
....
62696285

6270-
==== m5 writefile
6286+
===== m5 writefile
62716287

62726288
Send a guest file to the host. <<9p>> is a more advanced alternative.
62736289

@@ -6290,7 +6306,7 @@ Does not work for subdirectories, gem5 crashes:
62906306
m5 writefile myfileguest mydirhost/myfilehost
62916307
....
62926308

6293-
==== m5 readfile
6309+
===== m5 readfile
62946310

62956311
https://stackoverflow.com/questions/49516399/how-to-use-m5-readfile-and-m5-execfile-in-gem5/49538051#49538051
62966312

@@ -6306,7 +6322,7 @@ Guest:
63066322
m5 readfile
63076323
....
63086324

6309-
==== m5 execfile
6325+
===== m5 execfile
63106326

63116327
Host:
63126328

@@ -6323,6 +6339,92 @@ chmod +x /tmp/execfile
63236339
m5 execfile
63246340
....
63256341

6342+
==== m5ops instructions
6343+
6344+
The executable `/m5ops.out` illustrates how to hard code with inline assembly the m5ops that you are most likely to hack into the benchmark you are analysing:
6345+
6346+
....
6347+
# checkpoint
6348+
/m5ops.out c
6349+
# dumpstats
6350+
/m5ops.out d
6351+
# exit
6352+
/m5ops.out e
6353+
# resetstats
6354+
/m5ops.out r
6355+
....
6356+
6357+
Source: link:kernel_module/user/m5ops.c[]
6358+
6359+
That executable is of course a subset of <<m5>> and useless by itself: its only goal is to illustrate how to hardcode some <<m5ops>> yourself as one-liners.
6360+
6361+
In theory, the cleanest way to add m5ops to your benchmarks would be to do exactly what the `m5` tool does:
6362+
6363+
* include link:https://github.com/gem5/gem5/blob/05c4c2b566ce351ab217b2bd7035562aa7a76570/include/gem5/asm/generic/m5ops.h[`include/gem5/asm/generic/m5ops.h`]
6364+
* link with the `.o` file under `util/m5` for the correct arch, e.g. `m5op_arm_A64.o` for aarch64.
6365+
6366+
However, I think it is usually not worth the trouble of hacking up the build system of the benchmark to do this, and I recommend just hardcoding in a few raw instructions here and there, and managing it with version control + `sed`.
6367+
6368+
Related: https://www.mail-archive.com/[email protected]/msg15418.html
6369+
6370+
===== m5ops instructions interface
6371+
6372+
Let's study how <<m5>> uses them:
6373+
6374+
* link:https://github.com/gem5/gem5/blob/05c4c2b566ce351ab217b2bd7035562aa7a76570/include/gem5/asm/generic/m5ops.h[`include/gem5/asm/generic/m5ops.h`]: defines the magic constants that represent the instructions
6375+
* link:https://github.com/gem5/gem5/blob/05c4c2b566ce351ab217b2bd7035562aa7a76570/util/m5/m5op_arm_A64.S[`util/m5/m5op_arm_A64.S`]: uses the magic constants to emit the instructions, with the help of some C preprocessor magic
6376+
* link:https://github.com/gem5/gem5/blob/05c4c2b566ce351ab217b2bd7035562aa7a76570/util/m5/m5.c[`util/m5/m5.c`]: the actual executable. Gets linked to `m5op_arm_A64.S` which defines a function for each m5op.
6377+
6378+
We notice that there are two different implementations for each arch:
6379+
6380+
* magic instructions, which don't exist in the corresponding arch
6381+
* magic memory addresses on a given page
6382+
6383+
TODO: what is the advantage of magic memory addresses? Because you have to do more setup work by telling the kernel never to touch the magic page. For the magic instructions, the only thing that could go wrong is if you run some crazy kind of fuzzing workload that generates random instructions.
6384+
6385+
Then, in aarch64 magic instructions for example, the lines:
6386+
6387+
....
6388+
.macro m5op_func, name, func, subfunc
6389+
.globl \name
6390+
\name:
6391+
.long 0xff000110 | (\func << 16) | (\subfunc << 12)
6392+
ret
6393+
....
6394+
6395+
define a simple function for each m5op. Here we see that:
6396+
6397+
* `0xff000110` is a base mask for the magic non-existing instruction
6398+
* `\func` and `\subfunc` are OR-applied on top of the base mask, and define which m5op this is.
6399+
+
6400+
Those values will loop over the magic constants defined in `m5ops.h` with the deferred preprocessor idiom.
6401+
+
6402+
For example, `exit` is `0x21` due to:
6403+
+
6404+
....
6405+
#define M5OP_EXIT 0x21
6406+
....
6407+
6408+
Finally, `m5.c` calls the defined functions as in:
6409+
6410+
....
6411+
m5_exit(ints[0]);
6412+
....
6413+
6414+
Therefore, the runtime "argument" that gets passed to the instruction, e.g. the delay before exiting in the case of `exit`, gets passed directly through the link:https://en.wikipedia.org/wiki/Calling_convention#ARM_(A64)[aarch64 calling convention].
6415+
6416+
That convention specifies that `x0` to `x7` contain the function arguments, so `x0` contains the first argument, and `x1` the second.
6417+
6418+
In our `m5ops` example, we just hardcode everything in the assembly one-liners we are producing.
6419+
6420+
We ignore the `\subfunc` since it is always 0 on the ops that interest us.
6421+
6422+
===== m5op annotations
6423+
6424+
`include/gem5/asm/generic/m5ops.h` also describes some annotation instructions.
6425+
6426+
What they mean: https://stackoverflow.com/questions/50583962/what-are-the-gem5-annotations-mops-magic-instructions-and-how-to-use-them
6427+
63266428
=== gem5 arm Linux kernel patches
63276429

63286430
https://gem5.googlesource.com/arm/linux/ contains an ARM Linux kernel fork with a few gem5 specific Linux kernel patches on top of mainline created by ARM Holdings.

kernel_module/user/m5ops.c

Lines changed: 60 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,60 @@
1+
#include <stdint.h>
2+
#include <stdio.h>
3+
#include <stdlib.h>
4+
5+
#define ENABLED 1
6+
#if defined(__aarch64__)
7+
static void m5_checkpoint()
8+
{
9+
__asm__ __volatile__ ("mov x0, 0; mov x1, 0; .inst 0xff000110 | (0x43 << 16);" : : : "x0", "x1");
10+
};
11+
static void m5_dump_stats()
12+
{
13+
__asm__ __volatile__ ("mov x0, 0; mov x1, 0; .inst 0xff000110 | (0x41 << 16);" : : : "x0", "x1");
14+
};
15+
static void m5_exit()
16+
{
17+
__asm__ __volatile__ ("mov x0, 0; .inst 0xff000110 | (0x21 << 16);" : : : "x0");
18+
};
19+
static void m5_reset_stats()
20+
{
21+
__asm__ __volatile__ ("mov x0, 0; mov x1, 0; .inst 0xff000110 | (0x40 << 16);" : : : "x0", "x1");
22+
};
23+
#else
24+
#undef ENABLED
25+
#define ENABLED 0
26+
#endif
27+
28+
int main(
29+
#if ENABLED
30+
int argc, char **argv
31+
#else
32+
void
33+
#endif
34+
)
35+
{
36+
#if defined(__aarch64__)
37+
char action;
38+
if (argc > 1) {
39+
action = argv[1][0];
40+
} else {
41+
action = 'e';
42+
}
43+
switch (action)
44+
{
45+
case 'c':
46+
m5_checkpoint();
47+
break;
48+
case 'd':
49+
m5_dump_stats();
50+
break;
51+
case 'e':
52+
m5_exit();
53+
break;
54+
case 'r':
55+
m5_reset_stats();
56+
break;
57+
}
58+
#endif
59+
return EXIT_SUCCESS;
60+
}
