Commit ef4fa33

x86 asm: move RDTSC from x86-assembly-cheat, create RDTSCP
1 parent 658ac53 commit ef4fa33

6 files changed: +123 -30 lines changed


README.adoc

Lines changed: 69 additions & 26 deletions
@@ -12489,47 +12489,35 @@ Generated some polemic when kernel devs wanted to use it as part of `/dev/random

 RDRAND sets the carry flag when data is ready so we must loop if the carry flag isn't set.

-=== x86 SIMD
+=== x86 system instructions

-History:
+<<intel-manual-1>> 5.20 "SYSTEM INSTRUCTIONS"

-* link:https://en.wikipedia.org/wiki/MMX_(instruction_set)[MMX]: MultiMedia eXtension (unofficial name). 1997. MM0-MM7 64-bit registers.
-* link:https://en.wikipedia.org/wiki/Streaming_SIMD_Extensions[SSE]: Streaming SIMD Extensions. 1999. XMM0-XMM7 128-bit registers, XMM0-XMM15 for AMD in 64-bit mode.
-* link:https://en.wikipedia.org/wiki/SSE2[SSE2]: 2000
-* link:https://en.wikipedia.org/wiki/SSE3[SSE3]: 2004
-* link:https://en.wikipedia.org/wiki/SSE4[SSE4]: 2006
-* link:https://en.wikipedia.org/wiki/Advanced_Vector_Extensions[AVX]: Advanced Vector Extensions. 2011. YMM0–YMM15 256-bit registers in 64-bit mode. Extension of XMM.
-* AVX2: 2013
-* AVX-512: 2016. 512-bit ZMM registers. Extension of YMM.
-
-==== x86 SSE2
-
-===== x86 ADDPD instruction
-
-link:userland/arch/x86_64/addpd.S[]: ADDPS, ADDPD
+==== x86 RDTSC instruction

-Good first instruction to learn SIMD: <<simd-assembly>>
+Sources:

-===== x86 PADDQ instruction
+* link:userland/arch/x86_64/rdtsc.S[]
+* link:userland/arch/x86_64/intrinsics/rdtsc.c[]

-link:userland/arch/x86_64/paddq.S[]: PADDQ, PADDL, PADDW, PADDB
+Try running the programs multiple times, and watch the value increase, and then try to correlate it with `/proc/cpuinfo` frequency!

-Good first instruction to learn SIMD: <<simd-assembly>>
+....
+while true; do sleep 1 && ./userland/arch/x86_64/rdtsc.out; done
+....

-=== x86 RDTSC instruction
+RDTSC stores its 64-bit output in EDX:EAX. Even in 64-bit mode it uses only the lower halves of RDX and RAX, and the top 32 bits of both registers are zeroed out.

 TODO: review this section, make a more controlled userland experiment with <<m5ops>> instrumentation.

 Let's have some fun and try to correlate the gem5 <<stats-txt>> `system.cpu.numCycles` cycle count with the link:https://en.wikipedia.org/wiki/Time_Stamp_Counter[x86 RDTSC instruction] that is supposed to do the same thing:

 ....
-./build-userland --static userland/arch/x86_64/inline_asm/rdtsc.c
-./run --eval './arch/x86_64/c/rdtsc.out;m5 exit;' --emulator gem5
+./build-userland --static userland/arch/x86_64/rdtsc.S
+./run --eval './arch/x86_64/rdtsc.out;m5 exit;' --emulator gem5
 ./gem5-stat
 ....

-Source: link:userland/arch/x86_64/rdtsc.c[]
-
 RDTSC outputs a cycle count which we compare with gem5's `gem5-stat`:

 * `3828578153`: RDTSC
@@ -12544,14 +12532,69 @@ Bibliography:
 * https://en.wikipedia.org/wiki/Time_Stamp_Counter
 * https://stackoverflow.com/questions/9887839/clock-cycle-count-wth-gcc/9887979

-==== ARM PMCCNTR register
+===== x86 RDTSCP instruction
+
+RDTSCP is like RDTSC, but it also stores the CPU ID into ECX: this is convenient because the value of RDTSC depends on which core we are currently on, so you often also want the core ID when you read the TSC.
+
+Sources:
+
+* link:userland/arch/x86_64/rdtscp.S[]
+* link:userland/arch/x86_64/intrinsics/rdtscp.c[]
+
+We can observe its operation with good old `taskset`, for example:
+
+....
+taskset -c 0 ./userland/arch/x86_64/rdtscp.out | tail -n 1
+taskset -c 1 ./userland/arch/x86_64/rdtscp.out | tail -n 1
+....
+
+produces:
+
+....
+0x00000000
+0x00000001
+....
+
+
+There is also the RDPID instruction, which reads just the processor ID, but it appears to be too new for both QEMU 4.0.0 and <<p51>>, as it fails with SIGILL on both.
+
+Bibliography: https://stackoverflow.com/questions/22310028/is-there-an-x86-instruction-to-tell-which-core-the-instruction-is-being-run-on/56622112#56622112
+
+===== ARM PMCCNTR register

 TODO We didn't manage to find a working ARM analogue to <<x86-rdtsc-instruction>>: link:kernel_modules/pmccntr.c[] is oopsing, and even if it weren't, it likely wouldn't give the cycle count since boot, since it needs to be activated before it starts counting anything:

 * https://stackoverflow.com/questions/40454157/is-there-an-equivalent-instruction-to-rdtsc-in-arm
 * https://stackoverflow.com/questions/31620375/arm-cortex-a7-returning-pmccntr-0-in-kernel-mode-and-illegal-instruction-in-u/31649809#31649809
 * https://blog.regehr.org/archives/794

+=== x86 SIMD
+
+History:
+
+* link:https://en.wikipedia.org/wiki/MMX_(instruction_set)[MMX]: MultiMedia eXtension (unofficial name). 1997. MM0-MM7 64-bit registers.
+* link:https://en.wikipedia.org/wiki/Streaming_SIMD_Extensions[SSE]: Streaming SIMD Extensions. 1999. XMM0-XMM7 128-bit registers, XMM0-XMM15 for AMD in 64-bit mode.
+* link:https://en.wikipedia.org/wiki/SSE2[SSE2]: 2000
+* link:https://en.wikipedia.org/wiki/SSE3[SSE3]: 2004
+* link:https://en.wikipedia.org/wiki/SSE4[SSE4]: 2006
+* link:https://en.wikipedia.org/wiki/Advanced_Vector_Extensions[AVX]: Advanced Vector Extensions. 2011. YMM0–YMM15 256-bit registers in 64-bit mode. Extension of XMM.
+* AVX2: 2013
+* AVX-512: 2016. 512-bit ZMM registers. Extension of YMM.
+
+==== x86 SSE2
+
+===== x86 ADDPD instruction
+
+link:userland/arch/x86_64/addpd.S[]: ADDPS, ADDPD
+
+Good first instruction to learn SIMD: <<simd-assembly>>
+
+===== x86 PADDQ instruction
+
+link:userland/arch/x86_64/paddq.S[]: PADDQ, PADDL, PADDW, PADDB
+
+Good first instruction to learn SIMD: <<simd-assembly>>
+
 === x86 assembly bibliography

 ==== x86 official bibliography

lkmc.c

Lines changed: 5 additions & 1 deletion
@@ -56,8 +56,12 @@ void lkmc_assert_memcmp(
     }
 }

+void lkmc_print_hex_32(uint32_t x) {
+    printf("0x%08" PRIX32, x);
+}
+
 void lkmc_print_hex_64(uint64_t x) {
-    printf("0x%016" PRIx64, x);
+    printf("0x%016" PRIX64, x);
 }

 void lkmc_print_newline() {
Lines changed: 2 additions & 3 deletions
@@ -1,14 +1,13 @@
 /* https://github.com/cirosantilli/linux-kernel-module-cheat#x86-rdtsc-instruction */

+#include <inttypes.h>
 #include <stdint.h>
 #include <stdio.h>
 #include <stdlib.h>

 #include <x86intrin.h>

 int main(void) {
-    uintmax_t val;
-    val = __rdtsc();
-    printf("%ju\n", val);
+    printf("0x%016" PRIX64 "\n", (uint64_t)__rdtsc());
     return EXIT_SUCCESS;
 }
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+/* https://github.com/cirosantilli/linux-kernel-module-cheat#x86-rdtscp-instruction */
+
+#include <inttypes.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+#include <x86intrin.h>
+
+int main(void) {
+    uint32_t pid;
+    printf("0x%016" PRIX64 "\n", (uint64_t)__rdtscp(&pid));
+    printf("0x%08" PRIX32 "\n", pid);
+    return EXIT_SUCCESS;
+}

userland/arch/x86_64/rdtsc.S

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+/* https://github.com/cirosantilli/linux-kernel-module-cheat#x86-rdtsc-instruction */
+
+#include <lkmc.h>
+
+LKMC_PROLOGUE
+    rdtsc
+    mov %edx, %edi
+    shl $32, %rdi
+    add %rax, %rdi
+    call lkmc_print_hex_64
+    call lkmc_print_newline
+LKMC_EPILOGUE

userland/arch/x86_64/rdtscp.S

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+/* https://github.com/cirosantilli/linux-kernel-module-cheat#x86-rdtscp-instruction */
+
+#include <lkmc.h>
+
+LKMC_PROLOGUE
+    rdtscp
+    mov %edx, %edi
+    shl $32, %rdi
+    add %rax, %rdi
+    mov %ecx, %r12d
+
+    /* Print RDTSC. */
+    call lkmc_print_hex_64
+    call lkmc_print_newline
+
+    /* Print PID. */
+    mov %r12d, %edi
+    call lkmc_print_hex_32
+    call lkmc_print_newline
+LKMC_EPILOGUE
