** link:userland/arch/x86_64/div_overflow.S[]: DIV overflow
12379
+
** link:userland/arch/x86_64/div_zero.S[]: DIV zero
12380
+
** link:userland/arch/x86_64/idiv.S[]: IDIV
12381
+
* link:userland/arch/x86_64/cmp.S[]: CMP
12371
12382
12372
12383
=== x86 logical instructions
12373
12384
12374
12385
<<intel-manual-1>> 5.1.4 "Logical Instructions"
12375
12386
12376
-
* link:userland/arch/x86_64/and.S[AND]
12377
-
* link:userland/arch/x86_64/not.S[NOT]
12378
-
* link:userland/arch/x86_64/or.S[OR]
12379
-
* link:userland/arch/x86_64/xor.S[XOR]
12387
+
* link:userland/arch/x86_64/and.S[]: AND
12388
+
* link:userland/arch/x86_64/not.S[]: NOT
12389
+
* link:userland/arch/x86_64/or.S[]: OR
12390
+
* link:userland/arch/x86_64/xor.S[]: XOR
12380
12391
12381
12392
=== x86 shift and rotate instructions
12382
12393
@@ -12400,37 +12411,39 @@ Keeps the same sign on right shift.
12400
12411
Not directly exposed in C, for which right shift of a negative signed value is implementation-defined, but does exist in Java via the `>>` operator (`>>>` is the logical shift). C compilers can emit it however.
12401
12412
+
12402
12413
SHL and SAL are exactly the same and have the same encoding: https://stackoverflow.com/questions/8373415/difference-between-shl-and-sal-in-80x86/56621271#56621271
12403
-
* link:userland/arch/x86_64/rol.S[ROL and ROR]
12414
+
* link:userland/arch/x86_64/rol.S[]: ROL and ROR
12404
12415
+
12405
12416
Rotates the bit that is going out around to the other side.
12406
-
* link:userland/arch/x86_64/rol.S[RCL and RCR]
12417
+
* link:userland/arch/x86_64/rol.S[]: RCL and RCR
12407
12418
+
12408
12419
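Like ROL and ROR, but rotate through the carry flag: the bit going out moves into CF and the previous CF is inserted on the other side, which effectively generates a rotation of N + 1 bits for an N-bit operand, e.g. 8 + 1 bits for a byte. TODO application.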
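The difference between a plain rotate and a rotate through carry can be modelled in C. This is only a sketch of the semantics on 8-bit values, not the code from `rol.S`; the function names `rol8` and `rcl8` are made up for illustration, and `rcl8` only rotates by one position.

```c
#include <stdint.h>

/* Model of ROL on a byte: bits leaving on the left re-enter on the right. */
uint8_t rol8(uint8_t x, unsigned n) {
    n &= 7; /* a rotation by 8 is the identity for a byte */
    return (uint8_t)((x << n) | (x >> ((8 - n) & 7)));
}

/* Model of RCL by 1 on a byte: the carry flag acts as a ninth bit.
 * The old CF enters at bit 0 and the old bit 7 becomes the new CF. */
uint8_t rcl8(uint8_t x, int *cf) {
    int out = (x >> 7) & 1;
    uint8_t r = (uint8_t)((x << 1) | (*cf & 1));
    *cf = out;
    return r;
}
```

Applying `rcl8` nine times in a row returns both the byte and the carry to their starting values, which is the 8 + 1 bit rotation in action.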
12409
12420
12410
12421
=== x86 bit and byte instructions
12411
12422
12412
12423
<<intel-manual-1>> 5.1.6 "Bit and Byte Instructions"
12413
12424
12414
-
* link:userland/arch/x86_64/bt.S[BT]
12425
+
* link:userland/arch/x86_64/bt.S[]: BT
12415
12426
+
12416
12427
Bit test: test if the Nth bit of a register is set and store the result in the CF flag.
12417
12428
+
12418
12429
....
12419
12430
CF = reg[N]
12420
12431
....
12421
-
* link:userland/arch/x86_64/btr.S[BTR]
12432
+
* link:userland/arch/x86_64/btr.S[]: BTR
12422
12433
+
12423
12434
Do a BT and then set the bit to 0.
12424
-
* link:userland/arch/x86_64/btc.S[BTC]
12435
+
* link:userland/arch/x86_64/btc.S[]: BTC
12425
12436
+
12426
12437
Do a BT and then swap the value of the tested bit.
12427
-
* link:userland/arch/x86_64/setcc.S[SETcc]
12438
+
* link:userland/arch/x86_64/setcc.S[]: SETcc
12428
12439
+
12429
-
Set a a byte of a register to 0 or 1 depending on the cc condition.
12430
-
* link:userland/arch/x86_64/popcnt.S[POPCNT]
12440
+
Set a byte of a register to 0 or 1 depending on the cc condition.
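The effect of BT, BTR and BTC on a 64-bit register operand can be approximated in C as follows. This is a hedged sketch: the function names are hypothetical, CF is modelled as an `int`, and the bit index is reduced modulo 64 as happens for the register form of these instructions.

```c
#include <stdint.h>

/* BT: return the value CF would get, i.e. bit n of reg. */
int bt(uint64_t reg, unsigned n) {
    return (int)((reg >> (n & 63)) & 1);
}

/* BTR: store the old bit in *cf, then return reg with that bit cleared. */
uint64_t btr(uint64_t reg, unsigned n, int *cf) {
    *cf = bt(reg, n);
    return reg & ~(UINT64_C(1) << (n & 63));
}

/* BTC: store the old bit in *cf, then return reg with that bit complemented. */
uint64_t btc(uint64_t reg, unsigned n, int *cf) {
    *cf = bt(reg, n);
    return reg ^ (UINT64_C(1) << (n & 63));
}
```

For example, with `reg = 0x5`, `bt(reg, 0)` gives 1, `btr(reg, 2, &cf)` gives `0x1` with `cf == 1`, and `btc(reg, 1, &cf)` gives `0x7` with `cf == 0`.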
Jump if certain conditions of the flags register are met.
12453
12466
@@ -12472,29 +12485,61 @@ JG vs JA and JL vs JB:
12472
12485
12473
12486
==== x86 LOOP instruction
12474
12487
12475
-
link:userland/arch/x86_64/loop.S[LOOP]
12488
+
link:userland/arch/x86_64/loop.S[]
12476
12489
12477
12490
Vs <<x86-jcc-instructions,Jcc>>: https://stackoverflow.com/questions/6805692/x86-assembly-programming-loops-with-ecx-and-loop-instruction-versus-jmp-jcond Holy CISC!
These instructions do some operation on an array item, and automatically update the index to the next item:
12484
12497
12485
-
link:userland/arch/x86_64/nop.S[NOP]
12498
+
* First example explained in more detail
12499
+
** link:userland/arch/x86_64/stos.S[]: STOS: STOre String: store register to memory. STOSD is called STOSL in GNU GAS as usual: https://stackoverflow.com/questions/6211629/gcc-inline-assembly-error-no-such-instruction-stosd
12500
+
* Further examples
12501
+
** link:userland/arch/x86_64/cmps.S[]: CMPS: CoMPare Strings: compare two values in memory with addresses given by RSI and RDI. Could be used to implement `memcmp`. The result is stored in the flags (e.g. ZF) as usual.
12502
+
** link:userland/arch/x86_64/lods.S[]: LODS: LOaD String: load from memory to register.
12503
+
** link:userland/arch/x86_64/movs.S[]: MOVS: MOV String: move from one memory to another with addresses given by RSI and RDI. Could be used to implement `memmove`.
12504
+
** link:userland/arch/x86_64/scas.S[]: SCAS: SCan String: compare memory to the value in a register. Could be used to implement `strchr`.
12486
12505
12487
-
No OPeration.
12506
+
The RSI and RDI registers are actually named after these instructions! S is the source of string instructions, D is the destination of string instructions.
12488
12507
12489
-
Does nothing except take up one processor cycle and occupy some instruction memory.
12508
+
The direction of the index increment depends on the direction flag of the FLAGS register: 0 means forward and 1 means backward: https://stackoverflow.com/questions/9636691/what-are-cld-and-std-for-in-x86-assembly-language-what-does-df-do
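As a rough illustration of the automatic index update, a single MOVSB can be modelled in C like this. The `struct string_regs` type and `movsb_step` function are hypothetical names introduced only for this sketch; the fields mirror RSI, RDI and the direction flag DF.

```c
#include <stdint.h>

/* Hypothetical model of the state a byte-sized string instruction touches. */
struct string_regs {
    const uint8_t *rsi; /* source pointer */
    uint8_t *rdi;       /* destination pointer */
    int df;             /* direction flag: 0 = forward, 1 = backward */
};

/* One MOVSB: copy one byte from [RSI] to [RDI], then move both pointers
 * one byte forward if DF is 0, or one byte backward if DF is 1. */
void movsb_step(struct string_regs *r) {
    *r->rdi = *r->rsi;
    if (r->df) {
        r->rsi--;
        r->rdi--;
    } else {
        r->rsi++;
        r->rdi++;
    }
}
```

The wider variants (MOVSW, MOVSL, MOVSQ) behave the same way, except that the pointers move by the operand size instead of by one byte.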
These instructions were originally developed to speed up "string" operations such as those present in the `<string.h>` header of the C standard library.
12511
+
12512
+
However, as computer architecture evolved, those instructions might not offer considerable speedups anymore, and modern glibc such as 2.29 just uses <<x86-simd>> operations instead. See also: https://stackoverflow.com/questions/33480999/how-can-the-rep-stosb-instruction-execute-faster-than-the-equivalent-loop
12513
+
12514
+
===== x86 REP prefix
12515
+
12516
+
Example: link:userland/arch/x86_64/rep.S[]
12517
+
12518
+
Repeat a string instruction RCX times.
12519
+
12520
+
As the repetitions happen:
12521
+
12522
+
* RCX decreases, until it reaches 0
12523
+
* RDI and RSI (whichever the instruction uses) increase, or decrease if the direction flag is set
12524
+
12525
+
The variants REPZ and REPNZ (aliases REPE and REPNE) repeat a comparing string instruction (CMPS or SCAS) at most RCX times, also checking ZF after each repetition.
12526
+
12527
+
REPZ additionally stops as soon as a comparison clears ZF (the operands differ), and REPNZ as soon as a comparison sets ZF (the operands are equal).
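The repetition logic can be sketched in C as below. This is a simplified model, not the code from `rep.S`: the `struct rep_regs` type and both function names are hypothetical, the direction flag is assumed to be 0 (forward), and only byte-sized operations are shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the registers used by REP-prefixed instructions. */
struct rep_regs {
    const uint8_t *rsi; /* source pointer */
    uint8_t *rdi;       /* destination pointer */
    size_t rcx;         /* repetition counter */
    uint8_t al;         /* value stored by STOSB */
    int zf;             /* zero flag, set by comparisons */
};

/* REP STOSB: store AL at [RDI] RCX times, advancing RDI each time. */
void rep_stosb(struct rep_regs *r) {
    while (r->rcx != 0) {
        *r->rdi++ = r->al;
        r->rcx--;
    }
}

/* REPZ CMPSB: compare [RSI] with [RDI] until RCX reaches 0 or a pair
 * of bytes differs. The pointers and RCX are updated before ZF is checked,
 * so after a mismatch they already point past the differing bytes. */
void repz_cmpsb(struct rep_regs *r) {
    while (r->rcx != 0) {
        r->zf = (*r->rsi == *r->rdi);
        r->rsi++;
        r->rdi++;
        r->rcx--;
        if (!r->zf)
            break;
    }
}
```

After `repz_cmpsb` returns, `zf == 1` means every compared byte matched, which is roughly how a `memcmp`-style equality check could be built on top of it. If RCX starts at 0, no comparison happens and ZF keeps its previous value.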