Commit b0cf62d

fix acrolinx issues
1 parent 621c7ff commit b0cf62d

1 file changed

+14
-14
lines changed

docs/build/arm64-windows-abi-conventions.md

Lines changed: 14 additions & 14 deletions
@@ -5,11 +5,11 @@ ms.date: 03/25/2025
55
---
66
# Overview of ARM64 ABI conventions
77

8-
The basic application binary interface (ABI) for Windows when compiled and run on ARM processors in 64-bit mode (ARMv8 or later architectures), for the most part, follows ARM's standard AArch64 EABI. This article highlights some of the key assumptions and changes from what is documented in the EABI. For information about the 32-bit ABI, see [Overview of ARM ABI conventions](overview-of-arm-abi-conventions.md). For more information about the standard ARM EABI, see [Application Binary Interface (ABI) for the ARM Architecture](https://github.com/ARM-software/abi-aa) (external link).
8+
The basic application binary interface (ABI) for Windows when compiled and run on ARM processors in 64-bit mode (ARMv8 or later architectures), usually follows ARM's standard AArch64 EABI. This article highlights some of the key assumptions and changes from what is documented in the EABI. For information about the 32-bit ABI, see [Overview of ARM ABI conventions](overview-of-arm-abi-conventions.md). For more information about the standard ARM EABI, see [Application Binary Interface (ABI) for the ARM Architecture](https://github.com/ARM-software/abi-aa) (external link).
99

1010
## Definitions
1111

12-
With the introduction of 64-bit support, ARM has defined several terms:
12+
With the introduction of 64-bit support, ARM defined several terms:
1313

1414
- **AArch32** – the legacy 32-bit instruction set architecture (ISA) defined by ARM, including Thumb mode execution.
1515
- **AArch64** – the new 64-bit instruction set architecture (ISA) defined by ARM.
@@ -19,7 +19,7 @@ With the introduction of 64-bit support, ARM has defined several terms:
1919
Windows also uses these terms:
2020

2121
- **ARM** – refers to the 32-bit ARM architecture (AArch32), sometimes referred to as WoA (Windows on ARM).
22-
- **ARM32** – same as ARM, above; used in this document for clarity.
22+
- **ARM32** – same as **ARM**; used in this document for clarity.
2323
- **ARM64** – refers to the 64-bit ARM architecture (AArch64). There's no such thing as WoA64.
2424

2525
Finally, when referring to data types, the following definitions from ARM are referenced:
@@ -30,7 +30,7 @@ Finally, when referring to data types, the following definitions from ARM are re
3030

3131
## Base requirements
3232

33-
The ARM64 version of Windows presupposes that it's running on an ARMv8 or later architecture at all times. Both floating-point and NEON support are presumed to be present in hardware.
33+
The ARM64 version of Windows presupposes that it's running on an ARMv8 or later architecture always. Both floating-point and NEON support are presumed to be present in hardware.
3434

3535
The ARMv8 specification describes new optional crypto and CRC helper opcodes for both AArch32 and AArch64. Support for them is currently optional, but recommended. To take advantage of these opcodes, apps should first make runtime checks for their existence.
3636

@@ -74,7 +74,7 @@ The AArch64 architecture supports 32 integer registers:
7474
| x18 | N/A | Reserved platform register: in kernel mode, points to KPCR for the current processor; In user mode, points to TEB |
7575
| x19-x28 | Non-volatile | Scratch registers |
7676
| x29/fp | Non-volatile | Frame pointer |
77-
| x30/lr | Both | Link Register: Callee function must preserve it for its own return, but caller's value will be lost. |
77+
| x30/lr | Both | Link Register: Callee function must preserve it for its own return, but caller's value is lost. |
7878

7979
Each register may be accessed as a full 64-bit value (via x0-x30) or as a 32-bit value (via w0-w30). 32-bit operations zero-extend their results up to 64 bits.
8080

@@ -86,7 +86,7 @@ The frame pointer (x29) is required for compatibility with fast stack walking us
8686

8787
## Floating-point/SIMD registers
8888

89-
The AArch64 architecture also supports 32 floating-point/SIMD registers, summarized below:
89+
The AArch64 architecture also supports these 32 floating-point/SIMD registers:
9090

9191
| Register | Volatility | Role |
9292
| - | - | - |
@@ -118,7 +118,7 @@ Like AArch32, the AArch64 specification provides three system-controlled "thread
118118

119119
## Floating-point exceptions
120120

121-
Most ARM hardware doesn't support IEEE floating-point exceptions. You can determine if an ARM CPU supports them by writing a value that enables exceptions to the FPCR register and then reading it back. If the CPU supports floating-point exceptions, the bits corresponding to supported exceptions will remain set, while the bits corresponding to unsupported exceptions will be reset by the CPU.
121+
Most ARM hardware doesn't support IEEE floating-point exceptions. You can determine if an ARM CPU supports them by writing a value that enables exceptions to the FPCR register and then reading it back. If the CPU supports floating-point exceptions, the bits corresponding to supported exceptions remain set, while the bits corresponding to unsupported exceptions are reset by the CPU.
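
The write-then-read-back probe described above can be sketched in C as a pure bit-mask comparison. This is only an illustration: the function name `supported_fp_traps` and the macro names are hypothetical, the bit positions are the FPCR trap-enable bits from the ARMv8-A architecture, and the actual register write and read (which need privileged-free but ARM64-specific code) are assumed to be done by the caller.

```c
#include <stdint.h>

/* FPCR trap-enable bits (ARMv8-A). Macro names are illustrative. */
#define FPCR_IOE (1u << 8)   /* invalid operation */
#define FPCR_DZE (1u << 9)   /* divide by zero */
#define FPCR_OFE (1u << 10)  /* overflow */
#define FPCR_UFE (1u << 11)  /* underflow */
#define FPCR_IXE (1u << 12)  /* inexact */
#define FPCR_IDE (1u << 15)  /* input denormal */
#define FPCR_TRAP_MASK \
    (FPCR_IOE | FPCR_DZE | FPCR_OFE | FPCR_UFE | FPCR_IXE | FPCR_IDE)

/*
 * Hypothetical helper: given the value the caller wrote to FPCR and the
 * value it read back afterwards, return the trap-enable bits that stayed
 * set. Bits the CPU reset correspond to unsupported exceptions.
 */
uint32_t supported_fp_traps(uint64_t written, uint64_t read_back)
{
    return (uint32_t)(written & read_back & FPCR_TRAP_MASK);
}
```

A result of zero means the CPU cleared every enable bit, so no IEEE floating-point exceptions are trapped by the hardware.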
122122

123123
For ARM CPUs that do support IEEE floating-point exceptions, the behavior on Windows is as follows:
124124

@@ -166,7 +166,7 @@ For each argument in the list, the following rules are applied in turn until the
166166

167167
1. If the argument is an HFA, an HVA, a Quad-precision Floating-point or Short Vector Type, then the NSAA is rounded up to the larger of 8 or the Natural Alignment of the argument's type.
168168

169-
1. If the argument is a Half- or Single-precision Floating Point type, then the size of the argument is set to 8 bytes. The effect is as if the argument had been copied to the least significant bits of a 64-bit register, and the remaining bits filled with unspecified values.
169+
1. If the argument is a Half- or Single-precision Floating Point type, then the size of the argument is set to 8 bytes. The effect is as if the argument were copied to the least significant bits of a 64-bit register, and the remaining bits filled with unspecified values.
170170

171171
1. If the argument is an HFA, an HVA, a Half-, Single-, Double-, or Quad-precision Floating-point or Short Vector Type, then the argument is copied to memory at the adjusted NSAA. The NSAA is incremented by the size of the argument. The argument has now been allocated.
172172

@@ -207,7 +207,7 @@ Floating-point values are returned in s0, d0, or v0, as appropriate.
207207
A type is considered to be an HFA or HVA if all of the following hold:
208208

209209
- It's non-empty,
210-
- It doesn't have any non-trivial default or copy constructors, destructors, or assignment operators,
210+
- It doesn't have any nontrivial default or copy constructors, destructors, or assignment operators,
211211
- All of its members have the same HFA or HVA type, or are float, double, or neon types that match the other members' HFA or HVA types.
212212

213213
HVA values with four or fewer elements are returned in s0-s3, d0-d3, or v0-v3, as appropriate.
@@ -218,20 +218,20 @@ Types returned by value are handled differently depending on whether they have c
218218
- they have a trivial copy-assignment operator, and
219219
- they have a trivial destructor,
220220

221-
and are returned by non-member functions or static member functions, use the following return style:
221+
and are returned by nonmember functions or static member functions, use the following return style:
222222

223223
- Types that are HFAs with four or fewer elements are returned in s0-s3, d0-d3, or v0-v3, as appropriate.
224224
- Types less than or equal to 8 bytes are returned in x0.
225225
- Types less than or equal to 16 bytes are returned in x0 and x1, with x0 containing the lower-order 8 bytes.
226-
- For other aggregate types, the caller shall reserve a block of memory of sufficient size and alignment to hold the result. The address of the memory block shall be passed as an additional argument to the function in x8. The callee may modify the result memory block at any point during the execution of the subroutine. The callee isn't required to preserve the value stored in x8.
226+
- For other aggregate types, the caller shall reserve a block of memory of sufficient size and alignment to hold the result. The address of the memory block shall be passed as another argument to the function in x8. The callee may modify the result memory block at any point during the execution of the subroutine. The callee isn't required to preserve the value stored in x8.
227227

228228
All other types use this convention:
229229

230230
- The caller shall reserve a block of memory of sufficient size and alignment to hold the result. The address of the memory block shall be passed as an additional argument to the function in x0, or x1 if $this is passed in x0. The callee may modify the result memory block at any point during the execution of the subroutine. The callee returns the address of the memory block in x0.
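
As a rough illustration of the return rules above, the decision can be written as a small C classifier. Everything here is a hypothetical sketch, not compiler code: the `type_info` fields, the enum, and `classify_return` are invented names, and the `trivial` flag stands for "meets all of the triviality conditions listed above".

```c
#include <stdbool.h>
#include <stddef.h>

/* Where a by-value return ends up, following the rules in the text. */
typedef enum {
    RET_SIMD_REGS,           /* s0-s3, d0-d3, or v0-v3 */
    RET_X0,                  /* 8 bytes or fewer */
    RET_X0_X1,               /* 16 bytes or fewer, low 8 bytes in x0 */
    RET_MEMORY_VIA_X8,       /* caller-allocated block, address in x8 */
    RET_MEMORY_VIA_X0_OR_X1  /* caller-allocated block, address in x0/x1 */
} return_style;

/* Hypothetical description of a returned type. */
typedef struct {
    size_t size;                   /* size of the type in bytes */
    bool   trivial;                /* meets all listed triviality conditions */
    bool   nonmember_or_static;    /* returned by a nonmember or static member function */
    size_t hfa_elements;           /* element count if the type is an HFA, else 0 */
} type_info;

return_style classify_return(type_info t)
{
    /* "All other types" use the x0 (or x1) hidden-pointer convention. */
    if (!t.trivial || !t.nonmember_or_static)
        return RET_MEMORY_VIA_X0_OR_X1;
    if (t.hfa_elements >= 1 && t.hfa_elements <= 4)
        return RET_SIMD_REGS;
    if (t.size <= 8)
        return RET_X0;
    if (t.size <= 16)
        return RET_X0_X1;
    return RET_MEMORY_VIA_X8;
}
```

For example, a trivially copyable 12-byte struct returned from a free function classifies as `RET_X0_X1`, while the same struct with a user-provided destructor classifies as `RET_MEMORY_VIA_X0_OR_X1`.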
231231

232232
## Stack
233233

234-
Following the ABI put forth by ARM, the stack must remain 16-byte aligned at all times. AArch64 contains a hardware feature that generates stack alignment faults whenever the SP isn't 16-byte aligned and an SP-relative load or store is done. Windows runs with this feature enabled at all times.
234+
Following the ABI put forth by ARM, the stack must always remain 16-byte aligned. AArch64 contains a hardware feature that generates stack alignment faults whenever the SP isn't 16-byte aligned and an SP-relative load or store is done. Windows always runs with this feature enabled.
235235

236236
Functions that allocate 4k or more worth of stack must ensure that each page prior to the final page is touched in order. This action ensures no code can "leap over" the guard pages that Windows uses to expand the stack. Typically the touching is done by the `__chkstk` helper, which has a custom calling convention that passes the total stack allocation divided by 16 in x15.
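
The arithmetic behind these stack rules is small enough to sketch in C. The helper names below are hypothetical, and the code only computes the numbers a prologue would need (the 16-byte-aligned allocation size, whether a probe is required, and the value placed in x15); it doesn't call `__chkstk` itself.

```c
#include <stdint.h>

#define STACK_ALIGN_BYTES 16u    /* required SP alignment */
#define PROBE_THRESHOLD   4096u  /* "4k or more" triggers page probing */

/* Round a requested frame size up to the next multiple of 16 bytes. */
uint64_t align_stack_size(uint64_t bytes)
{
    return (bytes + (STACK_ALIGN_BYTES - 1)) & ~(uint64_t)(STACK_ALIGN_BYTES - 1);
}

/* Nonzero if the aligned allocation is large enough to need probing. */
int needs_stack_probe(uint64_t bytes)
{
    return align_stack_size(bytes) >= PROBE_THRESHOLD;
}

/* Value passed in x15: the total stack allocation divided by 16. */
uint64_t chkstk_x15_argument(uint64_t bytes)
{
    return align_stack_size(bytes) / STACK_ALIGN_BYTES;
}
```

For an 8192-byte frame, `chkstk_x15_argument` yields 512, which is the allocation expressed in 16-byte units.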
237237

@@ -249,7 +249,7 @@ Code within Windows is compiled with frame pointers enabled ([/Oy-](reference/oy
249249

250250
## Exception unwinding
251251

252-
Unwinding during exception handling is assisted through the use of unwind codes. The unwind codes are a sequence of bytes stored in the .xdata section of the executable. They describe the operation of the prologue and epilogue in an abstract manner, such that the effects of a function's prologue can be undone in preparation for backing up to the caller's stack frame. For more information on the unwind codes, see [ARM64 exception handling](arm64-exception-handling.md).
252+
Unwinding during exception handling is assisted by using unwind codes. The unwind codes are a sequence of bytes stored in the .xdata section of the executable. They describe the operation of the prologue and epilogue in an abstract manner, such that the effects of a function's prologue can be undone in preparation for backing up to the caller's stack frame. For more information on the unwind codes, see [ARM64 exception handling](arm64-exception-handling.md).
253253

254254
The ARM EABI also specifies an exception unwinding model that uses unwind codes. However, the specification as presented is insufficient for unwinding in Windows, which must handle cases where the PC is in the middle of a function prologue or epilogue.
255255

@@ -259,7 +259,7 @@ Code that is dynamically generated should be described with dynamic function tab
259259

260260
All ARMv8 CPUs are required to support a cycle counter register, a 64-bit register that Windows configures to be readable at any exception level, including user mode. It can be accessed via the special PMCCNTR_EL0 register, using the MRS opcode in assembly code, or the `_ReadStatusReg` intrinsic in C/C++ code.
261261

262-
The cycle counter here is a true cycle counter, not a wall clock. The counting frequency will vary with the processor frequency. If you feel you must know the frequency of the cycle counter, you shouldn't be using the cycle counter. Instead, you want to measure wall clock time, for which you should use `QueryPerformanceCounter`.
262+
The cycle counter here is a true cycle counter, not a wall clock. The counting frequency varies with the processor frequency. If you feel you must know the frequency of the cycle counter, you shouldn't be using the cycle counter. Instead, you want to measure wall clock time, for which you should use `QueryPerformanceCounter`.
263263

264264
## See also
265265
