[IR] LangRef: state explicitly that floats generally behave according to IEEE-754 #102140
Changes from 4 commits
d643f2c
648d3ce
d107aa0
cd80e84
7d4deb8
29f5c14
f909f35
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -3572,6 +3572,39 @@ or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the | |
| seq\_cst total orderings of other operations that are not marked | ||
| ``syncscope("singlethread")`` or ``syncscope("<target-scope>")``. | ||
|
|
||
| .. _floatsem: | ||
|
|
||
| Floating-Point Semantics | ||
| ------------------------ | ||
|
|
||
| LLVM floating-point types fall into two categories: | ||
|
|
||
| - half, float, double, and fp128, which correspond to the binary16, binary32, | ||
| binary64, and binary128 formats described in the IEEE-754 specification. | ||
| - The remaining types, which do not directly correspond to a standard IEEE | ||
|
||
| format. | ||
|
|
||
| For floating-point operations acting on types with a corresponding IEEE format, | ||
| unless otherwise specified the value returned by that operation matches that of | ||
| the corresponding IEEE-754 operation executed in the :ref:`default | ||
| floating-point environment <floatenv>`, except that the behavior of NaN results | ||
| is instead :ref:`as specified here <floatnan>`. (This statement concerns only | ||
| the returned *value*; we make no statement about status flags or | ||
| traps/exceptions.) In particular, a floating-point instruction returning a | ||
| non-NaN value is guaranteed to always return the same bit-identical result on | ||
|
||
| all machines and optimization levels. | ||
|
||
|
|
||
|
||
| This means that optimizations and backends may not change the observed bitwise | ||
| result of these operations in any way (unless NaNs are returned), and frontends | ||
| can rely on these operations providing correctly rounded results as described in | ||
| the standard. | ||
|
|
||
| Various flags and attributes can alter the behavior of these operations and thus | ||
| make them not bit-identical across machines and optimization levels any more: | ||
| most notably, the :ref:`fast-math flags <fastmath>` as well as the ``strictfp`` | ||
| and ``denormal-fp-math`` attributes. See their corresponding documentation for | ||
| details. | ||
|
|
||
| .. _floatenv: | ||
|
|
||
| Floating-Point Environment | ||
|
|
@@ -3582,11 +3615,12 @@ status flags are not observable. Therefore, floating-point math operations do | |
| not have side effects and may be speculated freely. Results assume the | ||
| round-to-nearest rounding mode, and subnormals are assumed to be preserved. | ||
|
|
||
| Running LLVM code in an environment where these assumptions are not met can lead | ||
| to undefined behavior. The ``strictfp`` and ``denormal-fp-math`` attributes as | ||
| well as :ref:`Constrained Floating-Point Intrinsics <constrainedfp>` can be used | ||
| to weaken LLVM's assumptions and ensure defined behavior in non-default | ||
| floating-point environments; see their respective documentation for details. | ||
| Running LLVM code in an environment where these assumptions are not met | ||
| typically leads to undefined behavior. The ``strictfp`` and ``denormal-fp-math`` | ||
| attributes as well as :ref:`Constrained Floating-Point Intrinsics | ||
| <constrainedfp>` can be used to weaken LLVM's assumptions and ensure defined | ||
| behavior in non-default floating-point environments; see their respective | ||
| documentation for details. | ||
|
|
||
| .. _floatnan: | ||
|
|
||
|
|
@@ -3608,10 +3642,11 @@ are not "floating-point math operations": ``fneg``, ``llvm.fabs``, and | |
| ``llvm.copysign``. These operations act directly on the underlying bit | ||
| representation and never change anything except possibly for the sign bit. | ||
|
|
||
| For floating-point math operations, unless specified otherwise, the following | ||
| rules apply when a NaN value is returned: the result has a non-deterministic | ||
| sign; the quiet bit and payload are non-deterministically chosen from the | ||
| following set of options: | ||
| Floating-point math operations that return a NaN are an exception from the | ||
| general principle that LLVM implements IEEE-754 semantics. Unless specified | ||
| otherwise, the following rules apply whenever the IEEE-754 semantics say that a | ||
| NaN value is returned: the result has a non-deterministic sign; the quiet bit | ||
| and payload are non-deterministically chosen from the following set of options: | ||
|
|
||
| - The quiet bit is set and the payload is all-zero. ("Preferred NaN" case) | ||
| - The quiet bit is set and the payload is copied from any input operand that is | ||
|
|
@@ -3943,7 +3978,7 @@ Floating-Point Types | |
| - Description | ||
|
|
||
| * - ``half`` | ||
| - 16-bit floating-point value | ||
| - 16-bit floating-point value (IEEE-754 binary16) | ||
|
|
||
| * - ``bfloat`` | ||
| - 16-bit "brain" floating-point value (7-bit significand). Provides the | ||
|
|
@@ -3952,24 +3987,20 @@ Floating-Point Types | |
| extensions and Arm's ARMv8.6-A extensions, among others. | ||
|
|
||
| * - ``float`` | ||
| - 32-bit floating-point value | ||
| - 32-bit floating-point value (IEEE-754 binary32) | ||
|
|
||
| * - ``double`` | ||
| - 64-bit floating-point value | ||
| - 64-bit floating-point value (IEEE-754 binary64) | ||
|
|
||
| * - ``fp128`` | ||
| - 128-bit floating-point value (113-bit significand) | ||
| - 128-bit floating-point value (IEEE-754 binary128) | ||
|
|
||
| * - ``x86_fp80`` | ||
| - 80-bit floating-point value (X87) | ||
|
|
||
| * - ``ppc_fp128`` | ||
| - 128-bit floating-point value (two 64-bits) | ||
|
|
||
| The binary format of half, float, double, and fp128 correspond to the | ||
| IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128 | ||
| respectively. | ||
|
|
||
| X86_amx Type | ||
| """""""""""" | ||
|
|
||
|
|
||
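The following is a minimal sketch, not part of the patch, of two points the new text makes: non-NaN results of operations on IEEE types are bit-identical everywhere, and a NaN result is described by its sign, quiet bit, and payload. It models ``fadd float`` in Python by rounding a binary64 sum to binary32, which gives the correctly rounded binary32 sum because binary64 carries more than twice the precision. The helper names (``f32_bits``, ``fadd_f32``, ``nan_fields``, ``is_preferred_nan``) are hypothetical, and the field layout assumes the usual binary32 encoding, where the quiet bit is the most significant mantissa bit.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def f32_bits(x: float) -> int:
    """Return the 32-bit pattern of x after rounding it to binary32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def fadd_f32(a: float, b: float) -> float:
    """Model of `fadd float` in the default floating-point environment.

    The binary64 sum of two binary32 values, rounded once more to
    binary32, equals the correctly rounded binary32 sum.
    """
    return to_f32(to_f32(a) + to_f32(b))


def nan_fields(bits: int):
    """Split a binary32 NaN bit pattern into (sign, quiet bit, payload)."""
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent != 0xFF or mantissa == 0:
        raise ValueError("not a binary32 NaN")
    sign = (bits >> 31) & 1
    quiet = (mantissa >> 22) & 1      # most significant mantissa bit
    payload = mantissa & 0x3FFFFF     # remaining 22 mantissa bits
    return sign, quiet, payload


def is_preferred_nan(bits: int) -> bool:
    """True for the "preferred NaN" case: quiet bit set, all-zero payload."""
    _sign, quiet, payload = nan_fields(bits)
    return quiet == 1 and payload == 0


if __name__ == "__main__":
    # A non-NaN result has exactly one permitted bit pattern.
    print(hex(f32_bits(fadd_f32(0.1, 0.2))))
    # The sign of a NaN result is non-deterministic, so both patterns
    # below fall under the "preferred NaN" case.
    print(is_preferred_nan(0x7FC00000), is_preferred_nan(0xFFC00000))
```

Under this model, a frontend comparing a non-NaN result across machines may compare bit patterns directly, while a comparison involving NaN results should go through something like ``nan_fields`` and ignore the sign.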