Commit 0e93d04

[AVX10][Doc] Add documentation about AVX10 options and their attentions (llvm#77925)
1 parent 5295ca1 commit 0e93d04

1 file changed

+54
-0
lines changed

clang/docs/UsersManual.rst

Lines changed: 54 additions & 0 deletions
@@ -3963,6 +3963,60 @@ implicitly included in later levels.
 - ``-march=x86-64-v3``: (close to Haswell) AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE
 - ``-march=x86-64-v4``: AVX512F, AVX512BW, AVX512CD, AVX512DQ, AVX512VL

+`Intel AVX10 ISA <https://cdrdv2.intel.com/v1/dl/getContent/784267>`_ is
+a major new vector ISA incorporating the modern vectorization aspects of
+Intel AVX-512. This ISA will be supported on all future Intel processors.
+Users should use the new options ``-mavx10.N`` and ``-mavx10.N-512`` on these
+processors and should no longer use the traditional AVX512 options.
+
+The ``N`` in ``-mavx10.N`` is the AVX10 version number, a consecutive integer
+starting from ``1``. ``-mavx10.N`` is an alias of ``-mavx10.N-256``, which
+enables all instructions in AVX10 version N at a maximum vector length of
+256 bits. ``-mavx10.N-512`` enables all instructions at a maximum vector
+length of 512 bits, which is a superset of the instructions enabled by
+``-mavx10.N``.
+
+Existing binaries built with AVX512 features can run on Intel AVX10/512 capable
+processors without recompilation, but cannot run on AVX10/256 capable
+processors. To run on AVX10/256 capable processors, users need to recompile
+their code with ``-mavx10.N`` and may need to update code that calls 512-bit
+X86 specific intrinsics or that passes or returns 512-bit vector types in
+function calls. Binaries built with ``-mavx10.N`` can run on both AVX10/256 and
+AVX10/512 capable processors.
+
+Users can add ``-mno-evex512`` to a command line with AVX512 options if they
+want to run the binary on both legacy AVX512 and new AVX10/256 capable
+processors. The option has the same constraints as ``-mavx10.N``: the code
+cannot call 512-bit X86 specific intrinsics or pass or return 512-bit vector
+types in function calls.
+
+Users should avoid using AVX512 features in function target attributes when
+developing code for AVX10. If they have to do so, they need to add an explicit
+``evex512`` or ``no-evex512`` together with the AVX512 features, for 512-bit or
+non-512-bit functions respectively, to avoid unexpected code generation. Both
+the command line option and the target attribute for the EVEX512 feature can
+only be used with AVX512. They do not affect the vector size of AVX10.
+
+Users should not mix AVX10 and AVX512 options, because some option
+combinations conflict. For example, the combination
+``-mavx512f -mavx10.1-256`` does not express a clear intention to the
+compiler, since the instructions in AVX512F and AVX10.1/256 overlap but
+neither set contains the other. In this case, the compiler emits a warning,
+but the behavior is well defined: it generates the same code as
+``-mavx10.1-512``. A similar case is ``-mavx512f -mavx10.2-256``, which is
+equivalent to ``-mavx10.1-512 -mavx10.2-256``, because ``avx10.2-256`` implies
+``avx10.1-256`` and ``-mavx512f -mavx10.1-256`` is equivalent to
+``-mavx10.1-512``.
+
+Some new macros are introduced with AVX10 support. ``-mavx10.1-256`` defines
+``__AVX10_1__`` and ``__EVEX256__``, while ``-mavx10.1-512`` defines
+``__AVX10_1__``, ``__EVEX256__``, ``__EVEX512__`` and ``__AVX10_1_512__``.
+In addition, both ``-mavx10.1-256`` and ``-mavx10.1-512`` define all AVX512
+feature specific macros. An AVX512 feature defines ``__EVEX256__``,
+``__EVEX512__`` and its own macro. So ``__EVEX512__`` can be used to guard code
+that can run on both legacy AVX512 and AVX10/512 capable processors but cannot
+run on AVX10/256, while an AVX512 macro like ``__AVX512F__`` cannot
+distinguish among the three options. Users need to check the additional macros
+``__AVX10_1__`` and ``__EVEX512__`` if they want to make the distinction.
+
 ARM
 ^^^

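The added text says that code calling 512-bit X86 specific intrinsics may need to be updated before it can be built with ``-mavx10.N``. A minimal sketch of one such update follows. The helper name `add16_floats` is hypothetical and not part of the commit, and the sketch assumes that `__AVX__` is defined whenever 256-bit AVX intrinsics are available in the build. Where a 512-bit version would issue a single `_mm512_add_ps`, this version splits the work into two 256-bit operations and falls back to a scalar loop when AVX is not enabled.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__AVX__)
#include <immintrin.h>
#endif

/* Hypothetical helper for illustration: out[i] = a[i] + b[i] for 16 floats. */
void add16_floats(const float *a, const float *b, float *out)
{
    assert(a != NULL && b != NULL && out != NULL);
#if defined(__AVX__)
    /* A 512-bit version would use one _mm512_add_ps on __m512 values.
     * Two 256-bit operations cover the same 16 lanes without using any
     * 512-bit vector type. */
    __m256 lo = _mm256_add_ps(_mm256_loadu_ps(a), _mm256_loadu_ps(b));
    __m256 hi = _mm256_add_ps(_mm256_loadu_ps(a + 8), _mm256_loadu_ps(b + 8));
    _mm256_storeu_ps(out, lo);
    _mm256_storeu_ps(out + 8, hi);
#else
    /* Portable fallback when AVX is not enabled at compile time. */
    for (size_t i = 0; i < 16; ++i)
        out[i] = a[i] + b[i];
#endif
}
```

Because the function takes pointers and never passes or returns a 512-bit vector type, it also stays clear of the second constraint the text mentions.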

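The macro paragraph in the diff can be read as a small decision table: `__AVX10_1__` separates AVX10 builds from legacy AVX512 builds, and `__EVEX512__` separates 512-bit capable builds from 256-bit ones. The sketch below encodes that table. The enum and function names are hypothetical, and the `TARGET_AVX512_NO_EVEX512` case assumes that ``-mno-evex512`` leaves `__EVEX512__` undefined, which the added text implies but does not state outright.

```c
#include <assert.h>

/* Hypothetical classification of the vector target a file was built for. */
enum vector_target {
    TARGET_NO_AVX512,         /* neither AVX512 nor AVX10 enabled */
    TARGET_AVX512_NO_EVEX512, /* AVX512 options plus -mno-evex512 (assumed) */
    TARGET_LEGACY_AVX512,     /* AVX512 options, 512-bit vectors allowed */
    TARGET_AVX10_256,         /* e.g. -mavx10.1-256 */
    TARGET_AVX10_512          /* e.g. -mavx10.1-512 */
};

/* Pure decision logic, so it can be checked without special compiler flags. */
enum vector_target classify_target(int has_avx10_1, int has_evex512,
                                   int has_avx512f)
{
    if (has_avx10_1)
        return has_evex512 ? TARGET_AVX10_512 : TARGET_AVX10_256;
    if (has_avx512f)
        return has_evex512 ? TARGET_LEGACY_AVX512 : TARGET_AVX512_NO_EVEX512;
    return TARGET_NO_AVX512;
}

/* Feeds the predefined macros of the current build into the decision logic. */
enum vector_target current_target(void)
{
#if defined(__AVX10_1__)
    int has_avx10_1 = 1;
#else
    int has_avx10_1 = 0;
#endif
#if defined(__EVEX512__)
    int has_evex512 = 1;
#else
    int has_evex512 = 0;
#endif
#if defined(__AVX512F__)
    int has_avx512f = 1;
#else
    int has_avx512f = 0;
#endif
    return classify_target(has_avx10_1, has_evex512, has_avx512f);
}

const char *target_name(enum vector_target t)
{
    switch (t) {
    case TARGET_NO_AVX512:         return "no AVX512 or AVX10";
    case TARGET_AVX512_NO_EVEX512: return "AVX512 without 512-bit vectors";
    case TARGET_LEGACY_AVX512:     return "legacy AVX512";
    case TARGET_AVX10_256:         return "AVX10/256";
    case TARGET_AVX10_512:         return "AVX10/512";
    }
    assert(!"unknown vector_target value");
    return "unknown";
}
```

Checking `__AVX512F__` alone would map the last three cases to the same answer, which is why the sketch tests `__AVX10_1__` first and `__EVEX512__` second.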