@@ -3963,6 +3963,60 @@ implicitly included in later levels.
- ``-march=x86-64-v3``: (close to Haswell) AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE
- ``-march=x86-64-v4``: AVX512F, AVX512BW, AVX512CD, AVX512DQ, AVX512VL

+ `Intel AVX10 ISA <https://cdrdv2.intel.com/v1/dl/getContent/784267>`_ is
+ a major new vector ISA incorporating the modern vectorization aspects of
+ Intel AVX-512. This ISA will be supported on all future Intel processors.
+ Users should use the new options ``-mavx10.N`` and ``-mavx10.N-512``
+ on these processors and should no longer use the traditional AVX512 options.
+
+ The ``N`` in ``-mavx10.N`` is the AVX10 version number, a consecutive integer
+ starting from ``1``. ``-mavx10.N`` is an alias of ``-mavx10.N-256``, which
+ enables all instructions in AVX10 version N at a maximum vector length of
+ 256 bits. ``-mavx10.N-512`` enables all instructions at a maximum vector
+ length of 512 bits, which is a superset of the instructions enabled by
+ ``-mavx10.N``.
+
+ Existing binaries built with AVX512 features can run on Intel AVX10/512 capable
+ processors without recompilation, but cannot run on AVX10/256 capable processors.
+ To run on AVX10/256 capable processors, users need to recompile their code with
+ ``-mavx10.N`` and may need to update code that calls 512-bit X86 specific
+ intrinsics or that passes or returns 512-bit vector types in function calls.
+ Binaries built with ``-mavx10.N`` can run on both AVX10/256 and AVX10/512
+ capable processors.
+
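As a hedged illustration of the constraint above, the following C sketch restricts itself to 256-bit vectors, so it contains no ``__m512`` types and no ``_mm512_*`` intrinsics. The function name ``add_f32`` is hypothetical and not part of this patch. The sketch assumes ``__AVX__`` is defined when building with ``-mavx10.N`` (as it is for other options that imply AVX) and falls back to a scalar loop otherwise.

```c
#include <stddef.h>
#if defined(__AVX__)
#include <immintrin.h>
#endif

/* Hypothetical helper: out[i] = a[i] + b[i], using at most 256-bit vectors. */
void add_f32(const float *a, const float *b, float *out, size_t n) {
    size_t i = 0;
#if defined(__AVX__)
    /* 8 floats per 256-bit vector; no 512-bit types cross this function. */
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
#endif
    /* Scalar tail; also the whole loop when AVX is not enabled. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Because the interface takes plain pointers rather than vector types, callers are unaffected by the vector length the function uses internally.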
+ Users can add ``-mno-evex512`` to a command line that uses AVX512 options if
+ they want the binary to run on both legacy AVX512 and new AVX10/256 capable
+ processors. The option has the same constraints as ``-mavx10.N``, i.e., the
+ code cannot call 512-bit X86 specific intrinsics and cannot pass or return
+ 512-bit vector types in function calls.
+
+ Users should avoid using AVX512 features in function target attributes when
+ developing code for AVX10. If they have to do so, they need to add an explicit
+ ``evex512`` or ``no-evex512`` together with the AVX512 features, for 512-bit or
+ non-512-bit functions respectively, to avoid unexpected code generation. Both
+ the command line option and the target attribute of the EVEX512 feature can
+ only be used with AVX512. They don't affect the vector size of AVX10.
+
+ Users should not mix AVX10 and AVX512 options, because some combinations
+ conflict. For example, the combination ``-mavx512f -mavx10.1-256`` doesn't
+ express a clear intention to the compiler, since the instruction sets of
+ AVX512F and AVX10.1/256 intersect but neither contains the other. In this case,
+ the compiler emits a warning, but the behavior is well defined: it generates
+ the same code as ``-mavx10.1-512``. A similar case is
+ ``-mavx512f -mavx10.2-256``, which is equivalent to
+ ``-mavx10.1-512 -mavx10.2-256``, because ``avx10.2-256`` implies
+ ``avx10.1-256`` and ``-mavx512f -mavx10.1-256`` is equivalent to
+ ``-mavx10.1-512``.
+
+ Several new macros are introduced with AVX10 support. ``-mavx10.1-256``
+ defines ``__AVX10_1__`` and ``__EVEX256__``, while ``-mavx10.1-512`` defines
+ ``__AVX10_1__``, ``__EVEX256__``, ``__EVEX512__`` and ``__AVX10_1_512__``.
+ In addition, both ``-mavx10.1-256`` and ``-mavx10.1-512`` define all AVX512
+ feature specific macros. An AVX512 feature defines ``__EVEX256__``,
+ ``__EVEX512__`` and its own macro. So ``__EVEX512__`` can be used to guard code
+ that can run on both legacy AVX512 and AVX10/512 capable processors but cannot
+ run on AVX10/256, while an AVX512 macro like ``__AVX512F__`` cannot tell the
+ three cases (an AVX512 option, ``-mavx10.1-256`` and ``-mavx10.1-512``) apart.
+ Users need to check the additional macros ``__AVX10_1__`` and ``__EVEX512__``
+ if they want to make the distinction.
+
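The macro checks described above might be arranged as in the following C sketch. It is illustrative only and not part of this patch: the function name ``vector_isa`` and the returned strings are hypothetical, and the branch for AVX512 without ``__EVEX512__`` assumes that ``-mno-evex512`` leaves the AVX512 feature macros defined, which the text does not state explicitly. The result is only meaningful with a compiler that implements these macros.

```c
/* Hypothetical helper: report which vector ISA this file was compiled for,
 * based only on the predefined macros named in the text above. */
const char *vector_isa(void) {
#if defined(__AVX10_1__) && defined(__EVEX512__)
    return "AVX10.1/512";                 /* e.g. -mavx10.1-512 */
#elif defined(__AVX10_1__)
    return "AVX10.1/256";                 /* e.g. -mavx10.1-256 */
#elif defined(__AVX512F__) && defined(__EVEX512__)
    return "AVX512 with 512-bit vectors"; /* legacy AVX512 option */
#elif defined(__AVX512F__)
    /* Assumption: AVX512 option combined with -mno-evex512. */
    return "AVX512 without 512-bit vectors";
#else
    return "no AVX512 or AVX10";
#endif
}
```

Checking ``__AVX10_1__`` before ``__AVX512F__`` matters here, since the AVX10 options also define the AVX512 feature macros.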
ARM
^^^