[X86] Sync multiversion features with libgcc and refactor internal feature tables #168750
@llvm/pr-subscribers-backend-x86
Author: Mikołaj Piróg (mikolaj-pirog)
Changes
The compiler-rt internal feature table is synced with the one in libgcc (common/config/i386/i386-cpuinfo.h).
The LLVM internal feature table is refactored to include an ABI_VALUE field, so we no longer rely on ordering to keep the values correct. The table is also synced with the one in compiler-rt.
I've included MICROARCH_LEVEL under FEATURE_COMPAT to simplify things -- to my knowledge the behavior is the same, and this distinction wasn't used for anything.
Patch is 24.59 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/168750.diff
5 Files Affected:
diff --git a/clang/lib/Basic/Targets/X86.cpp b/clang/lib/Basic/Targets/X86.cpp
index 7a90c89dd7dc0..30970f5d91f6a 100644
--- a/clang/lib/Basic/Targets/X86.cpp
+++ b/clang/lib/Basic/Targets/X86.cpp
@@ -1302,15 +1302,14 @@ bool X86TargetInfo::hasFeature(StringRef Feature) const {
// X86TargetInfo::hasFeature for a somewhat comprehensive list).
bool X86TargetInfo::validateCpuSupports(StringRef FeatureStr) const {
return llvm::StringSwitch<bool>(FeatureStr)
-#define X86_FEATURE_COMPAT(ENUM, STR, PRIORITY) .Case(STR, true)
-#define X86_MICROARCH_LEVEL(ENUM, STR, PRIORITY) .Case(STR, true)
+#define X86_FEATURE_COMPAT(ENUM, STR, PRIORITY, ABI_LEVEL) .Case(STR, true)
#include "llvm/TargetParser/X86TargetParser.def"
.Default(false);
}
static llvm::X86::ProcessorFeatures getFeature(StringRef Name) {
return llvm::StringSwitch<llvm::X86::ProcessorFeatures>(Name)
-#define X86_FEATURE_COMPAT(ENUM, STR, PRIORITY) \
+#define X86_FEATURE_COMPAT(ENUM, STR, PRIORITY, ABI_VALUE) \
.Case(STR, llvm::X86::FEATURE_##ENUM)
#include "llvm/TargetParser/X86TargetParser.def"
diff --git a/compiler-rt/lib/builtins/cpu_model/x86.c b/compiler-rt/lib/builtins/cpu_model/x86.c
index b4b60986022d4..4a0deb8ea91f3 100644
--- a/compiler-rt/lib/builtins/cpu_model/x86.c
+++ b/compiler-rt/lib/builtins/cpu_model/x86.c
@@ -135,13 +135,9 @@ enum ProcessorFeatures {
FEATURE_AVX512BW,
FEATURE_AVX512DQ,
FEATURE_AVX512CD,
- FEATURE_AVX512ER,
- FEATURE_AVX512PF,
- FEATURE_AVX512VBMI,
+ FEATURE_AVX512VBMI = 26,
FEATURE_AVX512IFMA,
- FEATURE_AVX5124VNNIW,
- FEATURE_AVX5124FMAPS,
- FEATURE_AVX512VPOPCNTDQ,
+ FEATURE_AVX512VPOPCNTDQ = 30,
FEATURE_AVX512VBMI2,
FEATURE_GFNI,
FEATURE_VPCLMULQDQ,
@@ -181,8 +177,7 @@ enum ProcessorFeatures {
// FEATURE_OSXSAVE,
FEATURE_PCONFIG = 63,
FEATURE_PKU,
- FEATURE_PREFETCHWT1,
- FEATURE_PRFCHW,
+ FEATURE_PRFCHW = 66,
FEATURE_PTWRITE,
FEATURE_RDPID,
FEATURE_RDRND,
@@ -231,7 +226,11 @@ enum ProcessorFeatures {
FEATURE_USERMSR,
FEATURE_AVX10_1 = 114,
FEATURE_AVX10_2 = 116,
+ FEATURE_AMX_AVX512,
+ FEATURE_AMX_TF32,
+ FEATURE_AMX_FP8 = 120,
FEATURE_MOVRS,
+ FEATURE_AMX_MOVRS,
CPU_FEATURE_MAX
};
@@ -1088,6 +1087,16 @@ static void getAvailableFeatures(unsigned ECX, unsigned EDX, unsigned MaxLeaf,
if (HasLeafD && ((EAX >> 3) & 1) && HasAVXSave)
setFeature(FEATURE_XSAVES);
+ bool HasLeaf1E = MaxLevel >= 0x1e && !getX86CpuIDAndInfoEx(0x1e, 0x1, &EAX, &EBX, &ECX, &EDX);
+ if (HasLeaf1E && (EAX & 0x10))
+ setFeature(FEATURE_AMX_FP8);
+ if (HasLeaf1E && (EAX & 0x40))
+ setFeature(FEATURE_AMX_TF32);
+ if (HasLeaf1E && (EAX & 0x80))
+ setFeature(FEATURE_AMX_AVX512);
+ if (HasLeaf1E && (EAX & 0x100))
+ setFeature(FEATURE_AMX_MOVRS);
+
bool HasLeaf24 =
MaxLevel >= 0x24 && !getX86CpuIDAndInfo(0x24, &EAX, &EBX, &ECX, &EDX);
if (HasLeaf7Subleaf1 && ((EDX >> 19) & 1) && HasLeaf24) {
diff --git a/llvm/include/llvm/TargetParser/X86TargetParser.def b/llvm/include/llvm/TargetParser/X86TargetParser.def
index 826752b088bcd..5628edac0fd1b 100644
--- a/llvm/include/llvm/TargetParser/X86TargetParser.def
+++ b/llvm/include/llvm/TargetParser/X86TargetParser.def
@@ -121,156 +121,148 @@ X86_CPU_SUBTYPE_ALIAS(INTEL_COREI7_PANTHERLAKE, "wildcatlake")
#undef X86_CPU_SUBTYPE_ALIAS
#undef X86_CPU_SUBTYPE
-// This macro is used for cpu types present in compiler-rt/libgcc. The third
-// parameter PRIORITY is as required by the attribute 'target' checking. Note
-// that not all are supported/prioritized by GCC, so synchronization with GCC's
-// implementation may require changing some existing values.
-//
-// We cannot just re-sort the list though because its order is dictated by the
-// order of bits in CodeGenFunction::GetX86CpuSupportsMask.
-// We cannot re-adjust the position of X86_FEATURE_COMPAT at the whole list.
+// X86_FEATURE_COMPAT is used for cpu types present in compiler-rt/libgcc (i.e.
+// types we can multiversion on). The third parameter PRIORITY is required
+// by the attribute 'target' checking.
+
+// Order of bits has to match what's implemented in compiler-rt/libgcc. That's what the
+// ABI_VALUE is for - CodeGenFunction::GetX86CpuSupportsMask uses it.
#ifndef X86_FEATURE_COMPAT
-#define X86_FEATURE_COMPAT(ENUM, STR, PRIORITY) X86_FEATURE(ENUM, STR)
+#define X86_FEATURE_COMPAT(ENUM, STR, PRIORITY, ABI_VALUE) X86_FEATURE(ENUM, STR)
#endif
#ifndef X86_FEATURE
#define X86_FEATURE(ENUM, STR)
#endif
-#ifndef X86_MICROARCH_LEVEL
-#define X86_MICROARCH_LEVEL(ENUM, STR, PRIORITY)
-#endif
-
-X86_FEATURE_COMPAT(CMOV, "cmov", 0)
-X86_FEATURE_COMPAT(MMX, "mmx", 1)
-X86_FEATURE_COMPAT(POPCNT, "popcnt", 9)
-X86_FEATURE_COMPAT(SSE, "sse", 2)
-X86_FEATURE_COMPAT(SSE2, "sse2", 3)
-X86_FEATURE_COMPAT(SSE3, "sse3", 4)
-X86_FEATURE_COMPAT(SSSE3, "ssse3", 5)
-X86_FEATURE_COMPAT(SSE4_1, "sse4.1", 7)
-X86_FEATURE_COMPAT(SSE4_2, "sse4.2", 8)
-X86_FEATURE_COMPAT(AVX, "avx", 12)
-X86_FEATURE_COMPAT(AVX2, "avx2", 18)
-X86_FEATURE_COMPAT(SSE4_A, "sse4a", 6)
-X86_FEATURE_COMPAT(FMA4, "fma4", 14)
-X86_FEATURE_COMPAT(XOP, "xop", 15)
-X86_FEATURE_COMPAT(FMA, "fma", 16)
-X86_FEATURE_COMPAT(AVX512F, "avx512f", 19)
-X86_FEATURE_COMPAT(BMI, "bmi", 13)
-X86_FEATURE_COMPAT(BMI2, "bmi2", 17)
-X86_FEATURE_COMPAT(AES, "aes", 10)
-X86_FEATURE_COMPAT(PCLMUL, "pclmul", 11)
-X86_FEATURE_COMPAT(AVX512VL, "avx512vl", 20)
-X86_FEATURE_COMPAT(AVX512BW, "avx512bw", 21)
-X86_FEATURE_COMPAT(AVX512DQ, "avx512dq", 22)
-X86_FEATURE_COMPAT(AVX512CD, "avx512cd", 23)
+X86_FEATURE_COMPAT(CMOV, "cmov", 0, 0)
+X86_FEATURE_COMPAT(MMX, "mmx", 1, 1)
+X86_FEATURE_COMPAT(POPCNT, "popcnt", 9, 2)
+X86_FEATURE_COMPAT(SSE, "sse", 2, 3)
+X86_FEATURE_COMPAT(SSE2, "sse2", 3, 4)
+X86_FEATURE_COMPAT(SSE3, "sse3", 4, 5)
+X86_FEATURE_COMPAT(SSSE3, "ssse3", 5, 6)
+X86_FEATURE_COMPAT(SSE4_1, "sse4.1", 7, 7)
+X86_FEATURE_COMPAT(SSE4_2, "sse4.2", 8, 8)
+X86_FEATURE_COMPAT(AVX, "avx", 12, 9)
+X86_FEATURE_COMPAT(AVX2, "avx2", 18, 10)
+X86_FEATURE_COMPAT(SSE4_A, "sse4a", 6, 11)
+X86_FEATURE_COMPAT(FMA4, "fma4", 14, 12)
+X86_FEATURE_COMPAT(XOP, "xop", 15, 13)
+X86_FEATURE_COMPAT(FMA, "fma", 16, 14)
+X86_FEATURE_COMPAT(AVX512F, "avx512f", 19, 15)
+X86_FEATURE_COMPAT(BMI, "bmi", 13, 16)
+X86_FEATURE_COMPAT(BMI2, "bmi2", 17, 17)
+X86_FEATURE_COMPAT(AES, "aes", 10, 18)
+X86_FEATURE_COMPAT(PCLMUL, "pclmul", 11, 19)
+X86_FEATURE_COMPAT(AVX512VL, "avx512vl", 20, 20)
+X86_FEATURE_COMPAT(AVX512BW, "avx512bw", 21, 21)
+X86_FEATURE_COMPAT(AVX512DQ, "avx512dq", 22, 22)
+X86_FEATURE_COMPAT(AVX512CD, "avx512cd", 23, 23)
X86_FEATURE (NF, "nf")
X86_FEATURE (CF, "cf")
-X86_FEATURE_COMPAT(AVX512VBMI, "avx512vbmi", 24)
-X86_FEATURE_COMPAT(AVX512IFMA, "avx512ifma", 25)
-X86_FEATURE_COMPAT(AVX5124VNNIW, "avx5124vnniw", 26)
-X86_FEATURE_COMPAT(AVX5124FMAPS, "avx5124fmaps", 27)
-X86_FEATURE_COMPAT(AVX512VPOPCNTDQ, "avx512vpopcntdq", 28)
-X86_FEATURE_COMPAT(AVX512VBMI2, "avx512vbmi2", 29)
-X86_FEATURE_COMPAT(GFNI, "gfni", 30)
-X86_FEATURE_COMPAT(VPCLMULQDQ, "vpclmulqdq", 31)
-X86_FEATURE_COMPAT(AVX512VNNI, "avx512vnni", 32)
-X86_FEATURE_COMPAT(AVX512BITALG, "avx512bitalg", 33)
-X86_FEATURE_COMPAT(AVX512BF16, "avx512bf16", 34)
-X86_FEATURE_COMPAT(AVX512VP2INTERSECT, "avx512vp2intersect", 35)
+X86_FEATURE_COMPAT(AVX512VBMI, "avx512vbmi", 24, 26)
+X86_FEATURE_COMPAT(AVX512IFMA, "avx512ifma", 25, 27)
+X86_FEATURE(AVX5124VNNIW, "avx5124vnniw")
+X86_FEATURE(AVX5124FMAPS, "avx5124fmaps")
+X86_FEATURE_COMPAT(AVX512VPOPCNTDQ, "avx512vpopcntdq", 26, 30)
+X86_FEATURE_COMPAT(AVX512VBMI2, "avx512vbmi2", 27, 31)
+X86_FEATURE_COMPAT(GFNI, "gfni", 28, 32)
+X86_FEATURE_COMPAT(VPCLMULQDQ, "vpclmulqdq", 29, 33)
+X86_FEATURE_COMPAT(AVX512VNNI, "avx512vnni", 30, 34)
+X86_FEATURE_COMPAT(AVX512BITALG, "avx512bitalg", 31, 35)
+X86_FEATURE_COMPAT(AVX512BF16, "avx512bf16", 32, 36)
+X86_FEATURE_COMPAT(AVX512VP2INTERSECT, "avx512vp2intersect", 33, 37)
// Below Features has some missings comparing to gcc, it's because gcc has some
// not one-to-one mapped in llvm.
-// FIXME: dummy features were added to keep the numeric values of later features
-// stable. Since the values need to be ABI stable, they should be changed to
-// have explicitly assigned values, and then these dummy features removed.
-X86_FEATURE (DUMMYFEATURE1, "__dummyfeature1")
-X86_FEATURE (DUMMYFEATURE2, "__dummyfeature2")
-X86_FEATURE_COMPAT(ADX, "adx", 0)
+X86_FEATURE_COMPAT(ADX, "adx", 0, 40)
X86_FEATURE (64BIT, "64bit")
-X86_FEATURE_COMPAT(CLDEMOTE, "cldemote", 0)
-X86_FEATURE_COMPAT(CLFLUSHOPT, "clflushopt", 0)
-X86_FEATURE_COMPAT(CLWB, "clwb", 0)
-X86_FEATURE_COMPAT(CLZERO, "clzero", 0)
-X86_FEATURE_COMPAT(CMPXCHG16B, "cx16", 0)
+X86_FEATURE_COMPAT(CLDEMOTE, "cldemote", 0, 42)
+X86_FEATURE_COMPAT(CLFLUSHOPT, "clflushopt", 0, 43)
+X86_FEATURE_COMPAT(CLWB, "clwb", 0, 44)
+X86_FEATURE_COMPAT(CLZERO, "clzero", 0, 45)
+X86_FEATURE_COMPAT(CMPXCHG16B, "cx16", 0, 46)
X86_FEATURE (CMPXCHG8B, "cx8")
-X86_FEATURE_COMPAT(ENQCMD, "enqcmd", 0)
-X86_FEATURE_COMPAT(F16C, "f16c", 0)
-X86_FEATURE_COMPAT(FSGSBASE, "fsgsbase", 0)
+X86_FEATURE_COMPAT(ENQCMD, "enqcmd", 0, 48)
+X86_FEATURE_COMPAT(F16C, "f16c", 0, 49)
+X86_FEATURE_COMPAT(FSGSBASE, "fsgsbase", 0, 50)
X86_FEATURE (CRC32, "crc32")
X86_FEATURE (INVPCID, "invpcid")
X86_FEATURE (RDPRU, "rdpru")
-X86_FEATURE (SAHF, "sahf")
+X86_FEATURE_COMPAT(SAHF, "sahf", 0, 54)
X86_FEATURE (VZEROUPPER, "vzeroupper")
-X86_FEATURE_COMPAT(LWP, "lwp", 0)
-X86_FEATURE_COMPAT(LZCNT, "lzcnt", 0)
-X86_FEATURE_COMPAT(MOVBE, "movbe", 0)
-X86_FEATURE_COMPAT(MOVDIR64B, "movdir64b", 0)
-X86_FEATURE_COMPAT(MOVDIRI, "movdiri", 0)
-X86_FEATURE_COMPAT(MWAITX, "mwaitx", 0)
+X86_FEATURE_COMPAT(LWP, "lwp", 0, 56)
+X86_FEATURE_COMPAT(LZCNT, "lzcnt", 0, 57)
+X86_FEATURE_COMPAT(MOVBE, "movbe", 0, 58)
+X86_FEATURE_COMPAT(MOVDIR64B, "movdir64b", 0, 59)
+X86_FEATURE_COMPAT(MOVDIRI, "movdiri", 0, 60)
+X86_FEATURE_COMPAT(MWAITX, "mwaitx", 0, 61)
X86_FEATURE (X87, "x87")
-X86_FEATURE_COMPAT(PCONFIG, "pconfig", 0)
-X86_FEATURE_COMPAT(PKU, "pku", 0)
+X86_FEATURE_COMPAT(PCONFIG, "pconfig", 0, 63)
+X86_FEATURE_COMPAT(PKU, "pku", 0, 64)
X86_FEATURE (EVEX512, "evex512")
-X86_FEATURE_COMPAT(PRFCHW, "prfchw", 0)
-X86_FEATURE_COMPAT(PTWRITE, "ptwrite", 0)
-X86_FEATURE_COMPAT(RDPID, "rdpid", 0)
-X86_FEATURE_COMPAT(RDRND, "rdrnd", 0)
-X86_FEATURE_COMPAT(RDSEED, "rdseed", 0)
-X86_FEATURE_COMPAT(RTM, "rtm", 0)
-X86_FEATURE_COMPAT(SERIALIZE, "serialize", 0)
-X86_FEATURE_COMPAT(SGX, "sgx", 0)
-X86_FEATURE_COMPAT(SHA, "sha", 0)
-X86_FEATURE_COMPAT(SHSTK, "shstk", 0)
-X86_FEATURE_COMPAT(TBM, "tbm", 0)
-X86_FEATURE_COMPAT(TSXLDTRK, "tsxldtrk", 0)
-X86_FEATURE_COMPAT(VAES, "vaes", 0)
-X86_FEATURE_COMPAT(WAITPKG, "waitpkg", 0)
-X86_FEATURE_COMPAT(WBNOINVD, "wbnoinvd", 0)
-X86_FEATURE_COMPAT(XSAVE, "xsave", 0)
-X86_FEATURE_COMPAT(XSAVEC, "xsavec", 0)
-X86_FEATURE_COMPAT(XSAVEOPT, "xsaveopt", 0)
-X86_FEATURE_COMPAT(XSAVES, "xsaves", 0)
-X86_FEATURE_COMPAT(AMX_TILE, "amx-tile", 0)
-X86_FEATURE_COMPAT(AMX_INT8, "amx-int8", 0)
-X86_FEATURE_COMPAT(AMX_BF16, "amx-bf16", 0)
-X86_FEATURE_COMPAT(UINTR, "uintr", 0)
-X86_FEATURE_COMPAT(HRESET, "hreset", 0)
-X86_FEATURE_COMPAT(KL, "kl", 0)
+X86_FEATURE_COMPAT(PRFCHW, "prfchw", 0, 66)
+X86_FEATURE_COMPAT(PTWRITE, "ptwrite", 0, 67)
+X86_FEATURE_COMPAT(RDPID, "rdpid", 0, 68)
+X86_FEATURE_COMPAT(RDRND, "rdrnd", 0, 69)
+X86_FEATURE_COMPAT(RDSEED, "rdseed", 0, 70)
+X86_FEATURE_COMPAT(RTM, "rtm", 0, 71)
+X86_FEATURE_COMPAT(SERIALIZE, "serialize", 0, 72)
+X86_FEATURE_COMPAT(SGX, "sgx", 0, 73)
+X86_FEATURE_COMPAT(SHA, "sha", 0, 74)
+X86_FEATURE_COMPAT(SHSTK, "shstk", 0, 75)
+X86_FEATURE_COMPAT(TBM, "tbm", 0, 76)
+X86_FEATURE_COMPAT(TSXLDTRK, "tsxldtrk", 0, 77)
+X86_FEATURE_COMPAT(VAES, "vaes", 0, 78)
+X86_FEATURE_COMPAT(WAITPKG, "waitpkg", 0, 79)
+X86_FEATURE_COMPAT(WBNOINVD, "wbnoinvd", 0, 80)
+X86_FEATURE_COMPAT(XSAVE, "xsave", 0, 81)
+X86_FEATURE_COMPAT(XSAVEC, "xsavec", 0, 82)
+X86_FEATURE_COMPAT(XSAVEOPT, "xsaveopt", 0, 83)
+X86_FEATURE_COMPAT(XSAVES, "xsaves", 0, 84)
+X86_FEATURE_COMPAT(AMX_TILE, "amx-tile", 0, 85)
+X86_FEATURE_COMPAT(AMX_INT8, "amx-int8", 0, 86)
+X86_FEATURE_COMPAT(AMX_BF16, "amx-bf16", 0, 87)
+X86_FEATURE_COMPAT(UINTR, "uintr", 0, 88)
+X86_FEATURE_COMPAT(HRESET, "hreset", 0, 89)
+X86_FEATURE_COMPAT(KL, "kl", 0, 90)
X86_FEATURE (FXSR, "fxsr")
-X86_FEATURE_COMPAT(WIDEKL, "widekl", 0)
-X86_FEATURE_COMPAT(AVXVNNI, "avxvnni", 0)
-X86_FEATURE_COMPAT(AVX512FP16, "avx512fp16", 0)
+X86_FEATURE_COMPAT(WIDEKL, "widekl", 0, 92)
+X86_FEATURE_COMPAT(AVXVNNI, "avxvnni", 0, 93)
+X86_FEATURE_COMPAT(AVX512FP16, "avx512fp16", 0, 94)
+X86_FEATURE_COMPAT(X86_64_BASELINE,"x86-64", 0, 95)
+X86_FEATURE_COMPAT(X86_64_V2, "x86-64-v2", 0, 96)
+X86_FEATURE_COMPAT(X86_64_V3, "x86-64-v3", 0, 97)
+X86_FEATURE_COMPAT(X86_64_V4, "x86-64-v4", 0, 98)
X86_FEATURE (CCMP, "ccmp")
X86_FEATURE (Push2Pop2, "push2pop2")
X86_FEATURE (PPX, "ppx")
X86_FEATURE (NDD, "ndd")
-X86_FEATURE_COMPAT(AVXIFMA, "avxifma", 0)
-X86_FEATURE_COMPAT(AVXVNNIINT8, "avxvnniint8", 0)
-X86_FEATURE_COMPAT(AVXNECONVERT, "avxneconvert", 0)
-X86_FEATURE_COMPAT(CMPCCXADD, "cmpccxadd", 0)
-X86_FEATURE_COMPAT(AMX_FP16, "amx-fp16", 0)
-X86_FEATURE_COMPAT(PREFETCHI, "prefetchi", 0)
-X86_FEATURE_COMPAT(RAOINT, "raoint", 0)
-X86_FEATURE_COMPAT(AMX_COMPLEX, "amx-complex", 0)
-X86_FEATURE_COMPAT(AVXVNNIINT16, "avxvnniint16", 0)
-X86_FEATURE_COMPAT(SM3, "sm3", 0)
-X86_FEATURE_COMPAT(SHA512, "sha512", 0)
-X86_FEATURE_COMPAT(SM4, "sm4", 0)
+X86_FEATURE_COMPAT(AVXIFMA, "avxifma", 0, 99)
+X86_FEATURE_COMPAT(AVXVNNIINT8, "avxvnniint8", 0, 100)
+X86_FEATURE_COMPAT(AVXNECONVERT, "avxneconvert", 0, 101)
+X86_FEATURE_COMPAT(CMPCCXADD, "cmpccxadd", 0, 102)
+X86_FEATURE_COMPAT(AMX_FP16, "amx-fp16", 0, 103)
+X86_FEATURE_COMPAT(PREFETCHI, "prefetchi", 0, 104)
+X86_FEATURE_COMPAT(RAOINT, "raoint", 0, 105)
+X86_FEATURE_COMPAT(AMX_COMPLEX, "amx-complex", 0, 106)
+X86_FEATURE_COMPAT(AVXVNNIINT16, "avxvnniint16", 0, 107)
+X86_FEATURE_COMPAT(SM3, "sm3", 0, 108)
+X86_FEATURE_COMPAT(SHA512, "sha512", 0, 109)
+X86_FEATURE_COMPAT(SM4, "sm4", 0, 110)
+X86_FEATURE_COMPAT(APXF, "apxf", 0, 111)
X86_FEATURE (EGPR, "egpr")
-X86_FEATURE_COMPAT(USERMSR, "usermsr", 0)
-X86_FEATURE_COMPAT(AVX10_1, "avx10.1", 36)
-X86_FEATURE (DUMMYFEATURE3, "__dummyfeature3")
-X86_FEATURE_COMPAT(AVX10_2, "avx10.2", 37)
-X86_FEATURE (DUMMYFEATURE4, "__dummyfeature4")
-//FIXME: make MOVRS _COMPAT defined when gcc landed relate patch.
-X86_FEATURE (MOVRS, "movrs")
-X86_FEATURE (ZU, "zu")
-X86_FEATURE (AMX_FP8, "amx-fp8")
-X86_FEATURE (AMX_MOVRS, "amx-movrs")
-X86_FEATURE (AMX_AVX512, "amx-avx512")
-X86_FEATURE (AMX_TF32, "amx-tf32")
+X86_FEATURE_COMPAT(USERMSR, "usermsr", 0, 112)
+X86_FEATURE_COMPAT(AVX10_1, "avx10.1", 34, 114)
+X86_FEATURE_COMPAT(AVX10_2, "avx10.2", 35, 116)
+X86_FEATURE_COMPAT(AMX_AVX512, "amx-avx512", 0, 117)
+X86_FEATURE_COMPAT(AMX_TF32, "amx-tf32", 0, 118)
+X86_FEATURE_COMPAT(AMX_FP8, "amx-fp8", 0, 120)
+X86_FEATURE_COMPAT(MOVRS, "movrs", 0, 121)
+X86_FEATURE_COMPAT(AMX_MOVRS, "amx-movrs", 0, 122)
+X86_FEATURE(ZU, "zu")
+
// These features aren't really CPU features, but the frontend can set them.
X86_FEATURE (RETPOLINE_EXTERNAL_THUNK, "retpoline-external-thunk")
X86_FEATURE (RETPOLINE_INDIRECT_BRANCHES, "retpoline-indirect-branches")
@@ -278,11 +270,5 @@ X86_FEATURE (RETPOLINE_INDIRECT_CALLS, "...
[truncated]
✅ With the latest revision this PR passed the C/C++ code formatter.
These changes (the sync with libgcc and the refactor) could be treated as separate patches, but I opted to do them in one go. If I did the update before the refactor, I would have to maintain the ordering in a hacky way only to remove it immediately afterwards; the other way around, I would be encoding wrong bit values for the features. If need be, I can split this into a refactor part and an update part.
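To illustrate why an explicit ABI_VALUE removes the ordering hazard, here is a minimal sketch. The three-row DEMO_FEATURES table and every Demo*/DEMO_* name are hypothetical; only the macro shape (ENUM, STR, PRIORITY, ABI_VALUE) and the sample values follow the patch. The rows are deliberately out of ABI order: the positional enum changes with the row order, while the ABI lookup does not.

```cpp
#include <cassert>

// Hypothetical miniature of the feature table, listed out of ABI order on
// purpose to show that row position no longer determines the ABI value.
#define DEMO_FEATURES(X)                                                       \
  X(AVX512VBMI, "avx512vbmi", 24, 26)                                          \
  X(CMOV, "cmov", 0, 0)                                                        \
  X(POPCNT, "popcnt", 9, 2)

// Positional enum: each value depends on where the row sits in the table.
enum DemoFeature {
#define X(ENUM, STR, PRIORITY, ABI_VALUE) DEMO_##ENUM,
  DEMO_FEATURES(X)
#undef X
  DEMO_FEATURE_MAX
};

// Small constexpr string comparison so the lookup works at compile time.
constexpr bool demoStrEq(const char *A, const char *B) {
  while (*A && *A == *B) {
    ++A;
    ++B;
  }
  return *A == *B;
}

// ABI lookup: reads the explicit fourth field, independent of row order.
// Returns -1 for a name that is not in the table.
constexpr int demoAbiValue(const char *Name) {
#define X(ENUM, STR, PRIORITY, ABI_VALUE)                                      \
  if (demoStrEq(Name, STR))                                                    \
    return ABI_VALUE;
  DEMO_FEATURES(X)
#undef X
  return -1;
}

static_assert(DEMO_AVX512VBMI == 0, "positional value follows row order");
static_assert(demoAbiValue("avx512vbmi") == 26, "ABI value is explicit");
```

With this shape, removing or reordering a row only changes the positional enum; the bit a feature occupies in the runtime table stays whatever the fourth field says.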
unsigned Priorities[] = {
#include "llvm/TargetParser/X86TargetParser.def"
};
std::array<unsigned, std::size(Priorities)> HelperList;
- const size_t MaxPriority = 37;
+ const size_t MaxPriority = 35;
I wonder if we should hoist this into the def file.
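A variation on this suggestion, which the patch does not implement, would be to derive the maximum from the table itself so that no hand-maintained constant exists at all. The sketch below assumes a hypothetical four-row DEMO_FEATURES excerpt standing in for X86TargetParser.def; the names DemoPriorities and demoMaxPriority are illustrative only.

```cpp
#include <cassert>

// Hypothetical excerpt; the real rows live in X86TargetParser.def.
#define DEMO_FEATURES(X)                                                       \
  X(CMOV, "cmov", 0, 0)                                                        \
  X(AVX10_1, "avx10.1", 34, 114)                                               \
  X(AVX10_2, "avx10.2", 35, 116)                                               \
  X(ADX, "adx", 0, 40)

// Collect the PRIORITY column into an array, as the checked code does.
constexpr unsigned DemoPriorities[] = {
#define X(ENUM, STR, PRIORITY, ABI_VALUE) PRIORITY,
    DEMO_FEATURES(X)
#undef X
};

// Compute the largest priority at compile time instead of hard-coding it.
constexpr unsigned demoMaxPriority() {
  unsigned Max = 0;
  for (unsigned P : DemoPriorities)
    if (P > Max)
      Max = P;
  return Max;
}

constexpr unsigned DemoMaxPriority = demoMaxPriority();
```

The trade-off is that a derived maximum can no longer act as an independent check on the table, which may be the reason the constant is kept by hand.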
🐧 Linux x64 Test Results
|
Are the compile failures relevant?
The failures were related; they are fixed now.
constexpr FeatureBitset ImpliedFeaturesX86_64_BASELINE = {};
constexpr FeatureBitset ImpliedFeaturesX86_64_V2 = {};
constexpr FeatureBitset ImpliedFeaturesX86_64_V3 = {};
constexpr FeatureBitset ImpliedFeaturesX86_64_V4 = {};
Why do we need to define these? They are not single features. See X86_64VXFeatures in X86.td.
They are defined because of the array on line 667:
constexpr FeatureInfo FeatureInfos[CPU_FEATURE_MAX] = {
#define X86_FEATURE(ENUM, STR) {{"+" STR}, ImpliedFeatures##ENUM},
#include "llvm/TargetParser/X86TargetParser.def"
};
Because the MICROARCH_LEVEL macro was merged into X86_FEATURE_COMPAT, they are now included in this table. I believe this is a cleaner solution than having a separate macro.
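The mechanism behind this can be sketched in isolation: because the table is generated by pasting ENUM onto a fixed prefix, every row that reaches the macro needs a matching constant, even if it is an empty placeholder. In the sketch below the two-row table, the `unsigned` stand-in for FeatureBitset, and all Demo* names are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for the real FeatureInfo; `unsigned` replaces FeatureBitset here.
struct DemoFeatureInfo {
  const char *Name;
  unsigned Implied;
};

// Hypothetical two-row table: one real feature and one microarch level.
#define DEMO_FEATURES(X)                                                       \
  X(SSE2, "sse2")                                                              \
  X(X86_64_V2, "x86-64-v2")

// Every ENUM in the table must have a constant with this exact name,
// otherwise the token-pasted identifier below fails to resolve.
constexpr unsigned DemoImpliedSSE2 = 1u << 0; // illustrative value only
constexpr unsigned DemoImpliedX86_64_V2 = 0;  // empty placeholder

constexpr DemoFeatureInfo DemoFeatureInfos[] = {
#define X(ENUM, STR) {"+" STR, DemoImplied##ENUM},
    DEMO_FEATURES(X)
#undef X
};

// Linear lookup by the "+name" string; returns nullptr when absent.
inline const DemoFeatureInfo *demoFind(const char *Name) {
  for (const DemoFeatureInfo &Info : DemoFeatureInfos)
    if (std::strcmp(Info.Name, Name) == 0)
      return &Info;
  return nullptr;
}
```

Deleting the `DemoImpliedX86_64_V2` line makes the table fail to compile, which mirrors why the empty ImpliedFeaturesX86_64_* definitions appeared in this revision.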
That's the problem. They are microarchitecture levels rather than features, just as we cannot put novalake in the feature list.
Right, I think we cannot easily do without the MICROARCH macro then. I've restored it.
static llvm::X86::ProcessorFeatures getFeature(StringRef Name) {
return llvm::StringSwitch<llvm::X86::ProcessorFeatures>(Name)
-#define X86_FEATURE_COMPAT(ENUM, STR, PRIORITY) \
+#define X86_FEATURE_COMPAT(ENUM, STR, PRIORITY, ABI_VALUE) \
ABI or API?
I'm fine with either name -- I picked ABI instead of API since it is low-level information about how features are stored in the table used for multiversioned functions.
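To make the "how features are stored" point concrete, here is a small sketch of turning an ABI value into a position in a packed bit table. It assumes the features are packed 32 per word, which is my understanding of how the runtime feature arrays are laid out but is not shown in this diff; DemoBitPos and demoBitPos are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Position of one feature inside a table of 32-bit words.
struct DemoBitPos {
  unsigned Word;      // index of the 32-bit word holding the feature bit
  std::uint32_t Mask; // single-bit mask within that word
};

// Assumption: bit N of the ABI lives in word N / 32 at bit N % 32.
constexpr DemoBitPos demoBitPos(unsigned AbiValue) {
  return {AbiValue / 32, std::uint32_t{1} << (AbiValue % 32)};
}

// Test a feature bit in a caller-provided word array (no bounds checking).
constexpr bool demoHasFeature(const std::uint32_t *Words, unsigned AbiValue) {
  return (Words[demoBitPos(AbiValue).Word] & demoBitPos(AbiValue).Mask) != 0;
}
```

Under that assumption, FEATURE_PRFCHW = 66 lands in word 2 at bit 2, so a value that drifts by one between the compiler and the runtime would silently test a different feature.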
.Case(STR, llvm::X86::FEATURE_##ENUM)
#define X86_MICROARCH_LEVEL(ENUM, STR, PRIORITY) \
.Case(STR, llvm::X86::FEATURE_##ENUM)
#define X86_FEATURE_COMPAT(ENUM, STR, PRIORITY, ABI_VALUE) .Case(STR, ABI_VALUE)
This makes ENUM useless, since we can always use ABI_VALUE instead. But I still like the ENUM for readability.
- FEATURE_PREFETCHWT1,
- FEATURE_PRFCHW,
+ FEATURE_PRFCHW = 66,
What about commenting it out, as done above, instead of deleting it?
I believe we comment out features that are supported by gcc but not by llvm. This feature is not supported by gcc, so there is no commented-out entry.
X86_FEATURE_COMPAT(AVX512IFMA, "avx512ifma", 25, 27)
X86_FEATURE(AVX5124VNNIW, "avx5124vnniw")
X86_FEATURE(AVX5124FMAPS, "avx5124fmaps")
X86_FEATURE_COMPAT(AVX512VPOPCNTDQ, "avx512vpopcntdq", 26, 30)
X86_FEATURE_COMPAT(AVX512VBMI2, "avx512vbmi2", 27, 31)
X86_FEATURE_COMPAT(GFNI, "gfni", 28, 32)
X86_FEATURE_COMPAT(VPCLMULQDQ, "vpclmulqdq", 29, 33)
X86_FEATURE_COMPAT(AVX512VNNI, "avx512vnni", 30, 34)
X86_FEATURE_COMPAT(AVX512BITALG, "avx512bitalg", 31, 35)
X86_FEATURE_COMPAT(AVX512BF16, "avx512bf16", 32, 36)
X86_FEATURE_COMPAT(AVX512VP2INTERSECT, "avx512vp2intersect", 33, 37)
Just as an idea, if we are touching all these lines, maybe align the second argument column among all entries?
I agree, done
This change brings some ABI stability by explicitly assigning bits through ABI_VALUE. Still, the situation is not ideal, because every update has to be made in both compiler-rt and X86TargetParser.def. I think a natural solution is to have compiler-rt include X86TargetParser.def and use its tables -- would that be acceptable from the compiler-rt perspective? @compnerd
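If sharing the .def file turned out not to be acceptable, a compile-time cross-check would be another way to catch drift between the two copies. The sketch below shows that alternative only as an idea; it is not something this PR proposes. The DemoRuntimeFeature enum stands in for the runtime-side enum, DEMO_FEATURES stands in for the .def rows, and all names are hypothetical.

```cpp
#include <cassert>

// Hypothetical stand-in for the runtime-side enum, using the same
// "explicit value, then implicit successor" style as the patched x86.c.
enum DemoRuntimeFeature {
  RT_FEATURE_AVX512VBMI = 26,
  RT_FEATURE_AVX512IFMA,
  RT_FEATURE_AVX512VPOPCNTDQ = 30,
  RT_FEATURE_AVX512VBMI2
};

// Hypothetical stand-in for the compiler-side rows (ENUM, ABI_VALUE only).
#define DEMO_FEATURES(X)                                                       \
  X(AVX512VBMI, 26)                                                            \
  X(AVX512IFMA, 27)                                                            \
  X(AVX512VPOPCNTDQ, 30)                                                       \
  X(AVX512VBMI2, 31)

// One static_assert per row: the build breaks as soon as the copies diverge.
#define X(ENUM, ABI_VALUE)                                                     \
  static_assert(RT_FEATURE_##ENUM == ABI_VALUE,                                \
                "runtime enum and feature table disagree on " #ENUM);
DEMO_FEATURES(X)
#undef X
```

A check like this needs both definitions visible in one translation unit, so it only helps if some test or tool can include both files; it does not remove the duplication itself.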
X86_FEATURE (NF, "nf")
X86_FEATURE (CF, "cf")
Since ordering is no longer needed, why do we have to keep some X86_FEATURE entries in the middle?
Right, I've separated the two
phoebewang
left a comment
LGTM.
e-kud
left a comment
LGTM. Thanks!
I've noticed that there are some issues in the compiler-rt implementation (e.g. CLFLUSHOPT is not set). I plan to fix that (and other potential issues) in this PR and then merge.
I've pushed one last small change -- I've enabled multiversioning on x86-64, since that's what compiler-rt and libgcc expect (x86-64 being different from x86-64-v1). I will fix the bugs in later PRs; I don't want to include more changes here.