[RISCV] Support __builtin_cpu_init and __builtin_cpu_supports #99700
Conversation
This implements the __builtin_cpu_init and __builtin_cpu_supports builtin routines based on the compiler runtime changes in llvm#85790. This is inspired by llvm#85786. Major changes are a) a restriction in scope to only the builtins (which have a much narrower user interface), and b) the avoidance of false generality. This change deliberately only handles group 0 extensions (which happen to be all defined ones today), and avoids the tblgen changes from that review.

This is still a WIP. It is posted for initial feedback on whether it makes sense to try to get this into the 19.x release. Major items left undone:

* Updating clang tests to exercise this logic.
* Actually running it at all. I did not build compiler-rt, and thus all my checking was of generated asm/IR.
* Investigating the claim from the gcc docs that __builtin_cpu_init is called early in process lifetime by a high priority constructor. I did not find this with some quick searching.
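As a usage sketch (not part of this patch): the intended call pattern from user code looks roughly like the following. The extension name "zba" is just an illustrative pick from the group 0 table in the diff, and this only builds when targeting a RISC-V Linux triple with a compiler that includes this change.

#include <cstdio>

int main() {
  // Populates __riscv_feature_bits via the compiler-rt routine; per the gcc
  // docs this may eventually be done by an early constructor instead (see the
  // discussion further down).
  __builtin_cpu_init();

  if (__builtin_cpu_supports("zba"))
    std::puts("Zba is available");
  else
    std::puts("Zba is not available");
  return 0;
}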
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-codegen Author: Philip Reames (preames)
Full diff: https://github.com/llvm/llvm-project/pull/99700.diff 6 Files Affected:
diff --git a/clang/lib/Basic/Targets/RISCV.cpp b/clang/lib/Basic/Targets/RISCV.cpp
index 9159162f01d1b..41d836330b38c 100644
--- a/clang/lib/Basic/Targets/RISCV.cpp
+++ b/clang/lib/Basic/Targets/RISCV.cpp
@@ -479,3 +479,9 @@ RISCVTargetInfo::checkCallingConvention(CallingConv CC) const {
return CCCR_OK;
}
}
+
+bool RISCVTargetInfo::validateCpuSupports(StringRef Feature) const {
+ // Only allow extensions we have a known bit position for in the
+ // __riscv_feature_bits structure.
+ return -1 != llvm::RISCVISAInfo::getRISCVFeaturesBitPosition(Feature);
+}
diff --git a/clang/lib/Basic/Targets/RISCV.h b/clang/lib/Basic/Targets/RISCV.h
index d5df6344bedc0..626274b8fc437 100644
--- a/clang/lib/Basic/Targets/RISCV.h
+++ b/clang/lib/Basic/Targets/RISCV.h
@@ -126,6 +126,10 @@ class RISCVTargetInfo : public TargetInfo {
std::pair<unsigned, unsigned> hardwareInterferenceSizes() const override {
return std::make_pair(32, 32);
}
+
+ bool supportsCpuSupports() const override { return getTriple().isOSLinux(); }
+ bool supportsCpuInit() const override { return getTriple().isOSLinux(); }
+ bool validateCpuSupports(StringRef Feature) const override;
};
class LLVM_LIBRARY_VISIBILITY RISCV32TargetInfo : public RISCVTargetInfo {
public:
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 2ad62d6ee0bb2..71c947776adf2 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -62,6 +62,8 @@
#include "llvm/Support/MathExtras.h"
#include "llvm/Support/ScopedPrinter.h"
#include "llvm/TargetParser/AArch64TargetParser.h"
+#include "llvm/TargetParser/RISCVISAInfo.h"
+#include "llvm/TargetParser/RISCVTargetParser.h"
#include "llvm/TargetParser/X86TargetParser.h"
#include <optional>
#include <sstream>
@@ -14215,6 +14217,16 @@ Value *CodeGenFunction::EmitAArch64CpuInit() {
return Builder.CreateCall(Func);
}
+Value *CodeGenFunction::EmitRISCVCpuInit() {
+ llvm::FunctionType *FTy = llvm::FunctionType::get(VoidTy, false);
+ llvm::FunctionCallee Func =
+ CGM.CreateRuntimeFunction(FTy, "__init_riscv_feature_bits");
+ auto *CalleeGV = cast<llvm::GlobalValue>(Func.getCallee());
+ CalleeGV->setDSOLocal(true);
+ CalleeGV->setDLLStorageClass(llvm::GlobalValue::DefaultStorageClass);
+ return Builder.CreateCall(Func);
+}
+
Value *CodeGenFunction::EmitX86CpuInit() {
llvm::FunctionType *FTy = llvm::FunctionType::get(VoidTy,
/*Variadic*/ false);
@@ -14267,6 +14279,43 @@ CodeGenFunction::EmitAArch64CpuSupports(ArrayRef<StringRef> FeaturesStrs) {
return Result;
}
+Value *CodeGenFunction::EmitRISCVCpuSupports(const CallExpr *E) {
+
+ const Expr *FeatureExpr = E->getArg(0)->IgnoreParenCasts();
+ StringRef FeatureStr = cast<StringLiteral>(FeatureExpr)->getString();
+ if (!getContext().getTargetInfo().validateCpuSupports(FeatureStr))
+ return Builder.getFalse();
+
+ // Note: We are making an unchecked assumption that the size of the
+ // feature array is >= 1. This holds for any version of compiler-rt
+ // which defines this interface.
+ llvm::ArrayType *ArrayOfInt64Ty =
+ llvm::ArrayType::get(Int64Ty, 1);
+ llvm::Type *StructTy = llvm::StructType::get(Int32Ty, ArrayOfInt64Ty);
+ llvm::Constant *RISCVFeaturesBits =
+ CGM.CreateRuntimeVariable(StructTy, "__riscv_feature_bits");
+ auto *GV = cast<llvm::GlobalValue>(RISCVFeaturesBits);
+ GV->setDSOLocal(true);
+
+ auto LoadFeatureBit = [&](unsigned Index) {
+ // Create GEP then load.
+ Value *IndexVal = llvm::ConstantInt::get(Int32Ty, Index);
+ llvm::Value *GEPIndices[] = {Builder.getInt32(0), Builder.getInt32(1),
+ IndexVal};
+ Value *Ptr =
+ Builder.CreateInBoundsGEP(StructTy, RISCVFeaturesBits, GEPIndices);
+ Value *FeaturesBit =
+ Builder.CreateAlignedLoad(Int64Ty, Ptr, CharUnits::fromQuantity(8));
+ return FeaturesBit;
+ };
+
+ int BitPos = RISCVISAInfo::getRISCVFeaturesBitPosition(FeatureStr);
+ assert(BitPos != -1 && "validation should have rejected this feature");
+ Value *MaskV = Builder.getInt64(1ULL << BitPos);
+ Value *Bitset = Builder.CreateAnd(LoadFeatureBit(0), MaskV);
+ return Builder.CreateICmpEQ(Bitset, MaskV);
+}
+
Value *CodeGenFunction::EmitX86BuiltinExpr(unsigned BuiltinID,
const CallExpr *E) {
if (BuiltinID == Builtin::BI__builtin_cpu_is)
@@ -21728,6 +21777,12 @@ Value *CodeGenFunction::EmitHexagonBuiltinExpr(unsigned BuiltinID,
Value *CodeGenFunction::EmitRISCVBuiltinExpr(unsigned BuiltinID,
const CallExpr *E,
ReturnValueSlot ReturnValue) {
+
+ if (BuiltinID == Builtin::BI__builtin_cpu_supports)
+ return EmitRISCVCpuSupports(E);
+ if (BuiltinID == Builtin::BI__builtin_cpu_init)
+ return EmitRISCVCpuInit();
+
SmallVector<Value *, 4> Ops;
llvm::Type *ResultType = ConvertType(E->getType());
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index d83e38cab8e2d..1ee45a1f1f99d 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -4695,6 +4695,9 @@ class CodeGenFunction : public CodeGenTypeCache {
llvm::Value *EmitRISCVBuiltinExpr(unsigned BuiltinID, const CallExpr *E,
ReturnValueSlot ReturnValue);
+ llvm::Value *EmitRISCVCpuSupports(const CallExpr *E);
+ llvm::Value *EmitRISCVCpuInit();
+
void AddAMDGPUFenceAddressSpaceMMRA(llvm::Instruction *Inst,
const CallExpr *E);
void ProcessOrderScopeAMDGCN(llvm::Value *Order, llvm::Value *Scope,
diff --git a/llvm/include/llvm/TargetParser/RISCVISAInfo.h b/llvm/include/llvm/TargetParser/RISCVISAInfo.h
index d7a08013fa6ac..d71ff174bf0d2 100644
--- a/llvm/include/llvm/TargetParser/RISCVISAInfo.h
+++ b/llvm/include/llvm/TargetParser/RISCVISAInfo.h
@@ -80,6 +80,10 @@ class RISCVISAInfo {
std::set<StringRef> &EnabledFeatureNames,
StringMap<StringRef> &DescMap);
+ /// Return the bit position (in group 0) of __riscv_feature_bits. Returns
+ /// -1 if not supported.
+ static int getRISCVFeaturesBitPosition(StringRef Ext);
+
private:
RISCVISAInfo(unsigned XLen) : XLen(XLen) {}
diff --git a/llvm/lib/TargetParser/RISCVISAInfo.cpp b/llvm/lib/TargetParser/RISCVISAInfo.cpp
index e6ddbb4dc28d5..3ff1b325793e6 100644
--- a/llvm/lib/TargetParser/RISCVISAInfo.cpp
+++ b/llvm/lib/TargetParser/RISCVISAInfo.cpp
@@ -1020,3 +1020,64 @@ std::string RISCVISAInfo::getTargetFeatureForExtension(StringRef Ext) {
return isExperimentalExtension(Name) ? "experimental-" + Name.str()
: Name.str();
}
+
+struct RISCVExtBit {
+ const StringRef ext;
+ uint64_t bitpos;
+};
+
+/// Maps extensions with assigned bit positions within group 0 of
+/// __riscv_features_bits to their respective bit position. At the
+/// moment all extensions are within group 0.
+static RISCVExtBit RISCVGroup0BitPositions[] = {
+ {"a", 0},
+ {"c", 2},
+ {"d", 3},
+ {"f", 5},
+ {"i", 8},
+ {"m", 12},
+ {"v", 21},
+ {"zacas", 26},
+ {"zba", 27},
+ {"zbb", 28},
+ {"zbc", 29},
+ {"zbkb", 30},
+ {"zbkc", 31},
+ {"zbkx", 32},
+ {"zbs", 33},
+ {"zfa", 34},
+ {"zfh", 35},
+ {"zfhmin", 36},
+ {"zicboz", 37},
+ {"zicond", 38},
+ {"zihintntl", 39},
+ {"zihintpause", 40},
+ {"zknd", 41},
+ {"zkne", 42},
+ {"zknh", 43},
+ {"zksed", 44},
+ {"zksh", 45},
+ {"zkt", 46},
+ {"ztso", 47},
+ {"zvbb", 48},
+ {"zvbc", 49},
+ {"zvfh", 50},
+ {"zvfhmin", 51},
+ {"zvkb", 52},
+ {"zvkg", 53},
+ {"zvkned", 54},
+ {"zvknha", 55},
+ {"zvknhb", 56},
+ {"zvksed", 57},
+ {"zvksh", 58},
+ {"zvkt", 59}
+};
+int RISCVISAInfo::getRISCVFeaturesBitPosition(StringRef Ext) {
+ // Note that this code currently accepts mixed case extension names, but
+ // does not handle extension versions at all. That's probably fine because
+ // there's only one extension version in the __riscv_feature_bits vector.
+ for (auto E : RISCVGroup0BitPositions)
+ if (E.ext.equals_insensitive(Ext))
+ return E.bitpos;
+ return -1;
+}
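Illustrative only, not part of the diff: a small sanity check of the new TargetParser helper, assuming a tree with this patch applied. The expected values come straight from the table above.

#include "llvm/TargetParser/RISCVISAInfo.h"
#include <cassert>

static void checkFeatureBitPositions() {
  using llvm::RISCVISAInfo;
  // Assigned group 0 positions from RISCVGroup0BitPositions.
  assert(RISCVISAInfo::getRISCVFeaturesBitPosition("zba") == 27);
  // Mixed case is accepted, per the note in the implementation.
  assert(RISCVISAInfo::getRISCVFeaturesBitPosition("V") == 21);
  // Extensions without an assigned bit report -1; validateCpuSupports rejects
  // them, so __builtin_cpu_supports folds to false.
  assert(RISCVISAInfo::getRISCVFeaturesBitPosition("xventanacondops") == -1);
}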
✅ With the latest revision this PR passed the C/C++ code formatter.
struct RISCVExtBit {
  const StringRef ext;
  uint64_t bitpos;
Does it need uint64_t for bit position? I think the value doesn't exceed 63.
clang/lib/CodeGen/CGBuiltin.cpp
Outdated
#include "llvm/Support/ScopedPrinter.h"
#include "llvm/TargetParser/AArch64TargetParser.h"
#include "llvm/TargetParser/RISCVISAInfo.h"
#include "llvm/TargetParser/RISCVTargetParser.h"
RISCVTargetParser.h can be dropped since it comes from tablegen stuff.
Maybe we could compile the test code with … Actually, I tried this patch locally, and it works well. Here is a build script for compiler-rt, if you want to build it.
/// Maps extensions with assigned bit positions within group 0 of
/// __riscv_features_bits to their respective bit position. At the
/// moment all extensions are within group 0.
static RISCVExtBit RISCVGroup0BitPositions[] = {
One reason we chose the tablegen approach is to keep everything related to extensions (in this case, groupID/bitPos) inside RISCVFeatures.td.
How about landing #94440 now and replacing this struct with a tablegen-generated .inc file? Or maybe we could replace it after the 19.x release?
IMO, adding the additional information to RISCVFeatures.td just complicates things.
}

struct RISCVExtBit {
  const StringRef ext;
Use StringLiteral.
/// Maps extensions with assigned bit positions within group 0 of
/// __riscv_features_bits to their respective bit position. At the
/// moment all extensions are within group 0.
static RISCVExtBit RISCVGroup0BitPositions[] = {
Make this constexpr to make sure the strlen for the strings actually happens at compile time.
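A sketch of how the two suggestions above (StringLiteral and constexpr) could be combined if applied to the table from the diff; this is an assumed rewrite, not code from the PR.

#include "llvm/ADT/StringRef.h" // for llvm::StringLiteral

struct RISCVExtBit {
  llvm::StringLiteral ext; // length computed at compile time
  unsigned bitpos;
};

static constexpr RISCVExtBit RISCVGroup0BitPositions[] = {
    {"a", 0}, {"c", 2}, {"d", 3} /* ... remainder of the group 0 table ... */};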
#endif

#if !__has_builtin(__builtin_cpu_supports)
# if defined(X86) || defined(RISCV)
Does ARM support __builtin_cpu_supports?
LGTM. Works fine with the current compiler-rt __init_riscv_feature_bits / __riscv_feature_bits and qemu with hwprobe support (the situation where __builtin_cpu_init is called manually).
I've gone ahead and merged this into main. We have missed the branch creation, so without further action this will not be included in 19.x. We need to ensure the constructor change for compiler-rt lands, and then backport them together if we choose to.
This implements the __builtin_cpu_init and __builtin_cpu_supports builtin routines based on the compiler runtime changes in #85790.
This is inspired by #85786. Major changes are a) a restriction in scope to only the builtins (which have a much narrower user interface), and the avoidance of false generality. This change deliberately only handles group 0 extensions (which happen to be all defined ones today), and avoids the tblgen changes from that review.
I don't have an environment in which I can actually test this, but @BeMg has been kind enough to report that this appears to work as expected.
Before this can make it into a release, we need a change such as #99958. The gcc docs claim that cpu_support can be called by "normal" code without calling the cpu_init routine because the init routine will have been called by a high priority constructor. Our current compiler-rt mechanism does not do this.
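For illustration only (an assumption about the general shape of a change like #99958, not its actual contents): compiler-rt could register a high priority constructor so that __riscv_feature_bits is populated before ordinary user code calls __builtin_cpu_supports.

extern "C" void __init_riscv_feature_bits(void); // as declared by this patch

// 101 is the first constructor priority not reserved for the implementation;
// the exact priority used here is an assumption for this sketch.
__attribute__((constructor(101))) static void initRISCVFeatureBitsEarly(void) {
  __init_riscv_feature_bits();
}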