-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[ELF] handle new NVIDIA GPU variants. #151604
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-llvm-binary-utilities Author: Artem Belevich (Artem-B) ChangesFull diff: https://github.com/llvm/llvm-project/pull/151604.diff 3 Files Affected:
diff --git a/llvm/include/llvm/BinaryFormat/ELF.h b/llvm/include/llvm/BinaryFormat/ELF.h
index ad35d7f05d5da..749971e354f66 100644
--- a/llvm/include/llvm/BinaryFormat/ELF.h
+++ b/llvm/include/llvm/BinaryFormat/ELF.h
@@ -973,7 +973,10 @@ enum : unsigned {
// SM based processor values.
EF_CUDA_SM100 = 0x6400,
+ EF_CUDA_SM101 = 0x6500,
+ EF_CUDA_SM103 = 0x6700,
EF_CUDA_SM120 = 0x7800,
+ EF_CUDA_SM121 = 0x7900,
// Set when using an accelerator variant like sm_100a.
EF_CUDA_ACCELERATORS = 0x8,
diff --git a/llvm/lib/Object/ELFObjectFile.cpp b/llvm/lib/Object/ELFObjectFile.cpp
index 0919c6aad74f2..aff047c297cc2 100644
--- a/llvm/lib/Object/ELFObjectFile.cpp
+++ b/llvm/lib/Object/ELFObjectFile.cpp
@@ -688,11 +688,20 @@ StringRef ELFObjectFileBase::getNVPTXCPUName() const {
case ELF::EF_CUDA_SM100:
return getPlatformFlags() & ELF::EF_CUDA_ACCELERATORS ? "sm_100a"
: "sm_100";
+ case ELF::EF_CUDA_SM101:
+ return getPlatformFlags() & ELF::EF_CUDA_ACCELERATORS ? "sm_101a"
+ : "sm_101";
+ case ELF::EF_CUDA_SM103:
+ return getPlatformFlags() & ELF::EF_CUDA_ACCELERATORS ? "sm_103a"
+ : "sm_103";
// Rubin architecture.
case ELF::EF_CUDA_SM120:
return getPlatformFlags() & ELF::EF_CUDA_ACCELERATORS ? "sm_120a"
: "sm_120";
+ case ELF::EF_CUDA_SM121:
+ return getPlatformFlags() & ELF::EF_CUDA_ACCELERATORS ? "sm_121a"
+ : "sm_121";
default:
llvm_unreachable("Unknown EF_CUDA_SM value");
}
diff --git a/llvm/tools/llvm-readobj/ELFDumper.cpp b/llvm/tools/llvm-readobj/ELFDumper.cpp
index 94ce38605f5c9..1321d594416c5 100644
--- a/llvm/tools/llvm-readobj/ELFDumper.cpp
+++ b/llvm/tools/llvm-readobj/ELFDumper.cpp
@@ -1683,7 +1683,9 @@ const EnumEntry<unsigned> ElfHeaderNVPTXFlags[] = {
ENUM_ENT(EF_CUDA_SM75, "sm_75"), ENUM_ENT(EF_CUDA_SM80, "sm_80"),
ENUM_ENT(EF_CUDA_SM86, "sm_86"), ENUM_ENT(EF_CUDA_SM87, "sm_87"),
ENUM_ENT(EF_CUDA_SM89, "sm_89"), ENUM_ENT(EF_CUDA_SM90, "sm_90"),
- ENUM_ENT(EF_CUDA_SM100, "sm_100"), ENUM_ENT(EF_CUDA_SM120, "sm_120"),
+ ENUM_ENT(EF_CUDA_SM100, "sm_100"), ENUM_ENT(EF_CUDA_SM101, "sm_101"),
+ ENUM_ENT(EF_CUDA_SM103, "sm_103"), ENUM_ENT(EF_CUDA_SM120, "sm_120"),
+ ENUM_ENT(EF_CUDA_SM121, "sm_121"),
};
const EnumEntry<unsigned> ElfHeaderRISCVFlags[] = {
|
jhuber6
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG, thanks.
|
@jhuber6 The patch does not work correctly. It appears that those |
|
Hm, that's surprising, it's something in |
jhuber6
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I might've left a bug that accidentally worked.
I think this here needs to use the correct bitmask
| unsigned(ELF::EF_CUDA_SM)); |
Yup. That was it. Fixed now. |
|
@jhuber6 Fun fact: the Either that, or I'm missing something here. @AlexMaclean would you happen to have an idea whether the cubin compiled for the Is the idea that 'sm_101f' allows (subset of) instructions from |
I've learned to stop questioning things when it comes to NVIDIA's binary decisions. Add it to such gems like the PTX |
|
To be fair, it heavily depends on particular team at NVIDIA. Anecdotally, the more exposed to the open source (or other kinds of external influence) their team is, the better things tend to work. NVIDIA's teams working on NVPTX, CCCL, and OpenXLA are great to work with. Oddities tend to surface in NVIDIA's black boxes that they never intended for external tinkering (e.g. nvcc front-end, binary tools, some binary-only libraries). The problem is that there's usually no good communication channel to the owners of the components with those issues -- there's still no public bug tracker of any kind for CUDA SDK components. |
|
I've forwarded the question along internally but this isn't an area I have much familiarity with. |
No description provided.