-
Notifications
You must be signed in to change notification settings - Fork 15k
[Offload] Add device UID #164391
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[Offload] Add device UID #164391
Changes from all commits
1e3bb90
7f8d296
957ca31
29e5560
7c753ef
34dd9dd
04ba6a5
30a3437
fff258a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -791,6 +791,9 @@ struct GenericDeviceTy : public DeviceAllocatorTy { | |
| /// this id is not unique between different plugins; they may overlap. | ||
| int32_t getDeviceId() const { return DeviceId; } | ||
|
|
||
| /// Get the unique identifier of the device. | ||
| const char *getDeviceUid() const { return DeviceUid.c_str(); } | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why is DeviceUid stored as a field in GenericDeviceTy? liboffload already stores the info tree (which contains copies of all the strings) in the device struct. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, but the info tree needs to be populated, which I do in obtainInfo() for the UID. And doesn't it make more sense to set the UID on creation of the object and not later while calling some method? |
||
|
|
||
| /// Set the context of the device if needed, before calling device-specific | ||
| /// functions. Plugins may implement this function as a no-op if not needed. | ||
| virtual Error setContext() = 0; | ||
|
|
@@ -989,9 +992,12 @@ struct GenericDeviceTy : public DeviceAllocatorTy { | |
| Error syncEvent(void *EventPtr); | ||
| virtual Error syncEventImpl(void *EventPtr) = 0; | ||
|
|
||
| /// Obtain information about the device. | ||
| Expected<InfoTreeNode> obtainInfo(); | ||
| virtual Expected<InfoTreeNode> obtainInfoImpl() = 0; | ||
|
|
||
| /// Print information about the device. | ||
| Error printInfo(); | ||
| virtual Expected<InfoTreeNode> obtainInfoImpl() = 0; | ||
|
|
||
| /// Return true if the device has work that is either queued or currently | ||
| /// running | ||
|
|
@@ -1204,6 +1210,14 @@ struct GenericDeviceTy : public DeviceAllocatorTy { | |
| /// global device id and is not the device id visible to the OpenMP user. | ||
| const int32_t DeviceId; | ||
|
|
||
| /// The unique identifier of the device. | ||
| /// Per default, the unique identifier of the device is set to the device id, | ||
| /// combined with the plugin name, since the offload device id may overlap | ||
| /// between different plugins. | ||
| std::string DeviceUid; | ||
| /// Construct the device UID from the vendor (U)UID. | ||
| void setDeviceUidFromVendorUid(StringRef VendorUid); | ||
|
Comment on lines
+1217
to
+1219
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we need a global for this? Figured we can just recompute it whenever we call this function, not like it's on a fast path. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hm, but who would own the string pointed to by the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The UUID is supposed to be a constant property of the device. Why do we need to set it and we can just return a std::string. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I just added this set function so that it can be consistently called from the initImpl method of the plugins. Ok, I can also return std::string and thus remove the attribute There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
But then, I have to do the hsa/cuda call every time. That doesn't seem appropriate? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
(and it's protected) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why do we need to call it in the initimpl, do we need to use this before the devices are loaded? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I mean, I get the UUID in the initImpl, because that's where all the other info-retrieving calls to the vendor libs are. |
||
|
|
||
| /// The default grid values used for this device. | ||
| llvm::omp::GV GridValues; | ||
|
|
||
|
|
@@ -1280,6 +1294,9 @@ struct GenericPluginTy { | |
|
|
||
| return *Devices[DeviceId]; | ||
| } | ||
| const GenericDeviceTy &getDevice(int32_t DeviceId) const { | ||
| return const_cast<GenericPluginTy *>(this)->getDevice(DeviceId); | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd prefer to repeat the implementation as above (which are three lines) than using the const cast. |
||
| } | ||
|
|
||
| /// Get the number of active devices. | ||
| int32_t getNumDevices() const { return NumDevices; } | ||
|
|
@@ -1290,6 +1307,14 @@ struct GenericPluginTy { | |
| return UserDeviceIds.at(DeviceId); | ||
| } | ||
|
|
||
| /// Get the UID for the given device. | ||
| const char *getDeviceUid(int32_t DeviceId) const { | ||
| return getDevice(DeviceId).getDeviceUid(); | ||
| } | ||
|
|
||
| /// Get the UID for the host device. | ||
| static constexpr const char *getHostDeviceUid() { return "HOST"; } | ||
|
|
||
| /// Get the ELF code to recognize the binary image of this plugin. | ||
| virtual uint16_t getMagicElfBits() const = 0; | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -715,6 +715,9 @@ GenericDeviceTy::GenericDeviceTy(GenericPluginTy &Plugin, int32_t DeviceId, | |
| DeviceId(DeviceId), GridValues(OMPGridValues), | ||
| PeerAccesses(NumDevices, PeerAccessState::PENDING), PeerAccessesLock(), | ||
| PinnedAllocs(*this), RPCServer(nullptr) { | ||
| DeviceUid = std::string(Plugin.getName()) + "-" + | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This does not seem like a UUID, what does OpenMP say about this? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That it's not a uuid as in "universal" uid (that's why I only call it "uid", too). Specifically, OpenMP 6.0, Section 9.1 says:
So it's only an identifier unique to a device on a given system. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well we should probably use UUIDs anyway if possible. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I do. This is just the default implementation. It is overwritten in the amdgpu rtl, for example. |
||
| std::to_string(static_cast<uint64_t>(DeviceId)); | ||
|
|
||
| #ifdef OMPT_SUPPORT | ||
| OmptInitialized.store(false); | ||
| // Bind the callbacks to this device's member functions | ||
|
|
@@ -1524,15 +1527,22 @@ Error GenericDeviceTy::enqueueHostCall(void (*Callback)(void *), void *UserData, | |
| return Err; | ||
| } | ||
|
|
||
| Expected<InfoTreeNode> GenericDeviceTy::obtainInfo() { | ||
| auto InfoOrErr = obtainInfoImpl(); | ||
| if (InfoOrErr) | ||
| InfoOrErr->add("UID", getDeviceUid(), "", DeviceInfo::UID); | ||
| return InfoOrErr; | ||
| } | ||
|
|
||
| Error GenericDeviceTy::printInfo() { | ||
| auto Info = obtainInfoImpl(); | ||
| auto InfoOrErr = obtainInfo(); | ||
|
|
||
| // Get the vendor-specific info entries describing the device properties. | ||
| if (auto Err = Info.takeError()) | ||
| if (auto Err = InfoOrErr.takeError()) | ||
| return Err; | ||
|
|
||
| // Print all info entries. | ||
| Info->print(); | ||
| InfoOrErr->print(); | ||
|
|
||
| return Plugin::success(); | ||
| } | ||
|
|
@@ -1603,6 +1613,10 @@ Expected<bool> GenericDeviceTy::isAccessiblePtr(const void *Ptr, size_t Size) { | |
| return isAccessiblePtrImpl(Ptr, Size); | ||
| } | ||
|
|
||
| void GenericDeviceTy::setDeviceUidFromVendorUid(StringRef VendorUid) { | ||
| DeviceUid = std::string(Plugin.getName()) + "-" + std::string(VendorUid); | ||
| } | ||
|
|
||
| Error GenericPluginTy::init() { | ||
| if (Initialized) | ||
| return Plugin::success(); | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably call this UUID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but it's not guaranteed to be a UUID (and doesn't need to be).
I don't think that all vendors can agree on UUIDs for their stuff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a UUID, as it's not necessarily what is understood with it. UUID is a 128-bit identifier, while the UID is vendor-specific string. E.g., ROCm has some thingy with
GPU-and a hex number.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's bothering me a little bit, btw, is that my formatting of the Nvidia (U)UID is different from the formatting in
nvidia-smi. But I don't know if it's smart to try to copy that format without a clear documentation of it. So ig it's better to just be consistent and to tell people to look up the UIDs withllvm-offload-device-info