Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion clang/lib/Driver/Driver.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -6440,7 +6440,8 @@ const ToolChain &Driver::getToolChain(const ArgList &Args,
TC = std::make_unique<toolchains::NVPTXToolChain>(*this, Target, Args);
break;
case llvm::Triple::AMDHSA:
TC = std::make_unique<toolchains::ROCMToolChain>(*this, Target, Args);
TC = std::make_unique<toolchains::ROCMToolChain>(*this, Target, Args,
Target.isOSWindows());
Comment on lines +6443 to +6444
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to pass something implied by the triple when we already have the triple? Also, this is for directly targeting via --target=amdgcn-amd-amdhsa, which isn't Windows.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason we need to pass this is primarily/solely because when we create the Offloading Device ToolChain (in "Driver::getOffloadingDeviceToolChain()", we pass the Device's triple to the toolchain constructor as the target triple; which as you mentioned will not be Windows. Then the Device side toolchain, has it's own RocmInstallationDetector, which will add the linux paths. And I need a way to tell the RocmInstallationDetector, on the device side ToolChain, that the Host TC is actually Windows.

I tried a couple of different approaches to handle this case, but this is the least intrusive so far. The RocmInstallationDetector sets the search paths in its constructor, and all of the ToolChains that have a Rocm Detector call its constructor in their respective initialize list in their constructors. I need a way to tell that detector instance (for a device side toolchain) that it's Host ToolChain is windows and to not add the linux search paths.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to wrap my head around this more. We create multiple toolchains for offloading, the host toolchain and then the additional ones for offloading. All of those toolchains inherit from Generic_GCC in this case which initializes the detector. For the host toolchain this will just be the normal host triple, but for the GPU it should be the GPU target. Since we have the triple in the installation detector, we should just be able to change behavior if the OS is windows? We have the triple argument for ROCm we just don't use it, what's stopping you from just saving that triple inside the class and using it when we do our detection?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"I need to wrap my head around this more"
Yup, the class hierarchy we have created for our tool chains is ... frustrating, for this case. And I've tried a few different variations just to be able to tell the Device's ToolChain's RocmInstallationDetector to hold off on adding these search paths.

"what's stopping you from just saving that triple inside the class and using it when we do our detection"

That was one of my attempts. I first tried just adding it to the RocmInstallationDetector (RID) class. But because we use the LazyDetector template, I couldn't change the RID constructor without changing the LazyDetector template, and I didn't want to mess around with any of the other Detectors.

So, then I tried just adding a Triple member to the ROCMToolChain class for the Host Triple, and initializing it in its constructor. Which works. But it is essentially ends up being the same (or nearly the same) as just passing a flag, and doing the "work" in the ROCMToolChain constructor. If I use a member Triple for Host, I would still need to check it in the ROCMToolChain ctor, and tell its RID that the Host is Windows, before having the RID generate the search paths.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RID already has the host triple available I thought

RocmInstallationDetector(const Driver &D, const llvm::Triple &HostTriple,
. Is the issue the fact that we probably make one for the GPU toolchain? If it's an unrecognized triple it should probably just give up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we call it "HostTriple", but in the case for a Device side ToolChain, the triple passed to the TC's ctor (which is the same triple passed to the RID ctor) is going to be the device triple, which is always (I think) amdgcn-amd-amdhsa, or at least definitely not Windows. We can use the TC's triple for removing the linux search paths for Host side. But the problem is that this doesn't work when we create the Device TC, and it's RID.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually require the ROCm installation from the device's TC? I guess it'd be used when making the device interface? The HIPAMDToolChain has a reference to the host toolchain, so I'd argue that we should use that one's ROCm installation and never use the device one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhuber6 So I spent some time trying to rework this to do just what you suggested (have the HIPAMDToolChain, just use the HostTC's RocmInstallation), because it is a good idea :-) But ... I ran into design issues. The member "HostTC" of HIPAMDToolChain, is a "const ToolChain &". And the "ToolChain" class does not have a "RocmInstallation" member, it is a member of the child class "Generic_GCC". We could cast "HostTC" reference to a Generic_GCC object reference (just in the HIPAMDToolClass), but that solution doesn't feel very pretty either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do stuff like this in a ton of places,

clang/lib/Driver/ToolChains/Cuda.cpp
387:      static_cast<const toolchains::NVPTXToolChain &>(getToolChain());
533:      static_cast<const toolchains::CudaToolChain &>(getToolChain());
584:      static_cast<const toolchains::NVPTXToolChain &>(getToolChain());

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ... ok. :-)

break;
case llvm::Triple::AMDPAL:
case llvm::Triple::Mesa3D:
Expand Down
32 changes: 17 additions & 15 deletions clang/lib/Driver/ToolChains/AMDGPU.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -306,15 +306,17 @@ RocmInstallationDetector::getInstallationPathCandidates() {
LatestVer = Ver;
}
}
if (!LatestROCm.empty())
ROCmSearchDirs.emplace_back(D.SysRoot + "/opt/" + LatestROCm,
/*StrictChecking=*/true);

ROCmSearchDirs.emplace_back(D.SysRoot + "/usr/local",
/*StrictChecking=*/true);
ROCmSearchDirs.emplace_back(D.SysRoot + "/usr",
/*StrictChecking=*/true);
if (!isHostWindows()) {
if (!LatestROCm.empty())
ROCmSearchDirs.emplace_back(D.SysRoot + "/opt/" + LatestROCm,
/*StrictChecking=*/true);

ROCmSearchDirs.emplace_back(D.SysRoot + "/usr/local",
/*StrictChecking=*/true);
ROCmSearchDirs.emplace_back(D.SysRoot + "/usr",
/*StrictChecking=*/true);
}
DoPrintROCmSearchDirs();
return ROCmSearchDirs;
}
Expand Down Expand Up @@ -375,11 +377,6 @@ RocmInstallationDetector::RocmInstallationDetector(
Twine(DefaultVersionMinor) + "." + VersionPatch)
.str();
}

if (DetectHIPRuntime)
detectHIPRuntime();
if (DetectDeviceLib)
detectDeviceLibrary();
}

void RocmInstallationDetector::detectDeviceLibrary() {
Expand Down Expand Up @@ -699,10 +696,12 @@ void amdgpu::getAMDGPUTargetFeatures(const Driver &D,

/// AMDGPU Toolchain
AMDGPUToolChain::AMDGPUToolChain(const Driver &D, const llvm::Triple &Triple,
const ArgList &Args)
const ArgList &Args, bool isHostTCMSVC)
: Generic_ELF(D, Triple, Args),
OptionsDefault(
{{options::OPT_O, "3"}, {options::OPT_cl_std_EQ, "CL1.2"}}) {
if (!isHostTCMSVC)
RocmInstallation->init();
// Check code object version options. Emit warnings for legacy options
// and errors for the last invalid code object version options.
// It is done here to avoid repeated warning or error messages for
Expand Down Expand Up @@ -835,8 +834,11 @@ bool AMDGPUToolChain::isWave64(const llvm::opt::ArgList &DriverArgs,

/// ROCM Toolchain
ROCMToolChain::ROCMToolChain(const Driver &D, const llvm::Triple &Triple,
const ArgList &Args)
: AMDGPUToolChain(D, Triple, Args) {
const ArgList &Args, bool isHostTCMSVC)
: AMDGPUToolChain(D, Triple, Args, isHostTCMSVC) {
RocmInstallation->setHostWindows(isHostTCMSVC);
if (isHostTCMSVC)
RocmInstallation->init(true, false);
RocmInstallation->detectDeviceLibrary();
}

Expand Down
4 changes: 2 additions & 2 deletions clang/lib/Driver/ToolChains/AMDGPU.h
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,7 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUToolChain : public Generic_ELF {

public:
AMDGPUToolChain(const Driver &D, const llvm::Triple &Triple,
const llvm::opt::ArgList &Args);
const llvm::opt::ArgList &Args, bool isHostTCMSVC = false);
unsigned GetDefaultDwarfVersion() const override { return 5; }

bool IsMathErrnoDefault() const override { return false; }
Expand Down Expand Up @@ -135,7 +135,7 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUToolChain : public Generic_ELF {
class LLVM_LIBRARY_VISIBILITY ROCMToolChain : public AMDGPUToolChain {
public:
ROCMToolChain(const Driver &D, const llvm::Triple &Triple,
const llvm::opt::ArgList &Args);
const llvm::opt::ArgList &Args, bool isHostTCMSVC);
void
addClangTargetOptions(const llvm::opt::ArgList &DriverArgs,
llvm::opt::ArgStringList &CC1Args,
Expand Down
3 changes: 2 additions & 1 deletion clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,8 @@ AMDGPUOpenMPToolChain::AMDGPUOpenMPToolChain(const Driver &D,
const llvm::Triple &Triple,
const ToolChain &HostTC,
const ArgList &Args)
: ROCMToolChain(D, Triple, Args), HostTC(HostTC) {
: ROCMToolChain(D, Triple, Args, HostTC.getTriple().isOSWindows()),
HostTC(HostTC) {
// Lookup binaries into the driver directory, this is used to
// discover the 'amdgpu-arch' executable.
getProgramPaths().push_back(getDriver().Dir);
Expand Down
3 changes: 3 additions & 0 deletions clang/lib/Driver/ToolChains/Gnu.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3060,6 +3060,9 @@ Generic_GCC::Generic_GCC(const Driver &D, const llvm::Triple &Triple,
: ToolChain(D, Triple, Args), GCCInstallation(D),
CudaInstallation(D, Triple, Args), RocmInstallation(D, Triple, Args) {
getProgramPaths().push_back(getDriver().Dir);
llvm::Triple TargetTriple(D.getTargetTriple());
if (!TargetTriple.isOSWindows())
RocmInstallation->init();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RocmInstallation should handle Windows, even if it fails, rather than just masking it here and elsewhere.

}

Generic_GCC::~Generic_GCC() {}
Expand Down
9 changes: 8 additions & 1 deletion clang/lib/Driver/ToolChains/HIPAMD.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -213,7 +213,14 @@ void AMDGCN::Linker::ConstructJob(Compilation &C, const JobAction &JA,

HIPAMDToolChain::HIPAMDToolChain(const Driver &D, const llvm::Triple &Triple,
const ToolChain &HostTC, const ArgList &Args)
: ROCMToolChain(D, Triple, Args), HostTC(HostTC) {
: ROCMToolChain(D, Triple, Args, HostTC.getTriple().isOSWindows()),
HostTC(HostTC) {
if (HostTC.getTriple().isWindowsMSVCEnvironment()) {
RocmInstallation->setHostWindows(true);
}

RocmInstallation->init(true, false);

// Lookup binaries into the driver directory, this is used to
// discover the clang-offload-bundler executable.
getProgramPaths().push_back(getDriver().Dir);
Expand Down
6 changes: 6 additions & 0 deletions clang/lib/Driver/ToolChains/MSVC.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -425,6 +425,12 @@ MSVCToolChain::MSVCToolChain(const Driver &D, const llvm::Triple &Triple,
const ArgList &Args)
: ToolChain(D, Triple, Args), CudaInstallation(D, Triple, Args),
RocmInstallation(D, Triple, Args) {

// Tell the ROCm installation detector that Host is Windows before trying to
// find HIPRT or Device Libs
RocmInstallation->setHostWindows(true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Toolchain constructor should not imply host

RocmInstallation->init();

getProgramPaths().push_back(getDriver().Dir);

std::optional<llvm::StringRef> VCToolsDir, VCToolsVersion;
Expand Down
12 changes: 12 additions & 0 deletions clang/lib/Driver/ToolChains/ROCm.h
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,7 @@ class RocmInstallationDetector {
bool HasHIPStdParLibrary = false;
bool HasRocThrustLibrary = false;
bool HasRocPrimLibrary = false;
bool IsHostMSVC = false;

// Default version if not detected or specified.
const unsigned DefaultVersionMajor = 3;
Expand Down Expand Up @@ -193,6 +194,10 @@ class RocmInstallationDetector {
/// Check whether we detected a valid HIP STDPAR Acceleration library.
bool hasHIPStdParLibrary() const { return HasHIPStdParLibrary; }

/// Check whether the target triple is for Windows.
bool isHostWindows() const { return IsHostMSVC; }
void setHostWindows(bool val) { IsHostMSVC = val; }

/// Print information about the detected ROCm installation.
void print(raw_ostream &OS) const;

Expand Down Expand Up @@ -272,6 +277,13 @@ class RocmInstallationDetector {
return Loc->second;
}

void init(bool DetectHIPRuntime = true, bool DetectDeviceLib = false) {
if (DetectHIPRuntime)
detectHIPRuntime();
if (DetectDeviceLib)
detectDeviceLibrary();
}

void AddHIPIncludeArgs(const llvm::opt::ArgList &DriverArgs,
llvm::opt::ArgStringList &CC1Args) const;

Expand Down
9 changes: 9 additions & 0 deletions clang/test/Driver/rocm-detect-linux.hip
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
// REQUIRES: target={{.*-linux.*}}

// Test to ensure that on Windows, we do not include linux sesrch paths
// RUN: %clang -### -nogpulib -nogpuinc \
// RUN: --print-rocm-search-dirs %s 2>&1 \
// RUN: | FileCheck %s

// CHECK: ROCm installation search path: {{/usr/local}}
// CHECK: ROCm installation search path: {{/usr}}
9 changes: 9 additions & 0 deletions clang/test/Driver/rocm-detect-windows.hip
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
// REQUIRES: target={{.*-windows.*}}

// Test to ensure that on Windows, we do not include linux sesrch paths
// RUN: %clang -### -nogpulib -nogpuinc \
// RUN: --print-rocm-search-dirs %s 2>&1 \
// RUN: | FileCheck %s

// CHECK-NOT: ROCm installation search path: {{/usr/local}}
// CHECK-NOT: ROCm installation search path: {{/usr}}
4 changes: 2 additions & 2 deletions clang/test/Driver/rocm-detect.hip
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,7 @@
// RUN: %t/rocm-spack/llvm-amdgpu-4.0.0-ieagcs7inf7runpyfvepqkurasoglq4z/bin/clang -### -no-canonical-prefixes -v \
// RUN: -resource-dir=%t/rocm-spack/llvm-amdgpu-4.0.0-ieagcs7inf7runpyfvepqkurasoglq4z/lib/clang \
// RUN: -target x86_64-linux-gnu --cuda-gpu-arch=gfx900 --print-rocm-search-dirs %s 2>&1 \
// RUN: | FileCheck -check-prefixes=SPACK %s
// RUN: | FileCheck -check-prefixes=SPACK %s --dump-input always --dump-input-context 100

// Test SPACK installation with multiple hip and rocm-device-libs packages of the same
// ROCm release. --hip-path and --rocm-device-lib-path can be used to specify them.
Expand Down Expand Up @@ -147,12 +147,12 @@
// ROCM-REL: ROCm installation search path: {{.*}}/opt/rocm-3.10.0

// SPACK: InstalledDir: [[DIR:.*]]/llvm-amdgpu-4.0.0-ieagcs7inf7runpyfvepqkurasoglq4z/bin
// SPACK: Found HIP installation: [[DIR]]/hip-4.0.0-5f63slrursbrvfe2txrrjkynbsywsob5, version 4.0.20214-a2917cd
// SPACK: ROCm installation search path (Spack 4.0.0): [[DIR]]
// SPACK: ROCm installation search path: [[CLANG:.*]]
// SPACK: ROCm installation search path: [[DIR]]/llvm-amdgpu-4.0.0-ieagcs7inf7runpyfvepqkurasoglq4z
// SPACK: ROCm installation search path: [[DIR]]/llvm-amdgpu-4.0.0-ieagcs7inf7runpyfvepqkurasoglq4z/lib/clang
// SPACK: ROCm installation search path: /opt/rocm
// SPACK: Found HIP installation: [[DIR]]/hip-4.0.0-5f63slrursbrvfe2txrrjkynbsywsob5, version 4.0.20214-a2917cd
// SPACK: "-triple" "amdgcn-amd-amdhsa"
// SPACK-SAME: "-mlink-builtin-bitcode" "[[DIR]]/llvm-amdgpu-4.0.0-ieagcs7inf7runpyfvepqkurasoglq4z/amdgcn/bitcode/hip.bc"
// SPACK-SAME: "-idirafter" "[[DIR]]/hip-4.0.0-5f63slrursbrvfe2txrrjkynbsywsob5/include"
Expand Down
Loading