Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion clang/test/Driver/linker-wrapper-sycl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@
// CHK-SPLIT-CMDS: "{{.*}}spirv-to-ir-wrapper" {{.*}} -o [[FIRSTLLVMLINKIN:.*]].bc --llvm-spirv-opts --spirv-preserve-auxdata --spirv-target-env=SPV-IR --spirv-builtin-format=global
// CHK-SPLIT-CMDS-NEXT: "{{.*}}llvm-link" [[FIRSTLLVMLINKIN]].bc -o [[FIRSTLLVMLINKOUT:.*]].bc --suppress-warnings
// CHK-SPLIT-CMDS-NEXT: "{{.*}}llvm-link" -only-needed [[FIRSTLLVMLINKOUT]].bc {{.*}}.bc -o [[SECONDLLVMLINKOUT:.*]].bc --suppress-warnings
// CHK-SPLIT-CMDS-NEXT: sycl-module-split: input: [[SECONDLLVMLINKOUT]].bc, output: [[SYCLMODULESPLITOUT:.*]].bc
// CHK-SPLIT-CMDS-NEXT: sycl-post-link-library: input: [[SECONDLLVMLINKOUT]].bc, output: [[SYCLMODULESPLITOUT:.*]].bc, split_mode: auto, output_assembly: false, output_prefix:
// CHK-SPLIT-CMDS-NEXT: "{{.*}}llvm-spirv"{{.*}} LLVM_SPIRV_OPTIONS -o [[SPIRVOUT:.*]].spv [[SYCLMODULESPLITOUT]].bc
// LLVM-SPIRV is not called in dry-run
// CHK-SPLIT-CMDS-NEXT: offload-wrapper: input: [[SPIRVOUT]].spv, output: [[WRAPPEROUT:.*]].bc
Expand Down
136 changes: 72 additions & 64 deletions clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -709,7 +709,7 @@ getTripleBasedSYCLPostLinkOpts(const ArgList &Args,
/// 'Args' encompasses all arguments required for linking and wrapping device
/// code and will be parsed to generate options required to be passed into the
/// sycl-post-link tool.
static Expected<std::vector<module_split::SplitModule>>
static Expected<std::vector<module_split::ProcessedModule>>
runSYCLPostLinkTool(ArrayRef<StringRef> InputFiles, const ArgList &Args) {
Expected<std::string> SYCLPostLinkPath =
findProgram("sycl-post-link", {getMainExecutable("sycl-post-link")});
Expand Down Expand Up @@ -745,76 +745,82 @@ runSYCLPostLinkTool(ArrayRef<StringRef> InputFiles, const ArgList &Args) {
if (!ImageFileOrErr)
return ImageFileOrErr.takeError();

std::vector Modules = {module_split::SplitModule(
std::vector Modules = {module_split::ProcessedModule(
*ImageFileOrErr, util::PropertySetRegistry(), "")};
return Modules;
}

return llvm::module_split::parseSplitModulesFromFile(*TempFileOrErr);
return llvm::module_split::parseProcessedModulesFromFile(*TempFileOrErr);
}

/// Invokes SYCL Split library for SYCL offloading.
/// Prints the message for DryRun and Verbose modes. The message contains of
/// input, output and settings.
void logSYCLLibraryInvocation(
ArrayRef<StringRef> InputFiles,
const std::vector<module_split::ProcessedModule> &Modules,
const module_split::ModuleProcessingSettings &Settings) {
auto InputFilesStr = llvm::join(InputFiles.begin(), InputFiles.end(), ",");
SmallString<128> SplitOutputFilesStr;
for (size_t I = 0, E = Modules.size(); I != E; ++I) {
if (I > 0)
SplitOutputFilesStr += ',';

SplitOutputFilesStr += Modules[I].ModuleFilePath;
}

errs() << formatv("sycl-post-link-library: input: {0}, output: {1}, {2}\n",
InputFilesStr, SplitOutputFilesStr,
convertProcessingSettingsToString(Settings));
}

/// Invokes SYCL processing library for SYCL offload finalization.
///
/// \param InputFiles the list of input LLVM IR files.
/// \param Args Encompasses all arguments for linking and wrapping device code.
/// It will be parsed to generate options required to be passed to SYCL split
/// It will be parsed to generate options required to be passed to SYCL
/// library.
/// \param Mode The splitting mode.
/// \returns The vector of split modules.
static Expected<std::vector<module_split::SplitModule>>
runSYCLSplitLibrary(ArrayRef<StringRef> InputFiles, const ArgList &Args,
module_split::IRSplitMode Mode) {
std::vector<module_split::SplitModule> SplitModules;
/// \returns The vector of processed modules.
static Expected<std::vector<module_split::ProcessedModule>>
runSYCLOffloadFinalize(ArrayRef<StringRef> InputFiles, const ArgList &Args,
module_split::IRSplitMode Mode) {
std::vector<module_split::ProcessedModule> OutputModules;
llvm::module_split::ModuleProcessingSettings Settings;
Settings.Mode = Mode;
Settings.OutputPrefix = "";
if (DryRun) {
auto OutputFileOrErr = createOutputFile(
sys::path::filename(ExecutableName) + ".sycl.split.image", "bc");
if (!OutputFileOrErr)
return OutputFileOrErr.takeError();

StringRef OutputFilePath = *OutputFileOrErr;
auto InputFilesStr = llvm::join(InputFiles.begin(), InputFiles.end(), ",");
errs() << formatv("sycl-module-split: input: {0}, output: {1}\n",
InputFilesStr, OutputFilePath);
SplitModules.emplace_back(OutputFilePath, util::PropertySetRegistry(), "");
return SplitModules;
OutputModules.emplace_back(*OutputFileOrErr, util::PropertySetRegistry(),
"");
logSYCLLibraryInvocation(InputFiles, OutputModules, Settings);
return OutputModules;
}

llvm::module_split::ModuleSplitterSettings Settings;
Settings.Mode = Mode;
Settings.OutputPrefix = "";

for (StringRef InputFile : InputFiles) {
SMDiagnostic Err;
LLVMContext C;
std::unique_ptr<Module> M = parseIRFile(InputFile, Err, C);
if (!M)
return createStringError(inconvertibleErrorCode(), Err.getMessage());

auto SplitModulesOrErr =
module_split::splitSYCLModule(std::move(M), Settings);
if (!SplitModulesOrErr)
return SplitModulesOrErr.takeError();
auto ModulesOrErr =
module_split::SYCLOffloadFinalize(std::move(M), Settings);
if (!ModulesOrErr)
return ModulesOrErr.takeError();

auto &NewSplitModules = *SplitModulesOrErr;
SplitModules.insert(SplitModules.end(), NewSplitModules.begin(),
NewSplitModules.end());
auto &NewModules = *ModulesOrErr;
OutputModules.insert(OutputModules.end(), NewModules.begin(),
NewModules.end());
}

if (Verbose) {
auto InputFilesStr = llvm::join(InputFiles.begin(), InputFiles.end(), ",");
std::string SplitOutputFilesStr;
for (size_t I = 0, E = SplitModules.size(); I != E; ++I) {
if (I > 0)
SplitOutputFilesStr += ',';

SplitOutputFilesStr += SplitModules[I].ModuleFilePath;
}

errs() << formatv("sycl-module-split: input: {0}, output: {1}\n",
InputFilesStr, SplitOutputFilesStr);
}
if (Verbose)
logSYCLLibraryInvocation(InputFiles, OutputModules, Settings);

return SplitModules;
return OutputModules;
}

/// Add any llvm-spirv option that relies on a specific Triple in addition
Expand Down Expand Up @@ -1065,7 +1071,7 @@ static Expected<StringRef> runAOTCompile(StringRef InputFile,
///
/// \returns A path to the LLVM Module that contains wrapped images.
Expected<StringRef>
wrapSYCLBinariesFromFile(std::vector<module_split::SplitModule> &SplitModules,
wrapSYCLBinariesFromFile(std::vector<module_split::ProcessedModule> &Modules,
const ArgList &Args, bool IsEmbeddedIR) {
auto OutputFileOrErr = createOutputFile(
sys::path::filename(ExecutableName) + ".sycl.image.wrapper", "bc");
Expand All @@ -1075,8 +1081,8 @@ wrapSYCLBinariesFromFile(std::vector<module_split::SplitModule> &SplitModules,
StringRef OutputFilePath = *OutputFileOrErr;
if (Verbose || DryRun) {
std::string InputFiles;
for (size_t I = 0, E = SplitModules.size(); I != E; ++I) {
InputFiles += SplitModules[I].ModuleFilePath;
for (size_t I = 0, E = Modules.size(); I != E; ++I) {
InputFiles += Modules[I].ModuleFilePath;
if (I + 1 < E)
InputFiles += ',';
}
Expand Down Expand Up @@ -1104,13 +1110,14 @@ wrapSYCLBinariesFromFile(std::vector<module_split::SplitModule> &SplitModules,
if (RegularTarget == "spirv64")
RegularTarget = "spir64";

for (auto &SI : SplitModules) {
auto MBOrDesc = MemoryBuffer::getFile(SI.ModuleFilePath);
for (auto &M : Modules) {
auto MBOrDesc = MemoryBuffer::getFile(M.ModuleFilePath);
if (!MBOrDesc)
return createFileError(SI.ModuleFilePath, MBOrDesc.getError());
return createFileError(M.ModuleFilePath, MBOrDesc.getError());

StringRef ImageTarget = IsEmbeddedIR ? StringRef(EmbeddedIRTarget) : StringRef(RegularTarget);
Images.emplace_back(std::move(*MBOrDesc), SI.Properties, SI.Symbols, ImageTarget);
Images.emplace_back(std::move(*MBOrDesc), M.Properties, M.Symbols,
ImageTarget);
}

LLVMContext C;
Expand Down Expand Up @@ -1193,9 +1200,9 @@ static Expected<StringRef> runCompile(StringRef &InputFile,

// Run wrapping library and clang
static Expected<StringRef>
runWrapperAndCompile(std::vector<module_split::SplitModule> &SplitModules,
runWrapperAndCompile(std::vector<module_split::ProcessedModule> &Modules,
const ArgList &Args, bool IsEmbeddedIR = false) {
auto OutputFile = sycl::wrapSYCLBinariesFromFile(SplitModules, Args, IsEmbeddedIR);
auto OutputFile = sycl::wrapSYCLBinariesFromFile(Modules, Args, IsEmbeddedIR);
if (!OutputFile)
return OutputFile.takeError();
// call to clang
Expand Down Expand Up @@ -2393,15 +2400,15 @@ Expected<SmallVector<StringRef>> linkAndWrapDeviceFiles(
return TmpOutputOrErr.takeError();
SmallVector<StringRef> InputFilesSYCL;
InputFilesSYCL.emplace_back(*TmpOutputOrErr);
auto SplitModulesOrErr =
auto ProcessedModulesOrErr =
UseSYCLPostLinkTool
? sycl::runSYCLPostLinkTool(InputFilesSYCL, LinkerArgs)
: sycl::runSYCLSplitLibrary(InputFilesSYCL, LinkerArgs,
*SYCLModuleSplitMode);
if (!SplitModulesOrErr)
return SplitModulesOrErr.takeError();
: sycl::runSYCLOffloadFinalize(InputFilesSYCL, LinkerArgs,
*SYCLModuleSplitMode);
if (!ProcessedModulesOrErr)
return ProcessedModulesOrErr.takeError();

auto &SplitModules = *SplitModulesOrErr;
auto &ProcessedModules = *ProcessedModulesOrErr;
const llvm::Triple Triple(LinkerArgs.getLastArgValue(OPT_triple_EQ));
if ((Triple.isNVPTX() || Triple.isAMDGCN()) &&
LinkerArgs.hasArg(OPT_sycl_embed_ir)) {
Expand All @@ -2410,14 +2417,14 @@ Expected<SmallVector<StringRef>> linkAndWrapDeviceFiles(
// of sycl-post-link (filetable referencing LLVM Bitcode + symbols)
// through the offload wrapper and link the resulting object to the
// application.
auto OutputFile =
sycl::runWrapperAndCompile(SplitModules, LinkerArgs, /* IsEmbeddedIR */ true);
auto OutputFile = sycl::runWrapperAndCompile(
ProcessedModules, LinkerArgs, /* IsEmbeddedIR */ true);
if (!OutputFile)
return OutputFile.takeError();
WrappedOutput.push_back(*OutputFile);
}
for (size_t I = 0, E = SplitModules.size(); I != E; ++I) {
SmallVector<StringRef> Files = {SplitModules[I].ModuleFilePath};
for (size_t I = 0, E = ProcessedModules.size(); I != E; ++I) {
SmallVector<StringRef> Files = {ProcessedModules[I].ModuleFilePath};
StringRef Arch = LinkerArgs.getLastArgValue(OPT_arch_EQ);
if (Arch.empty())
Arch = "native";
Expand All @@ -2439,21 +2446,22 @@ Expected<SmallVector<StringRef>> linkAndWrapDeviceFiles(
nvptx::fatbinary(BundlerInputFiles, LinkerArgs);
if (!BundledFileOrErr)
return BundledFileOrErr.takeError();
SplitModules[I].ModuleFilePath = *BundledFileOrErr;
ProcessedModules[I].ModuleFilePath = *BundledFileOrErr;
} else if (Triple.isAMDGCN()) {
BundlerInputFiles.emplace_back(*ClangOutputOrErr, Arch);
auto BundledFileOrErr =
amdgcn::fatbinary(BundlerInputFiles, LinkerArgs);
if (!BundledFileOrErr)
return BundledFileOrErr.takeError();
SplitModules[I].ModuleFilePath = *BundledFileOrErr;
ProcessedModules[I].ModuleFilePath = *BundledFileOrErr;
} else {
SplitModules[I].ModuleFilePath = *ClangOutputOrErr;
ProcessedModules[I].ModuleFilePath = *ClangOutputOrErr;
}
}

// TODO(NOM7): Remove this call and use community flow for bundle/wrap
auto OutputFile = sycl::runWrapperAndCompile(SplitModules, LinkerArgs);
auto OutputFile =
sycl::runWrapperAndCompile(ProcessedModules, LinkerArgs);
if (!OutputFile)
return OutputFile.takeError();

Expand Down
36 changes: 22 additions & 14 deletions llvm/include/llvm/SYCLLowerIR/ModuleSplitter.h
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,8 @@ enum IRSplitMode {
// returned.
std::optional<IRSplitMode> convertStringToSplitMode(StringRef S);

StringRef convertSplitModeToString(IRSplitMode SM);

// A vector that contains all entry point functions in a split module.
using EntryPointSet = SetVector<Function *>;

Expand Down Expand Up @@ -289,35 +291,41 @@ void dumpEntryPoints(const Module &M, bool OnlyKernelsAreEntryPoints = false,
const char *Msg = "", int Tab = 0);
#endif // NDEBUG

struct SplitModule {
struct ProcessedModule {
std::string ModuleFilePath;
util::PropertySetRegistry Properties;
std::string Symbols;

SplitModule() = default;
SplitModule(const SplitModule &) = default;
SplitModule &operator=(const SplitModule &) = default;
SplitModule(SplitModule &&) = default;
SplitModule &operator=(SplitModule &&) = default;
ProcessedModule() = default;
ProcessedModule(const ProcessedModule &) = default;
ProcessedModule &operator=(const ProcessedModule &) = default;
ProcessedModule(ProcessedModule &&) = default;
ProcessedModule &operator=(ProcessedModule &&) = default;

SplitModule(std::string_view File, util::PropertySetRegistry Properties,
std::string Symbols)
ProcessedModule(std::string_view File, util::PropertySetRegistry Properties,
std::string Symbols)
: ModuleFilePath(File), Properties(std::move(Properties)),
Symbols(std::move(Symbols)) {}
};

struct ModuleSplitterSettings {
struct ModuleProcessingSettings {
IRSplitMode Mode;
bool OutputAssembly = false; // Bitcode or LLVM IR.
StringRef OutputPrefix;
};

/// Parses the output table file from sycl-post-link tool.
Expected<std::vector<SplitModule>> parseSplitModulesFromFile(StringRef File);
SmallString<64>
convertProcessingSettingsToString(const ModuleProcessingSettings &S);

/// Splits the given module \p M according to the given \p Settings.
Expected<std::vector<SplitModule>>
splitSYCLModule(std::unique_ptr<Module> M, ModuleSplitterSettings Settings);
/// Parses the output table file from sycl-post-link tool.
Expected<std::vector<ProcessedModule>>
parseProcessedModulesFromFile(StringRef File);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i thought we wanted to seperate module splitting out into a seperate operation, possibly so that we could hook it into any upstream splitting api, but it seems like here we are generalizing these APIs, at least in the name (also, the file name still has split it in which is a little confusing)

what's the long term plan/architecture?

thanks

Copy link
Contributor Author

@maksimsab maksimsab Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following processing pipeline comes only from perspectives of RAM usage and compilation time.
Get new split module -> handle spec constants -> handle ESIMD part -> save bc files to disk -> run SPIRVTranslator.

It is known that keeping split modules in the RAM is not really possible. It leads to the question where we should perform the spec const and ESIMD parts? Before saving to disk or after?

We could do the following scheme:
Split modules -> save bc files to disk -> read them again, perform spec const, ESIMD processing -> save bc files to disk again -> run SPIRVTranslator.
This scheme entails twice disk ops compared to the previous one.

The original scheme takes its roots from the usage of SPIRVTranslator. In case of usage of SPIRV backend it would be much easier like the following:
Split modules and save bc files -> read bc file, perform spec const, ESIMD processing -> run SPIRV backend -> save spirv file to disk.
I expect to move to this scheme once SPIRV backend becomes the main path. However, I don't know when it will happen.

I see your concerns that you need some API for thin-LTO. I was considering to come up with a splitting Pass similar to one that AMD have in llvm-project.

While SPIRVTranslator is the main tool you still need to handle spec constants and ESIMD somewhere in thin-LTO. Most likely, post-split processing part might be extracted in order to be invoked from the library and thin-LTO framework. I didn't think about it much because, first of all, I was focused on moving spec constants to the library.

Copy link
Contributor

@sarnex sarnex Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the explanation. actually for thin-lto we call sycl-post-link early (in -c phase) and the processing happens as usual there. running the processing early seems to be enough for everything besides spec constants (that we know of right now), so that one i will do as part of device link, so i don't think there is a need for an API to call all sycl-post-link post-split processing for thinlto, at least not at the moment


/// Performs the offload finale processing of the given module \p M according
/// to the given \p Settings.
Expected<std::vector<ProcessedModule>>
SYCLOffloadFinalize(std::unique_ptr<Module> M,
ModuleProcessingSettings Settings);

bool isESIMDFunction(const Function &F);
bool canBeImportedFunction(const Function &F);
Expand Down
Loading
Loading