Skip to content

Conversation

@maksimsab
Copy link
Contributor

Post Link compilation flow consists of Module splitting, specialization constants processing and ESIMD processing. To correspond to the all functionalities this patch adds renames in order to be less confusing.

The following changes are added:

  • ModuleSplitProcessingSettings structure is renamed to ModuleProcessingSettings.
  • Added function convertProcessingSettingsToString to add additional piece of information in the testing.
  • SplitModule structure is renamed to ProcessedModule.
  • runSYCLSplitLibrary is renamed to runSYCLPostLinkLibrary.
  • DryRun mode in runSYCLPostLinkLibrary is combined with Verbose mode in order to remove code duplication.
  • splitSYCLModule is renamed to SYCLPostLinkProcess.

Post Link compilation flow consists of Module spliting, specializaition
constants processing and ESIMD processing. To correspond to the
all functionalities this patch adds renames in order to be less
confusing.

The following changes are added:
* ModuleSplitProcessingSettings structure is renamed to ModuleProcessingSettings.
* Added function convertProcessingSettingsToString to add additional
  piece of information in the testing.
* SplitModule structure is renamed to ProcessedModule.
* runSYCLSplitLibrary is renamed to runSYCLPostLinkLibrary.
* DryRun mode in runSYCLPostLinkLibrary is combined with Verbose mode
  in order to remove code duplication.
* splitSYCLModule is renamed to SYCLPostLinkProcess.
@maksimsab maksimsab requested review from a team as code owners September 11, 2024 16:47
@maksimsab
Copy link
Contributor Author

My patch #15133 has become so big locally that I decided to split it in several PRs in order to decrease a burden on reviewers.

}

/// Invokes SYCL Split library for SYCL offloading.
/// Invokes SYCL Post Link library for SYCL offloading.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think "Post Link" is a good choice of wording here. This name reflects some requirements for using this library in compilation process, but I think they are false in general.
For example, in no-rdc mode, there is no "link" step as the code is self-contained by definition.

I think something like "SYCL offload finalization library" better reflects the purpose of this code.

Same applies to SYCLPostLinkProcess function from llvm/include/llvm/SYCLLowerIR/ModuleSplitter.h.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SYCLOffloadFinalize ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, in no-rdc mode this processing can be done in -c phase because they are already implemented as transformation passes. In that case, there is no need to invoke this library at all in clang-linker-wrapper.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Friendly ping, @bader

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Friendly pong.

SYCLOffloadFinalize ?

Could you clarify what are you trying to ask here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think that SYCLPostLinkProcess is better to be replaced by SYCLOffloadFinalize?

What do you think about my comment regarding no-rdc?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think that SYCLPostLinkProcess is better to be replaced by SYCLOffloadFinalize?

SYCLOffloadFinalize doesn't seem to fit perfectly, but it sounds better than SYCLPostLinkProcess in my opinion.

What do you think about my comment regarding no-rdc?

I agree, but I don't get the motivation to have a separate "library". Correct me if I wrong, but all this library does is running LLVM passes. My understanding is that we can integrate these LLVM passes either into standard optimization pipeline (preferred) or by building a separate optimization pipeline. @sarnex is using this kind of approach in https://github.com/intel/llvm/pull/15083/files#diff-f7dc51acede945c7c8481147e773e9c09230a93b937bee9eab08ee52826cfa8eR1779.

Basically, I would consider integrating new passes into LLVM optimization pipeline for both rdc and no-rdc modes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Module splitting can't be integrated into -c phase even if it were be implemented as a Pass.

Eventually, I see the compilation pipeline as thin-LTO + optimization pipeline. But I don't see that it happens soon. The plan is to move sycl-post-link functionality in clang-linker-wrapper. Then move to the thin-LTO once it is ready.

splitSYCLModule(std::unique_ptr<Module> M, ModuleSplitterSettings Settings);
/// Parses the output table file from sycl-post-link tool.
Expected<std::vector<ProcessedModule>>
parseProcessedModulesFromFile(StringRef File);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i thought we wanted to seperate module splitting out into a seperate operation, possibly so that we could hook it into any upstream splitting api, but it seems like here we are generalizing these APIs, at least in the name (also, the file name still has split it in which is a little confusing)

what's the long term plan/architecture?

thanks

Copy link
Contributor Author

@maksimsab maksimsab Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following processing pipeline comes only from perspectives of RAM usage and compilation time.
Get new split module -> handle spec constants -> handle ESIMD part -> save bc files to disk -> run SPIRVTranslator.

It is known that keeping split modules in the RAM is not really possible. It leads to the question where we should perform the spec const and ESIMD parts? Before saving to disk or after?

We could do the following scheme:
Split modules -> save bc files to disk -> read them again, perform spec const, ESIMD processing -> save bc files to disk again -> run SPIRVTranslator.
This scheme entails twice disk ops compared to the previous one.

The original scheme takes its roots from the usage of SPIRVTranslator. In case of usage of SPIRV backend it would be much easier like the following:
Split modules and save bc files -> read bc file, perform spec const, ESIMD processing -> run SPIRV backend -> save spirv file to disk.
I expect to move to this scheme once SPIRV backend becomes the main path. However, I don't know when it will happen.

I see your concerns that you need some API for thin-LTO. I was considering to come up with a splitting Pass similar to one that AMD have in llvm-project.

While SPIRVTranslator is the main tool you still need to handle spec constants and ESIMD somewhere in thin-LTO. Most likely, post-split processing part might be extracted in order to be invoked from the library and thin-LTO framework. I didn't think about it much because, first of all, I was focused on moving spec constants to the library.

Copy link
Contributor

@sarnex sarnex Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the explanation. actually for thin-lto we call sycl-post-link early (in -c phase) and the processing happens as usual there. running the processing early seems to be enough for everything besides spec constants (that we know of right now), so that one i will do as part of device link, so i don't think there is a need for an API to call all sycl-post-link post-split processing for thinlto, at least not at the moment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants