[CIR][MERGE-SPLIT-COMPILATION] Adds a compilation step that merges Device and Host CIR into a single module and can co-optimize their execution #2097
Draft: koparasy wants to merge 14 commits into llvm:main from koparasy:features/clangir-offload-pipeline
Commits
0ee10c0 Create cir-combine action placeholder (koparasy)
cb4c90d Remove tooling (koparasy)
f94fbfb Combine CIR code into a single module (koparasy)
8a307be Allow `cir-combine` to split back to device/host modules (koparasy)
9dc8414 Allow lowering to object files through emit-* flags (koparasy)
63a8f40 Add ABI support for HIP (koparasy)
a9476f7 GpuBinaryHandle Attribute is now stringref (koparasy)
ed0ff3c Proper parsing of ConstDataArays (koparasy)
0e2a637 Read/Write bytecode (koparasy)
e54f4b8 [WIP][Non-Functional] Make Driver aware of cir combine actions (koparasy)
463d7c5 Minor changes to relax action checking (koparasy)
7f8c599 Minimum progress done, passes are now correctly create the dependency (koparasy)
727b5a2 Properly construct phases (-ccc-print-phases) and print them (koparasy)
562e5f0 Allow CUDA support by Moving CIRCombine related methods under CudaAct… (RiverDave)
| +/- | Diff line change |
|---|---|
| | `@@ -0,0 +1,29 @@` |
| + | |
| + | `#include "clang/Frontend/FrontendAction.h"` |
| + | |
| + | `#include <memory>` |
| + | |
| + | `namespace mlir {` |
| + | `class MLIRContext;` |
| + | `class ModuleOp;` |
| + | `} // namespace mlir` |
| + | |
| + | `namespace cir {` |
| + | `class CIRCombineAction : public clang::FrontendAction {` |
| + | `private:` |
| + | `  mlir::MLIRContext *mlirContext;` |
| + | |
| + | `public:` |
| + | `  CIRCombineAction();` |
| + | `  std::unique_ptr<clang::ASTConsumer>` |
| + | `  CreateASTConsumer(clang::CompilerInstance &CI,` |
| + | `                    llvm::StringRef InFile) override {` |
| + | `    return std::make_unique<clang::ASTConsumer>();` |
| + | `  }` |
| + | |
| + | `  void ExecuteAction() override;` |
| + | `  // We don't need a preprocessor-only mode.` |
| + | `  bool usesPreprocessorOnly() const override { return false; }` |
| + | `  virtual bool hasCIRSupport() const override { return true; }` |
| + | `};` |
| + | `} // namespace cir` |
| +/- | Diff line change |
|---|---|
| | `@@ -76,6 +76,8 @@ class Action {` |
| | `    StaticLibJobClass,` |
| | `    BinaryAnalyzeJobClass,` |
| | `    BinaryTranslatorJobClass,` |
| + | `    CIRCombineJobClass,` |
| | Review comment (Member): `CIRCombineJobClass` -> `CIRCombineHostDeviceJobClass` |
| + | `    CIRSplitJobClass,` |
| | `    ObjcopyJobClass,` |
| | |
| | `    JobClassFirst = PreprocessJobClass,` |
| | `@@ -180,8 +182,7 @@ class Action {` |
| | `  /// files for each offloading kind. By default, no prefix is used for` |
| | `  /// non-device kinds, except if \a CreatePrefixForHost is set.` |
| | `  static std::string` |
| - | `  GetOffloadingFileNamePrefix(OffloadKind Kind,` |
| - | `                              StringRef NormalizedTriple,` |
| + | `  GetOffloadingFileNamePrefix(OffloadKind Kind, StringRef NormalizedTriple,` |
| | `                              bool CreatePrefixForHost = false);` |
| | |
| | `  /// Return a string containing a offload kind name.` |
| | `@@ -242,9 +243,7 @@ class InputAction : public Action {` |
| | `  void setId(StringRef _Id) { Id = _Id.str(); }` |
| | `  StringRef getId() const { return Id; }` |
| | |
| - | `  static bool classof(const Action *A) {` |
| - | `    return A->getKind() == InputClass;` |
| - | `  }` |
| + | `  static bool classof(const Action *A) { return A->getKind() == InputClass; }` |
| | `};` |
| | |
| | `class BindArchAction : public Action {` |
| | `@@ -259,9 +258,7 @@ class BindArchAction : public Action {` |
| | |
| | `  StringRef getArchName() const { return ArchName; }` |
| | |
| - | `  static bool classof(const Action *A) {` |
| - | `    return A->getKind() == BindArchClass;` |
| - | `  }` |
| + | `  static bool classof(const Action *A) { return A->getKind() == BindArchClass; }` |
| | `};` |
| | |
| | `/// An offload action combines host or/and device actions according to the` |
| | `@@ -407,8 +404,7 @@ class JobAction : public Action {` |
| | |
| | `public:` |
| | `  static bool classof(const Action *A) {` |
| - | `    return (A->getKind() >= JobClassFirst &&` |
| - | `            A->getKind() <= JobClassLast);` |
| + | `    return (A->getKind() >= JobClassFirst && A->getKind() <= JobClassLast);` |
| | `  }` |
| | `};` |
| | |
| | `@@ -511,9 +507,7 @@ class LinkJobAction : public JobAction {` |
| | `public:` |
| | `  LinkJobAction(ActionList &Inputs, types::ID Type);` |
| | |
| - | `  static bool classof(const Action *A) {` |
| - | `    return A->getKind() == LinkJobClass;` |
| - | `  }` |
| + | `  static bool classof(const Action *A) { return A->getKind() == LinkJobClass; }` |
| | `};` |
| | |
| | `class LipoJobAction : public JobAction {` |
| | `@@ -522,9 +516,7 @@ class LipoJobAction : public JobAction {` |
| | `public:` |
| | `  LipoJobAction(ActionList &Inputs, types::ID Type);` |
| | |
| - | `  static bool classof(const Action *A) {` |
| - | `    return A->getKind() == LipoJobClass;` |
| - | `  }` |
| + | `  static bool classof(const Action *A) { return A->getKind() == LipoJobClass; }` |
| | `};` |
| | |
| | `class DsymutilJobAction : public JobAction {` |
| | `@@ -644,6 +636,50 @@ class OffloadPackagerJobAction : public JobAction {` |
| | `  }` |
| | `};` |
| | |
| + | `class CombineCIRJobAction : public JobAction {` |
| + | `  void anchor() override;` |
| + | `  const ToolChain *HostToolChain;` |
| + | `  const ToolChain *DeviceToolChain;` |
| + | `  Action *HostAction;` |
| + | `  Action *DeviceAction;` |
| + | `  char *HostBoundArch;` |
| + | `  const char *DeviceBoundArch;` |
| + | `  unsigned HostOffloadKind;` |
| + | |
| + | `public:` |
| + | `  CombineCIRJobAction(const ToolChain *HostToolChain,` |
| + | `                      const ToolChain *DeviceToolChain, Action *HostAction,` |
| + | `                      Action *DeviceAction, char *HostBoundArch,` |
| + | `                      const char *DeviceBoundArch, unsigned HostOffloadKind,` |
| + | `                      types::ID Type, OffloadKind OffloadDeviceKind);` |
| + | |
| + | `  static bool classof(const Action *A) {` |
| + | `    return A->getKind() == CIRCombineJobClass;` |
| + | `  }` |
| + | |
| + | `  Action *getHostAction() { return HostAction; }` |
| + | `  Action *getDeviceAction() { return DeviceAction; }` |
| + | |
| + | `  const ToolChain *getHostToolChain() const { return HostToolChain; }` |
| + | `  const ToolChain *getDeviceToolChain() const { return DeviceToolChain; }` |
| + | |
| + | `  const char *getHostBoundArch() const { return HostBoundArch; }` |
| + | `  const char *getDeviceBoundArch() const { return DeviceBoundArch; }` |
| + | `};` |
| + | |
| + | `class SplitCIRJobAction : public JobAction {` |
| + | `  void anchor() override;` |
| + | |
| + | `public:` |
| + | `  bool isHost;` |
| + | `  SplitCIRJobAction(Action *Input, bool isHost, types::ID Type,` |
| + | `                    OffloadKind Kind = OFK_None);` |
| + | |
| + | `  static bool classof(const Action *A) {` |
| + | `    return A->getKind() == CIRSplitJobClass;` |
| + | `  }` |
| + | `};` |
| + | |
| | `class LinkerWrapperJobAction : public JobAction {` |
| | `  void anchor() override;` |
| | |
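
The two new enumerators, `CIRCombineJobClass` and `CIRSplitJobClass`, are added between `BinaryTranslatorJobClass` and `ObjcopyJobClass`, which keeps them inside the range that `JobAction::classof` tests. The following is a minimal, self-contained sketch of that kind-based RTTI pattern, not the driver's real code: the enum is shortened, `JobClassLast = ObjcopyJobClass` is an assumption of this sketch, and `isaKind` is a hypothetical stand-in for `llvm::isa` so that no LLVM headers are needed.

```cpp
#include <cassert>

// Simplified stand-in for the driver's Action hierarchy. Only a few kinds are
// modeled; the enumerator list and JobClassLast are assumptions of this sketch.
class Action {
public:
  enum ActionClass {
    InputClass = 0,
    PreprocessJobClass,
    CIRCombineJobClass,
    CIRSplitJobClass,
    ObjcopyJobClass,

    JobClassFirst = PreprocessJobClass,
    JobClassLast = ObjcopyJobClass
  };

  explicit Action(ActionClass Kind) : Kind(Kind) {}
  virtual ~Action() = default;
  ActionClass getKind() const { return Kind; }

private:
  ActionClass Kind;
};

class JobAction : public Action {
public:
  explicit JobAction(ActionClass Kind) : Action(Kind) {}

  // Every kind inside [JobClassFirst, JobClassLast] counts as a job action.
  static bool classof(const Action *A) {
    return A->getKind() >= JobClassFirst && A->getKind() <= JobClassLast;
  }
};

class SplitCIRJobAction : public JobAction {
public:
  SplitCIRJobAction() : JobAction(CIRSplitJobClass) {}

  static bool classof(const Action *A) {
    return A->getKind() == CIRSplitJobClass;
  }
};

// Hypothetical helper that mimics llvm::isa<T>(A) by delegating to T::classof.
template <typename T> bool isaKind(const Action *A) { return T::classof(A); }

// Small self-check: a split action is a job action, an input action is not.
inline void checkKinds() {
  SplitCIRJobAction Split;
  Action Input(Action::InputClass);
  assert(isaKind<JobAction>(&Split));
  assert(!isaKind<JobAction>(&Input));
}
```

In this model, an enumerator placed after `JobClassLast` would fail the `JobAction` range check even if its class derived from `JobAction`, which is why the position of the new entries in the enum matters.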
I was originally thinking about something like:
Any reason why you need the cir.offload.container wrapping both of them?
Based on our experience optimizing across libraries with the LLVM IR dialect, we had to flatten the namespace and put everything in the same module, adding an extra attribute to record each symbol's origin so that we could split the modules back apart after optimization. In our case, many problems came from symbol definitions being available only inside another cir.library, with MLIR unable to handle symbol tables across them properly. Is this a problem for this approach?
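
The flatten-then-split idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: `Origin`, `Symbol`, `Module`, `flatten`, and `split` are hypothetical names that model MLIR symbols as plain strings, and none of them exist in ClangIR or MLIR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for MLIR symbols and modules; not ClangIR types.
enum class Origin { Host, Device };

struct Symbol {
  std::string name;
  Origin origin; // plays the role of the extra "symbol origin" attribute
};

using Module = std::vector<Symbol>;

// Merge both sides into one flat module, tagging every symbol with its origin.
Module flatten(const std::vector<std::string> &host,
               const std::vector<std::string> &device) {
  Module merged;
  for (const auto &name : host)
    merged.push_back({name, Origin::Host});
  for (const auto &name : device)
    merged.push_back({name, Origin::Device});
  return merged;
}

// Recover one side after optimization by filtering on the origin tag.
std::vector<std::string> split(const Module &merged, Origin side) {
  std::vector<std::string> names;
  for (const auto &sym : merged)
    if (sym.origin == side)
      names.push_back(sym.name);
  return names;
}

// Small self-check: flattening followed by splitting is a round trip.
inline void checkRoundTrip() {
  const std::vector<std::string> host = {"main", "launch"};
  const std::vector<std::string> device = {"kernel"};
  const Module merged = flatten(host, device);
  assert(split(merged, Origin::Host) == host);
  assert(split(merged, Origin::Device) == device);
}
```

Because every symbol lives in one table, a pass running over `merged` can see host and device definitions together, and the origin tag alone is enough to separate them again. The sketch deliberately ignores name collisions between the two sides.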
Good question. My motivation for the `cir.offload.container` was mostly semantic/correctness: host and device live in different linkage “worlds”, and the same function/global name can legitimately exist on both sides but refer to different entities (different address spaces / ABI / runtime symbol resolution). Keeping them separated avoids accidental symbol capture and makes the “split back” step trivial.

That said, I agree with the experience you describe: MLIR symbol-table handling across nested symbol tables can become a constant source of issues, especially when definitions/uses cross the boundary. Flattening into a single module with an explicit origin = host|device attribute (plus deterministic renaming to avoid collisions) is a viable direction, and likely makes cross-host/device optimization passes much simpler. But then again, communication of values across devices usually happens through explicit API calls.

I’m currently not fully convinced which representation is best long-term. For now, my intent with `cir.offload.container` is to keep the semantic separation between host and device explicit while we get the end-to-end combine → split → lower pipeline working correctly. At this stage, I’d prefer not to change the symbol-table model until we have concrete optimization passes that actually require cross-origin symbol resolution.

Once that infrastructure is stable, I think it makes sense to revisit flattening into a single module with explicit origin attributes and a renaming scheme, and evaluate whether the complexity trade-off is worth it in practice.
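
To make the collision concern concrete, here is a minimal sketch of a flat symbol table in which the same name exists on both sides. Everything in it is an assumption made for illustration: the `.device` suffix is a hypothetical renaming scheme, and `Side`, `FlatSymbol`, `FlatModule`, `addSymbol`, and `splitBack` are not part of ClangIR or this PR.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

enum class Side { Host, Device };

// One entry of the flattened module: the name the symbol had on its own side
// plus an explicit origin tag, so splitting never has to parse the flat name.
struct FlatSymbol {
  std::string originalName;
  Side origin;
};

// Keyed by the unique flat name, as a single symbol table would be.
using FlatModule = std::map<std::string, FlatSymbol>;

// Insert a symbol under a deterministic flat name. Device symbols get the
// hypothetical ".device" suffix. Returns false if the flat name is taken.
bool addSymbol(FlatModule &module, const std::string &name, Side side) {
  const std::string flat = side == Side::Device ? name + ".device" : name;
  return module.emplace(flat, FlatSymbol{name, side}).second;
}

// Split back: collect one side's symbols and restore their original names.
std::vector<std::string> splitBack(const FlatModule &module, Side side) {
  std::vector<std::string> names;
  for (const auto &entry : module)
    if (entry.second.origin == side)
      names.push_back(entry.second.originalName);
  return names;
}

// Small self-check: "foo" on both sides coexists after renaming.
inline void checkNoCollision() {
  FlatModule module;
  const bool hostOk = addSymbol(module, "foo", Side::Host);
  const bool deviceOk = addSymbol(module, "foo", Side::Device);
  assert(hostOk && deviceOk);
  assert(module.size() == 2);
}
```

Even this toy scheme has an edge case: a host symbol literally named `foo.device` would clash with the renamed device `foo`, and `addSymbol` reports that by returning `false`. A real renaming scheme would have to rule such clashes out, which is part of the complexity trade-off mentioned above.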