3 changes: 3 additions & 0 deletions clang/include/clang/Basic/DiagnosticDriverKinds.td
@@ -145,6 +145,9 @@ def warn_drv_unsupported_diag_option_for_flang : Warning<
def warn_drv_unsupported_option_for_processor : Warning<
"ignoring '%0' option as it is not currently supported for processor '%1'">,
InGroup<OptionIgnored>;
def warn_drv_unsupported_option_overrides_option : Warning<
"ignoring '%0' option as option '%1' overrides the behavior">,
InGroup<OptionIgnored>;
def warn_drv_unsupported_openmp_library : Warning<
"the library '%0=%1' is not supported, OpenMP will not be enabled">,
InGroup<OptionIgnored>;
4 changes: 2 additions & 2 deletions clang/include/clang/CIR/Dialect/IR/CIRCUDAAttrs.td
@@ -30,7 +30,7 @@ def CIR_CUDAKernelNameAttr : CIR_Attr<"CUDAKernelName", "cu.kernel_name"> {
names, we must record the corresponding device-side name for a stub.
}];

let parameters = (ins "std::string":$kernel_name);
let parameters = (ins StringRefParameter<"">:$kernel_name);
let assemblyFormat = "`<` $kernel_name `>`";
}

@@ -65,7 +65,7 @@ def CIR_CUDABinaryHandleAttr : CIR_Attr<
and then generate various registration functions.
}];

let parameters = (ins "std::string":$name);
let parameters = (ins StringRefParameter<"">:$name);
let assemblyFormat = "`<` $name `>`";
}

52 changes: 52 additions & 0 deletions clang/include/clang/CIR/Dialect/IR/CIROps.td
@@ -3992,6 +3992,58 @@ def CIR_DerivedMethodOp : CIR_Op<"derived_method", [Pure]> {
let hasVerifier = 1;
}


//===----------------------------------------------------------------------===//
// Offload container
//===----------------------------------------------------------------------===//

def CIR_OffloadContainerOp : CIR_Op<"offload.container",
[NoRegionArguments, NoTerminator]> {
let summary = "Container for host and device CIR modules";
let description = [{
`cir.offload.container` is a top-level container used to keep host and device
CIR modules together for joint analysis and transformation.
Member:

I was originally thinking about something like:

```mlir
module {
  cir.host {
    ...
  }

  cir.device {
    ...
  }
}
```

Any reason why you need the cir.offload.container wrapping both of them?
Based on our experience optimizing across libraries with the LLVM IR dialect, we had to flatten the namespace and put everything in the same module (while adding an extra attribute to record each symbol's origin, so that we can split them back after optimizations). In our case, lots of problems came from symbol definitions being available only within another cir.library, and from MLIR not being able to handle symbol tables across them properly. Is this a problem for this approach?
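A flattened single-module layout of the kind described here might look like the following sketch; the `cir.origin` attribute and the `_device` renaming suffix are illustrative only, not attributes that exist in CIR today:

```mlir
module {
  // Host symbol, tagged with its origin so the module can be split back.
  cir.func @foo() attributes {cir.origin = "host"} {
    ...
  }

  // Device symbol with the same source name, deterministically renamed to
  // avoid colliding with the host-side @foo.
  cir.func @foo_device() attributes {cir.origin = "device"} {
    ...
  }
}
```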

Contributor (Author):

Good question. My motivation for the cir.offload.container was mostly semantic/correctness: host and device live in different linkage “worlds”, and the same function/global name can legitimately exist on both sides but refer to different entities (different address spaces / ABI / runtime symbol resolution). Keeping them separated avoids accidental symbol capture and makes the “split back” step trivial.

That said, I agree with the experience you describe: MLIR symbol-table handling across nested symbol tables can become a constant source of issues, especially when definitions/uses cross the boundary. Flattening into a single module with an explicit origin = host|device attribute (plus deterministic renaming to avoid collisions) is a viable direction, and likely makes cross-host/device optimization passes much simpler. But then again, communication of values across devices usually happens through explicit API calls.

I’m currently not fully convinced which representation is best long-term. For now, my intent with cir.offload.container is to keep the semantic separation between host and device explicit while we get the end-to-end combine → split → lower pipeline working correctly. At this stage, I’d prefer not to change the symbol-table model until we have concrete optimization passes that actually require cross-origin symbol resolution.

Once that infrastructure is stable, I think it makes sense to revisit flattening into a single module with explicit origin attributes and a renaming scheme, and evaluate whether the complexity trade-off is worth it in practice.
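If flattening is eventually adopted, the deterministic renaming plus origin tagging discussed above could be prototyped with plain string manipulation. A minimal self-contained sketch, assuming a hypothetical `__cir_origin_*` suffix scheme that is not part of this patch:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical deterministic renaming scheme for a flattened host/device
// module: append an origin suffix so host and device symbols with the same
// source name never collide, and keep the suffix recoverable so the module
// can be split back after cross-origin optimization.
inline std::string tagSymbol(const std::string &name, bool isDevice) {
  return name + (isDevice ? "__cir_origin_device" : "__cir_origin_host");
}

// Recover the original name and origin from a tagged symbol. Untagged
// symbols are treated as host symbols.
inline std::pair<std::string, bool> untagSymbol(const std::string &tagged) {
  static const std::string devTag = "__cir_origin_device";
  static const std::string hostTag = "__cir_origin_host";
  auto endsWith = [&](const std::string &suffix) {
    return tagged.size() >= suffix.size() &&
           tagged.compare(tagged.size() - suffix.size(), suffix.size(),
                          suffix) == 0;
  };
  if (endsWith(devTag))
    return {tagged.substr(0, tagged.size() - devTag.size()), true};
  if (endsWith(hostTag))
    return {tagged.substr(0, tagged.size() - hostTag.size()), false};
  return {tagged, false};
}
```

Because `untagSymbol(tagSymbol(n, d))` round-trips exactly, the later split step reduces to grouping symbols by the recovered origin.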


The operation owns a single region. The region typically contains nested
`module` operations such as `module @host { ... }` and `module @device { ... }`,
each providing its own symbol table scope to avoid host/device symbol conflicts.

Example:

```mlir
module {
  cir.offload.container {
    module @host {
      // host CIR
    }
    module @device {
      // device CIR
    }
  }
}
```
}];

let arguments = (ins);

let regions = (region AnyRegion:$body);

// Use generic region printer/parser: `cir.offload.container { ... }`
let assemblyFormat = [{
$body attr-dict
}];

let hasVerifier = 1;

let extraClassDeclaration = [{
/// Returns the nested `module @host` if present, otherwise nullopt.
std::optional<mlir::ModuleOp> getHostModule();

/// Returns the nested `module @device` if present, otherwise nullopt.
std::optional<mlir::ModuleOp> getDeviceModule();
}];
}

//===----------------------------------------------------------------------===//
// FuncOp
//===----------------------------------------------------------------------===//
29 changes: 29 additions & 0 deletions clang/include/clang/CIR/FrontendAction/CIRCombineAction.h
@@ -0,0 +1,29 @@
#pragma once

#include "clang/Frontend/FrontendAction.h"

#include <memory>

namespace mlir {
class MLIRContext;
class ModuleOp;
} // namespace mlir

namespace cir {
class CIRCombineAction : public clang::FrontendAction {
private:
mlir::MLIRContext *mlirContext;

public:
CIRCombineAction();
std::unique_ptr<clang::ASTConsumer>
CreateASTConsumer(clang::CompilerInstance &CI,
llvm::StringRef InFile) override {
return std::make_unique<clang::ASTConsumer>();
}

void ExecuteAction() override;
// We don't need a preprocessor-only mode.
bool usesPreprocessorOnly() const override { return false; }
bool hasCIRSupport() const override { return true; }
};
} // namespace cir
6 changes: 2 additions & 4 deletions clang/include/clang/CIR/FrontendAction/CIRGenAction.h
@@ -6,8 +6,8 @@
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_CLANG_CIR_CIRGENACTION_H
#define LLVM_CLANG_CIR_CIRGENACTION_H
#ifndef LLVM_CLANG_CIR_CIRCOMBINEACTION_H
#define LLVM_CLANG_CIR_CIRCOMBINEACTION_H

#include "clang/CodeGen/CodeGenAction.h"
#include "clang/Frontend/FrontendAction.h"
@@ -49,8 +49,6 @@ class CIRGenAction : public clang::ASTFrontendAction {

mlir::MLIRContext *mlirContext;

mlir::OwningOpRef<mlir::ModuleOp> loadModule(llvm::MemoryBufferRef mbRef);

protected:
CIRGenAction(OutputType action, mlir::MLIRContext *_MLIRContext = nullptr);

68 changes: 52 additions & 16 deletions clang/include/clang/Driver/Action.h
@@ -76,6 +76,8 @@ class Action {
StaticLibJobClass,
BinaryAnalyzeJobClass,
BinaryTranslatorJobClass,
CIRCombineJobClass,
Member:

CIRCombineJobClass -> CIRCombineHostDeviceJobClass

CIRSplitJobClass,
ObjcopyJobClass,

JobClassFirst = PreprocessJobClass,
@@ -180,8 +182,7 @@ class Action {
/// files for each offloading kind. By default, no prefix is used for
/// non-device kinds, except if \a CreatePrefixForHost is set.
static std::string
GetOffloadingFileNamePrefix(OffloadKind Kind,
StringRef NormalizedTriple,
GetOffloadingFileNamePrefix(OffloadKind Kind, StringRef NormalizedTriple,
bool CreatePrefixForHost = false);

/// Return a string containing a offload kind name.
@@ -242,9 +243,7 @@ class InputAction : public Action {
void setId(StringRef _Id) { Id = _Id.str(); }
StringRef getId() const { return Id; }

static bool classof(const Action *A) {
return A->getKind() == InputClass;
}
static bool classof(const Action *A) { return A->getKind() == InputClass; }
};

class BindArchAction : public Action {
@@ -259,9 +258,7 @@

StringRef getArchName() const { return ArchName; }

static bool classof(const Action *A) {
return A->getKind() == BindArchClass;
}
static bool classof(const Action *A) { return A->getKind() == BindArchClass; }
};

/// An offload action combines host or/and device actions according to the
@@ -407,8 +404,7 @@ class JobAction : public Action {

public:
static bool classof(const Action *A) {
return (A->getKind() >= JobClassFirst &&
A->getKind() <= JobClassLast);
return (A->getKind() >= JobClassFirst && A->getKind() <= JobClassLast);
}
};

@@ -511,9 +507,7 @@ class LinkJobAction : public JobAction {
public:
LinkJobAction(ActionList &Inputs, types::ID Type);

static bool classof(const Action *A) {
return A->getKind() == LinkJobClass;
}
static bool classof(const Action *A) { return A->getKind() == LinkJobClass; }
};

class LipoJobAction : public JobAction {
@@ -522,9 +516,7 @@
public:
LipoJobAction(ActionList &Inputs, types::ID Type);

static bool classof(const Action *A) {
return A->getKind() == LipoJobClass;
}
static bool classof(const Action *A) { return A->getKind() == LipoJobClass; }
};

class DsymutilJobAction : public JobAction {
@@ -644,6 +636,50 @@ class OffloadPackagerJobAction : public JobAction {
}
};

class CombineCIRJobAction : public JobAction {
void anchor() override;
const ToolChain *HostToolChain;
const ToolChain *DeviceToolChain;
Action *HostAction;
Action *DeviceAction;
const char *HostBoundArch;
const char *DeviceBoundArch;
unsigned HostOffloadKind;

public:
CombineCIRJobAction(const ToolChain *HostToolChain,
const ToolChain *DeviceToolChain, Action *HostAction,
Action *DeviceAction, char *HostBoundArch,
const char *DeviceBoundArch, unsigned HostOffloadKind,
types::ID Type, OffloadKind OffloadDeviceKind);

static bool classof(const Action *A) {
return A->getKind() == CIRCombineJobClass;
}

Action *getHostAction() { return HostAction; }
Action *getDeviceAction() { return DeviceAction; }

const ToolChain *getHostToolChain() const { return HostToolChain; }
const ToolChain *getDeviceToolChain() const { return DeviceToolChain; }

const char *getHostBoundArch() const { return HostBoundArch; }
const char *getDeviceBoundArch() const { return DeviceBoundArch; }
};

class SplitCIRJobAction : public JobAction {
void anchor() override;

public:
bool isHost;
SplitCIRJobAction(Action *Input, bool isHost, types::ID Type,
OffloadKind Kind = OFK_None);

static bool classof(const Action *A) {
return A->getKind() == CIRSplitJobClass;
}
};

class LinkerWrapperJobAction : public JobAction {
void anchor() override;

27 changes: 27 additions & 0 deletions clang/include/clang/Driver/Options.td
@@ -3156,6 +3156,13 @@ defm clangir : BoolFOption<"clangir",
PosFlag<SetTrue, [], [ClangOption, CC1Option], "Use the ClangIR pipeline to compile">,
NegFlag<SetFalse, [], [ClangOption, CC1Option], "Use the AST -> LLVM pipeline to compile">,
BothFlags<[], [ClangOption, CC1Option], "">>;
defm clangir_offload : BoolFOption<"clangir-offload",
FrontendOpts<"UseClangIROffloadPipeline">, DefaultFalse,
PosFlag<SetTrue, [], [ClangOption, CC1Option],
"Use the ClangIR-based offload compilation pipeline">,
NegFlag<SetFalse, [], [ClangOption, CC1Option],
"Disable the ClangIR-based offload compilation pipeline">,
BothFlags<[], [ClangOption, CC1Option], "">>;
def fcir_output_EQ : Joined<["-"], "fcir-output=">, Group<f_Group>,
Visibility<[ClangOption, CC1Option]>,
Flags<[NoArgumentUnused]>,
@@ -3208,6 +3215,7 @@ def fclangir_mem2reg : Flag<["-"], "fclangir-mem2reg">,
HelpText<"Enable mem2reg on the flat ClangIR">,
MarshallingInfoFlag<FrontendOpts<"ClangIREnableMem2Reg">>;

def CIR_Group : OptionGroup<"<CIR options>">;
def clangir_disable_passes : Flag<["-"], "clangir-disable-passes">,
Visibility<[ClangOption, CC1Option]>,
HelpText<"Disable CIR transformations pipeline">,
@@ -3256,6 +3264,25 @@ def emit_cir_only : Flag<["-"], "emit-cir-only">,
def emit_cir_flat : Flag<["-"], "emit-cir-flat">, Visibility<[ClangOption, CC1Option]>,
Group<Action_Group>, Alias<emit_mlir_EQ>, AliasArgs<["cir-flat"]>,
HelpText<"Similar to -emit-cir but also lowers structured CFG into basic blocks.">;
def cir_combine : Flag<["-"], "cir-combine">,
Visibility<[CC1Option]>,
Group<Action_Group>,
HelpText<"Combine host and device CIR modules into a single offload container CIR module.">;
def cir_host_input : Separate<["-"], "cir-host-input">,
Visibility<[CC1Option]>, Group<CIR_Group>,
HelpText<"Host CIR input for -cir-combine.">;
def cir_device_input : Separate<["-"], "cir-device-input">,
Visibility<[CC1Option]>, Group<CIR_Group>,
HelpText<"Device CIR input for -cir-combine (may be repeated).">;
def cir_emit_split : Flag<["-"], "cir-emit-split">,
Visibility<[CC1Option]>, Group<CIR_Group>,
HelpText<"Emit split host/device CIR instead of a combined CIR container.">;
def cir_host_output : Separate<["-"], "cir-host-output">,
Visibility<[CC1Option]>, Group<CIR_Group>,
HelpText<"Output path for host CIR when -cir-emit-split is used.">;
def cir_device_output : Separate<["-"], "cir-device-output">,
Visibility<[CC1Option]>, Group<CIR_Group>,
HelpText<"Output path for device CIR when -cir-emit-split is used.">;
/// ClangIR-specific options - END

def flto : Flag<["-"], "flto">,
17 changes: 15 additions & 2 deletions clang/include/clang/Frontend/FrontendOptions.h
@@ -68,6 +68,10 @@ enum ActionKind {
/// Generate CIR, but don't emit anything.
EmitCIROnly,

/// Combine multiple CIR modules (e.g. host and device) into a single
/// container.
CIRCombine,

/// Emit a .mlir file
EmitMLIR,

@@ -413,6 +417,11 @@ class FrontendOptions {
LLVM_PREFERRED_TYPE(bool)
unsigned UseClangIRPipeline : 1;

/// Use CIR-based offload pipeline (combine/split/fatbin/embed) when compiling
/// offload code.
LLVM_PREFERRED_TYPE(bool)
unsigned UseClangIROffloadPipeline : 1;

/// Lower directly from ClangIR to LLVM
unsigned ClangIRDirectLowering : 1;

@@ -454,6 +463,11 @@
std::string ClangIRIdiomRecognizerOpts;
std::string ClangIRLibOptOpts;
std::string ClangIRFile;
std::string CIRHostInput;
std::string CIRDeviceInput;
bool EmitSplit;
std::string CIRHostOutput;
std::string CIRDeviceOutput;

frontend::MLIRDialectKind MLIRTargetDialect = frontend::MLIR_CORE;

@@ -532,7 +546,6 @@ class FrontendOptions {
/// should only be used for debugging and experimental features.
std::vector<std::string> MLIRArgs;


/// File name of the file that will provide record layouts
/// (in the format produced by -fdump-record-layouts).
std::string OverrideRecordLayoutsFile;
@@ -587,7 +600,7 @@ class FrontendOptions {
ClangIRVerifyDiags(false), ClangIRLifetimeCheck(false),
ClangIRIdiomRecognizer(false), ClangIRLibOpt(false),
ClangIRCallConvLowering(true), ClangIREnableMem2Reg(false),
ClangIRAnalysisOnly(false), EmitClangIRFile(false),
ClangIRAnalysisOnly(false), EmitClangIRFile(false), EmitSplit(false),
TimeTraceGranularity(500), TimeTraceVerbose(false) {}

/// getInputKindForExtension - Return the appropriate input kind for a file
23 changes: 23 additions & 0 deletions clang/lib/CIR/CodeGen/TargetInfo.cpp
@@ -386,6 +386,29 @@ class AMDGPUTargetCIRGenInfo : public TargetCIRGenInfo {
ft = getABIInfo().getContext().adjustFunctionType(
ft, ft->getExtInfo().withCallingConv(CC_DeviceKernel));
}

void setTargetAttributes(const clang::Decl *decl, mlir::Operation *global,
CIRGenModule &cgm) const override {

if (const auto *fd = clang::dyn_cast_or_null<clang::FunctionDecl>(decl)) {
cir::FuncOp func = mlir::cast<cir::FuncOp>(global);
if (func.isDeclaration())
return;

if (cgm.getLangOpts().HIP) {
if (fd->hasAttr<CUDAGlobalAttr>()) {
func.setCallingConv(cir::CallingConv::AMDGPUKernel);
func.setLinkageAttr(cir::GlobalLinkageKindAttr::get(
func.getContext(), cir::GlobalLinkageKind::ExternalLinkage));
func.setVisibility(mlir::SymbolTable::Visibility::Public);
func.setGlobalVisibility(cir::VisibilityKind::Protected);
}
}

if (fd->hasAttr<CUDALaunchBoundsAttr>())
llvm_unreachable("NYI");
}
}
};

} // namespace