@seven-mile
Contributor

This patch upstreams SourceLanguageAttr (cir.lang) and the CIRGenModule (CGM) logic that sets it, so that each CIR module records the source language it was generated from.

@llvmbot llvmbot added the clang (Clang issues not falling into any other category) and ClangIR (Anything related to the ClangIR project) labels Aug 7, 2025
@llvmbot
Member

llvmbot commented Aug 7, 2025

@llvm/pr-subscribers-clang

Author: 7mile (seven-mile)

Changes

This patch upstreams SourceLanguageAttr (cir.lang) and the CIRGenModule (CGM) logic that sets it, so that each CIR module records the source language it was generated from.


Full diff: https://github.com/llvm/llvm-project/pull/152511.diff

8 Files Affected:

  • (modified) clang/include/clang/CIR/Dialect/IR/CIRAttrs.td (+37)
  • (modified) clang/include/clang/CIR/Dialect/IR/CIRDialect.td (+1)
  • (modified) clang/lib/CIR/CodeGen/CIRGenModule.cpp (+24)
  • (modified) clang/lib/CIR/CodeGen/CIRGenModule.h (+3)
  • (added) clang/test/CIR/CodeGen/lang-c.c (+8)
  • (added) clang/test/CIR/CodeGen/lang-cpp.cpp (+8)
  • (added) clang/test/CIR/IR/invalid-lang-attr.cir (+5)
  • (added) clang/test/CIR/IR/module.cir (+12)
diff --git a/clang/include/clang/CIR/Dialect/IR/CIRAttrs.td b/clang/include/clang/CIR/Dialect/IR/CIRAttrs.td
index 588fb0d74a509..2040109298ad1 100644
--- a/clang/include/clang/CIR/Dialect/IR/CIRAttrs.td
+++ b/clang/include/clang/CIR/Dialect/IR/CIRAttrs.td
@@ -50,6 +50,43 @@ class CIR_UnitAttr<string name, string attrMnemonic, list<Trait> traits = []>
   let isOptional = 1;
 }
 
+//===----------------------------------------------------------------------===//
+// SourceLanguageAttr
+//===----------------------------------------------------------------------===//
+
+def CIR_SourceLanguage : CIR_I32EnumAttr<"SourceLanguage", "source language", [
+  I32EnumAttrCase<"C", 1, "c">,
+  I32EnumAttrCase<"CXX", 2, "cxx">
+]> {
+  // The enum attr class is defined in `CIR_SourceLanguageAttr` below,
+  // so that it can define extra class methods.
+  let genSpecializedAttr = 0;
+}
+
+def CIR_SourceLanguageAttr : CIR_EnumAttr<CIR_SourceLanguage, "lang"> {
+
+  let summary = "Module source language";
+  let description = [{
+    Represents the source language used to generate the module.
+
+    Example:
+    ```
+    // Module compiled from C.
+    module attributes {cir.lang = #cir.lang<c>} {}
+    // Module compiled from C++.
+    module attributes {cir.lang = #cir.lang<cxx>} {}
+    ```
+
+    The module source language attribute is named `cir.lang`, as defined by
+    the `getSourceLanguageAttrName` method in the CIRDialect class.
+  }];
+
+  let extraClassDeclaration = [{
+    bool isC() const { return getValue() == SourceLanguage::C; }
+    bool isCXX() const { return getValue() == SourceLanguage::CXX; }
+  }];
+}
+
 //===----------------------------------------------------------------------===//
 // OptInfoAttr
 //===----------------------------------------------------------------------===//
diff --git a/clang/include/clang/CIR/Dialect/IR/CIRDialect.td b/clang/include/clang/CIR/Dialect/IR/CIRDialect.td
index 3fdbf65573b36..c62d63ad42725 100644
--- a/clang/include/clang/CIR/Dialect/IR/CIRDialect.td
+++ b/clang/include/clang/CIR/Dialect/IR/CIRDialect.td
@@ -35,6 +35,7 @@ def CIR_Dialect : Dialect {
   let hasConstantMaterializer = 1;
 
   let extraClassDeclaration = [{
+    static llvm::StringRef getSourceLanguageAttrName() { return "cir.lang"; }
     static llvm::StringRef getTripleAttrName() { return "cir.triple"; }
     static llvm::StringRef getOptInfoAttrName() { return "cir.opt_info"; }
     static llvm::StringRef getCalleeAttrName() { return "callee"; }
diff --git a/clang/lib/CIR/CodeGen/CIRGenModule.cpp b/clang/lib/CIR/CodeGen/CIRGenModule.cpp
index 425250db87da6..19f4858a7848a 100644
--- a/clang/lib/CIR/CodeGen/CIRGenModule.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenModule.cpp
@@ -102,6 +102,9 @@ CIRGenModule::CIRGenModule(mlir::MLIRContext &mlirContext,
   PtrDiffTy =
       cir::IntType::get(&getMLIRContext(), sizeTypeSize, /*isSigned=*/true);
 
+  theModule->setAttr(
+      cir::CIRDialect::getSourceLanguageAttrName(),
+      cir::SourceLanguageAttr::get(&mlirContext, getCIRSourceLanguage()));
   theModule->setAttr(cir::CIRDialect::getTripleAttrName(),
                      builder.getStringAttr(getTriple().str()));
 
@@ -495,6 +498,27 @@ void CIRGenModule::setNonAliasAttributes(GlobalDecl gd, mlir::Operation *op) {
   assert(!cir::MissingFeatures::setTargetAttributes());
 }
 
+cir::SourceLanguage CIRGenModule::getCIRSourceLanguage() const {
+  using ClangStd = clang::LangStandard;
+  using CIRLang = cir::SourceLanguage;
+  auto opts = getLangOpts();
+
+  if (opts.OpenCL && !opts.OpenCLCPlusPlus)
+    llvm_unreachable("NYI");
+
+  if (opts.CPlusPlus || opts.CPlusPlus11 || opts.CPlusPlus14 ||
+      opts.CPlusPlus17 || opts.CPlusPlus20 || opts.CPlusPlus23 ||
+      opts.CPlusPlus26)
+    return CIRLang::CXX;
+  if (opts.C99 || opts.C11 || opts.C17 || opts.C23 ||
+      opts.LangStd == ClangStd::lang_c89 ||
+      opts.LangStd == ClangStd::lang_gnu89)
+    return CIRLang::C;
+
+  // TODO(cir): support remaining source languages.
+  llvm_unreachable("CIR does not yet support the given source language");
+}
+
 static void setLinkageForGV(cir::GlobalOp &gv, const NamedDecl *nd) {
   // Set linkage and visibility in case we never see a definition.
   LinkageInfo lv = nd->getLinkageAndVisibility();
diff --git a/clang/lib/CIR/CodeGen/CIRGenModule.h b/clang/lib/CIR/CodeGen/CIRGenModule.h
index 5d07d38012318..cad5afaa92615 100644
--- a/clang/lib/CIR/CodeGen/CIRGenModule.h
+++ b/clang/lib/CIR/CodeGen/CIRGenModule.h
@@ -423,6 +423,9 @@ class CIRGenModule : public CIRGenTypeCache {
   void replacePointerTypeArgs(cir::FuncOp oldF, cir::FuncOp newF);
 
   void setNonAliasAttributes(GlobalDecl gd, mlir::Operation *op);
+
+  /// Map source language used to a CIR attribute.
+  cir::SourceLanguage getCIRSourceLanguage() const;
 };
 } // namespace CIRGen
 
diff --git a/clang/test/CIR/CodeGen/lang-c.c b/clang/test/CIR/CodeGen/lang-c.c
new file mode 100644
index 0000000000000..cae9d23059bd9
--- /dev/null
+++ b/clang/test/CIR/CodeGen/lang-c.c
@@ -0,0 +1,8 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fclangir -emit-cir %s -o %t.cir
+// RUN: FileCheck --check-prefix=CIR --input-file=%t.cir %s
+
+// CIR: module attributes {{{.*}}cir.lang = #cir.lang<c>{{.*}}}
+
+int main() {
+  return 0;
+}
diff --git a/clang/test/CIR/CodeGen/lang-cpp.cpp b/clang/test/CIR/CodeGen/lang-cpp.cpp
new file mode 100644
index 0000000000000..561d8b66ab967
--- /dev/null
+++ b/clang/test/CIR/CodeGen/lang-cpp.cpp
@@ -0,0 +1,8 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fclangir -emit-cir %s -o %t.cir
+// RUN: FileCheck --check-prefix=CIR --input-file=%t.cir %s
+
+// CIR: module attributes {{{.*}}cir.lang = #cir.lang<cxx>{{.*}}}
+
+int main() {
+  return 0;
+}
diff --git a/clang/test/CIR/IR/invalid-lang-attr.cir b/clang/test/CIR/IR/invalid-lang-attr.cir
new file mode 100644
index 0000000000000..ffe523b1ad401
--- /dev/null
+++ b/clang/test/CIR/IR/invalid-lang-attr.cir
@@ -0,0 +1,5 @@
+// RUN: cir-opt %s -verify-diagnostics
+
+// expected-error@below {{expected ::cir::SourceLanguage to be one of}}
+// expected-error@below {{failed to parse CIR_SourceLanguageAttr parameter 'value'}}
+module attributes {cir.lang = #cir.lang<dummy>} { }
diff --git a/clang/test/CIR/IR/module.cir b/clang/test/CIR/IR/module.cir
new file mode 100644
index 0000000000000..7ce2c0ba21cb0
--- /dev/null
+++ b/clang/test/CIR/IR/module.cir
@@ -0,0 +1,12 @@
+// RUN: cir-opt %s -split-input-file -o %t.cir
+// RUN: FileCheck --input-file=%t.cir %s
+
+// Should parse and print C source language attribute.
+module attributes {cir.lang = #cir.lang<c>} { }
+// CHECK: module attributes {cir.lang = #cir.lang<c>}
+
+// -----
+
+// Should parse and print C++ source language attribute.
+module attributes {cir.lang = #cir.lang<cxx>} { }
+// CHECK: module attributes {cir.lang = #cir.lang<cxx>}
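
For readers less familiar with MLIR enum attributes, the round trip exercised by module.cir and invalid-lang-attr.cir can be modeled in plain C++. This is only a sketch: `SourceLanguage` copies the two cases and keywords from CIRAttrs.td, while `parseSourceLanguage` and `printSourceLanguage` are hypothetical stand-ins for the helpers TableGen generates, not the real MLIR API.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Mirrors the two cases declared in CIRAttrs.td (values 1 and 2).
enum class SourceLanguage : std::uint32_t { C = 1, CXX = 2 };

// Hypothetical stand-in for the generated keyword parser: maps the keyword
// inside `#cir.lang<...>` to an enum case, or std::nullopt if it is unknown.
inline std::optional<SourceLanguage> parseSourceLanguage(std::string_view kw) {
  if (kw == "c")
    return SourceLanguage::C;
  if (kw == "cxx")
    return SourceLanguage::CXX;
  return std::nullopt; // e.g. "dummy", as in invalid-lang-attr.cir
}

// Hypothetical stand-in for the generated printer.
inline std::string_view printSourceLanguage(SourceLanguage lang) {
  switch (lang) {
  case SourceLanguage::C:
    return "c";
  case SourceLanguage::CXX:
    return "cxx";
  }
  return ""; // not reached for valid enum values
}
```

In this model an unknown keyword yields an empty optional, which corresponds to the parse failure that invalid-lang-attr.cir expects to be diagnosed.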

@llvmbot
Member

llvmbot commented Aug 7, 2025

@llvm/pr-subscribers-clangir

Member

@bcardosolopes bcardosolopes left a comment

LGTM

Contributor

@andykaylor andykaylor left a comment

Looks good to me. I have a couple of suggestions, but they can be ignored or put off until later if you like.

@@ -0,0 +1,8 @@
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fclangir -emit-cir %s -o %t.cir
Contributor

This could be combined with the lang-c.c test using the -x c option.
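
A rough sketch of what such a combined test might look like follows. The `CIR-C` and `CIR-CXX` check prefixes, the output file names, and the `foo` function are assumptions made for illustration; the RUN and check patterns are adapted from the two tests in this patch, with `-x c` and `-x c++` forcing the language for the same input file.

```cpp
// Hypothetical combined test (prefixes and temp file names are assumptions).
// RUN: %clang_cc1 -x c -triple x86_64-unknown-linux-gnu -fclangir -emit-cir %s -o %t.c.cir
// RUN: FileCheck --check-prefix=CIR-C --input-file=%t.c.cir %s
// RUN: %clang_cc1 -x c++ -triple x86_64-unknown-linux-gnu -fclangir -emit-cir %s -o %t.cpp.cir
// RUN: FileCheck --check-prefix=CIR-CXX --input-file=%t.cpp.cir %s

// CIR-C: module attributes {{{.*}}cir.lang = #cir.lang<c>{{.*}}}
// CIR-CXX: module attributes {{{.*}}cir.lang = #cir.lang<cxx>{{.*}}}

// The body is irrelevant to the check; it only has to be valid C and C++.
int foo(void) {
  return 0;
}
```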

if (opts.CPlusPlus || opts.CPlusPlus11 || opts.CPlusPlus14 ||
opts.CPlusPlus17 || opts.CPlusPlus20 || opts.CPlusPlus23 ||
opts.CPlusPlus26)
return CIRLang::CXX;
Contributor

Would we want to also encode the language standard used? That's not necessary for this PR, just something to consider.

Contributor Author

Indeed, I'll give it a try in a later patch.

Contributor Author

@seven-mile seven-mile left a comment

Thanks! Suggestions applied.

Contributor

@xlauko xlauko left a comment

lgtm, with minor nits

Comment on lines 506 to 508
if (opts.CPlusPlus || opts.CPlusPlus11 || opts.CPlusPlus14 ||
opts.CPlusPlus17 || opts.CPlusPlus20 || opts.CPlusPlus23 ||
opts.CPlusPlus26)
Contributor

I would hope opts.CPlusPlus is set for every C++ standard, in which case the additional checks are redundant. Could this be simplified to just the following?

Suggested change
-if (opts.CPlusPlus || opts.CPlusPlus11 || opts.CPlusPlus14 ||
-opts.CPlusPlus17 || opts.CPlusPlus20 || opts.CPlusPlus23 ||
-opts.CPlusPlus26)
+if (opts.CPlusPlus)

Contributor Author

Makes sense! Updated.
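
The redundancy argument in this thread can be illustrated with a toy model. It assumes, as the reviewer hopes, that every C++ standard also sets the base `CPlusPlus` bit; the `Feature` bits, `kCxxStandards` table, and function names below are hypothetical and truncated to four standards, so this is a sketch of the reasoning rather than Clang's actual `LangOptions`.

```cpp
#include <cstdint>

// Toy feature bits; names and values are illustrative only.
enum Feature : std::uint32_t {
  CPlusPlus = 1u << 0,
  CPlusPlus11 = 1u << 1,
  CPlusPlus14 = 1u << 2,
  CPlusPlus17 = 1u << 3,
};

// Assumption: each newer standard lists every older bit, including the
// base CPlusPlus bit.
constexpr std::uint32_t kCxxStandards[] = {
    CPlusPlus,
    CPlusPlus | CPlusPlus11,
    CPlusPlus | CPlusPlus11 | CPlusPlus14,
    CPlusPlus | CPlusPlus11 | CPlusPlus14 | CPlusPlus17,
};

// The longer condition from the patch, truncated to the modeled bits.
constexpr bool longCheck(std::uint32_t f) {
  return (f & CPlusPlus) || (f & CPlusPlus11) || (f & CPlusPlus14) ||
         (f & CPlusPlus17);
}

// The simplified condition suggested in review.
constexpr bool shortCheck(std::uint32_t f) { return (f & CPlusPlus) != 0; }

// True if both checks agree on every modeled C++ standard and on a
// configuration with no C++ bits set (for example, plain C).
constexpr bool checksAgree() {
  if (longCheck(0) != shortCheck(0))
    return false;
  for (std::uint32_t f : kCxxStandards)
    if (longCheck(f) != shortCheck(f))
      return false;
  return true;
}

static_assert(checksAgree(), "simplified check must match the long one");
```

Under that assumption the two conditions are interchangeable; if some configuration set a versioned bit without the base bit, they would diverge.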

opts.CPlusPlus17 || opts.CPlusPlus20 || opts.CPlusPlus23 ||
opts.CPlusPlus26)
return CIRLang::CXX;
if (opts.C99 || opts.C11 || opts.C17 || opts.C23 ||
Contributor

Suggested change
-if (opts.C99 || opts.C11 || opts.C17 || opts.C23 ||
+if (opts.C99 || opts.C11 || opts.C17 || opts.C23 || opts.C2y ||

@bcardosolopes
Member

@seven-mile do you need help landing this? Btw, it looks like it needs an update to wrap up testing.

@seven-mile
Contributor Author

@bcardosolopes Yes, I don't have commit access, so please help me get the patch committed 😉. The CI tests should be triggered now.

@andykaylor andykaylor merged commit 761125f into llvm:main Aug 21, 2025
9 checks passed

// TODO(cir): support remaining source languages.
assert(!cir::MissingFeatures::sourceLanguageCases());
errorNYI("CIR does not yet support the given source language");
Contributor

This needs to return a value? @bcardosolopes Should we add a CIRLang::Other?
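
To make the concern concrete: if the diagnostic call returns normally, the function would reach its end without returning a value. The sketch below is not what the patch or any follow-up does; it only shows one alternative to the `CIRLang::Other` case floated above, namely returning `std::optional` so that every path yields a value. `LangOptsModel` and `mapSourceLanguage` are hypothetical names, and the single `C` flag stands in for the C99/C11/... checks.

```cpp
#include <optional>

// Stand-in for clang::LangOptions; field names are illustrative.
struct LangOptsModel {
  bool CPlusPlus = false;
  bool C = false; // collapses the individual C standard checks into one flag
};

// Same two cases as the enum added in CIRAttrs.td.
enum class SourceLanguage { C = 1, CXX = 2 };

// Every path returns: unsupported languages yield std::nullopt, leaving the
// caller to diagnose the case or to skip setting `cir.lang`.
inline std::optional<SourceLanguage>
mapSourceLanguage(const LangOptsModel &opts) {
  if (opts.CPlusPlus)
    return SourceLanguage::CXX;
  if (opts.C)
    return SourceLanguage::C;
  return std::nullopt;
}
```

An explicit `Other` case would instead keep the return type unchanged, at the cost of a value that consumers of the attribute must handle.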
