Skip to content

Conversation

@mmha
Copy link
Contributor

@mmha mmha commented Apr 10, 2025

This patch adds VisitBinAssign and VisitBinComma to the ClangIR ScalarExprEmitter to enable assignments and the comma operator.

@llvmbot llvmbot added clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project labels Apr 10, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 10, 2025

@llvm/pr-subscribers-clangir

@llvm/pr-subscribers-clang

Author: Morris Hafner (mmha)

Changes

This patch adds VisitBinAssign and VisitBinComma to the ClangIR ScalarExprEmitter to enable assignments and the comma operator.


Full diff: https://github.com/llvm/llvm-project/pull/135115.diff

8 Files Affected:

  • (modified) clang/include/clang/CIR/MissingFeatures.h (+1)
  • (modified) clang/lib/CIR/CodeGen/CIRGenDecl.cpp (+8)
  • (modified) clang/lib/CIR/CodeGen/CIRGenExpr.cpp (+7)
  • (modified) clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp (+60)
  • (modified) clang/lib/CIR/CodeGen/CIRGenFunction.h (+8)
  • (added) clang/test/CIR/CodeGen/binassign.c (+56)
  • (added) clang/test/CIR/CodeGen/comma.c (+53)
  • (added) clang/test/CIR/IR/binassign.cir (+45)
diff --git a/clang/include/clang/CIR/MissingFeatures.h b/clang/include/clang/CIR/MissingFeatures.h
index 3188429ea3b1b..19cd9c03635b3 100644
--- a/clang/include/clang/CIR/MissingFeatures.h
+++ b/clang/include/clang/CIR/MissingFeatures.h
@@ -113,6 +113,7 @@ struct MissingFeatures {
   static bool incrementProfileCounter() { return false; }
   static bool insertBuiltinUnpredictable() { return false; }
   static bool objCGC() { return false; }
+  static bool bitfields() { return false; }
 
   // Missing types
   static bool dataMemberType() { return false; }
diff --git a/clang/lib/CIR/CodeGen/CIRGenDecl.cpp b/clang/lib/CIR/CodeGen/CIRGenDecl.cpp
index b8e72f299acb6..58797c5dd253a 100644
--- a/clang/lib/CIR/CodeGen/CIRGenDecl.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenDecl.cpp
@@ -277,3 +277,11 @@ void CIRGenFunction::emitDecl(const Decl &d) {
     cgm.errorNYI(d.getSourceRange(), "emitDecl: unhandled decl type");
   }
 }
+
+void CIRGenFunction::emitNullabilityCheck(LValue lhs, mlir::Value rhs,
+                                          SourceLocation loc) {
+  if (!sanOpts.has(SanitizerKind::NullabilityAssign))
+    return;
+
+  assert(!cir::MissingFeatures::sanitizers());
+}
diff --git a/clang/lib/CIR/CodeGen/CIRGenExpr.cpp b/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
index 4b6652ad0b9e6..66caea26c3636 100644
--- a/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
@@ -219,6 +219,13 @@ void CIRGenFunction::emitStoreOfScalar(mlir::Value value, Address addr,
   assert(!cir::MissingFeatures::opTBAA());
 }
 
+void CIRGenFunction::emitStoreThroughBitfieldLValue(RValue src, LValue dst,
+                                                    mlir::Value &result) {
+  assert(!cir::MissingFeatures::bitfields());
+  cgm.errorNYI("bitfields");
+  result = {};
+}
+
 mlir::Value CIRGenFunction::emitToMemory(mlir::Value value, QualType ty) {
   // Bool has a different representation in memory than in registers,
   // but in ClangIR, it is simply represented as a cir.bool value.
diff --git a/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp b/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
index 4042f5dc23e4b..955082047bd38 100644
--- a/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
@@ -21,6 +21,7 @@
 #include "mlir/IR/Value.h"
 
 #include <cassert>
+#include <utility>
 
 using namespace clang;
 using namespace clang::CIRGen;
@@ -807,6 +808,65 @@ class ScalarExprEmitter : public StmtVisitor<ScalarExprEmitter, mlir::Value> {
   VISITCOMP(EQ)
   VISITCOMP(NE)
 #undef VISITCOMP
+
+  mlir::Value VisitBinAssign(const BinaryOperator *e) {
+    const bool ignore = std::exchange(ignoreResultAssign, false);
+
+    mlir::Value rhs;
+    LValue lhs;
+
+    switch (e->getLHS()->getType().getObjCLifetime()) {
+    case Qualifiers::OCL_Strong:
+    case Qualifiers::OCL_Autoreleasing:
+    case Qualifiers::OCL_ExplicitNone:
+    case Qualifiers::OCL_Weak:
+      assert(!cir::MissingFeatures::objCLifetime());
+      break;
+    case Qualifiers::OCL_None:
+      // __block variables need to have the rhs evaluated first, plus this
+      // should improve codegen just a little.
+      rhs = Visit(e->getRHS());
+      assert(!cir::MissingFeatures::sanitizers());
+      // TODO(cir): This needs to be emitCheckedLValue() once we support
+      // sanitizers
+      lhs = cgf.emitLValue(e->getLHS());
+
+      // Store the value into the LHS. Bit-fields are handled specially because
+      // the result is altered by the store, i.e., [C99 6.5.16p1]
+      // 'An assignment expression has the value of the left operand after the
+      // assignment...'.
+      if (lhs.isBitField()) {
+        cgf.emitStoreThroughBitfieldLValue(RValue::get(rhs), lhs, rhs);
+      } else {
+        cgf.emitNullabilityCheck(lhs, rhs, e->getExprLoc());
+        CIRGenFunction::SourceLocRAIIObject loc{
+            cgf, cgf.getLoc(e->getSourceRange())};
+        cgf.emitStoreThroughLValue(RValue::get(rhs), lhs);
+      }
+    }
+
+    // If the result is clearly ignored, return now.
+    if (ignore)
+      return nullptr;
+
+    // The result of an assignment in C is the assigned r-value.
+    if (!cgf.getLangOpts().CPlusPlus)
+      return rhs;
+
+    // If the lvalue is non-volatile, return the computed value of the
+    // assignment.
+    if (!lhs.isVolatile())
+      return rhs;
+
+    // Otherwise, reload the value.
+    return emitLoadOfLValue(lhs, e->getExprLoc());
+  }
+
+  mlir::Value VisitBinComma(const BinaryOperator *e) {
+    cgf.emitIgnoredExpr(e->getLHS());
+    // NOTE: We don't need to EnsureInsertPoint() like LLVM codegen.
+    return Visit(e->getRHS());
+  }
 };
 
 LValue ScalarExprEmitter::emitCompoundAssignLValue(
diff --git a/clang/lib/CIR/CodeGen/CIRGenFunction.h b/clang/lib/CIR/CodeGen/CIRGenFunction.h
index 6ffa106f2a383..7fb34680d8a9f 100644
--- a/clang/lib/CIR/CodeGen/CIRGenFunction.h
+++ b/clang/lib/CIR/CodeGen/CIRGenFunction.h
@@ -558,6 +558,9 @@ class CIRGenFunction : public CIRGenTypeCache {
   /// is 'Ty'.
   void emitStoreThroughLValue(RValue src, LValue dst, bool isInit = false);
 
+  void emitStoreThroughBitfieldLValue(RValue src, LValue dst,
+                                      mlir::Value &result);
+
   /// Given a value and its clang type, returns the value casted to its memory
   /// representation.
   /// Note: CIR defers most of the special casting to the final lowering passes
@@ -572,6 +575,11 @@ class CIRGenFunction : public CIRGenTypeCache {
 
   mlir::LogicalResult emitWhileStmt(const clang::WhileStmt &s);
 
+  /// Given an assignment `*lhs = rhs`, emit a test that checks if \p rhs is
+  /// nonnull, if 1\p LHS is marked _Nonnull.
+  void emitNullabilityCheck(LValue lhs, mlir::Value rhs,
+                            clang::SourceLocation loc);
+
   /// ----------------------
   /// CIR build helpers
   /// -----------------
diff --git a/clang/test/CIR/CodeGen/binassign.c b/clang/test/CIR/CodeGen/binassign.c
new file mode 100644
index 0000000000000..5a7f0b7fd3f79
--- /dev/null
+++ b/clang/test/CIR/CodeGen/binassign.c
@@ -0,0 +1,56 @@
+// RUN: %clang_cc1 -std=c23 -triple x86_64-unknown-linux-gnu -fclangir -emit-cir %s -o %t.cir
+// RUN: FileCheck --input-file=%t.cir %s -check-prefix=CIR
+// RUN: %clang_cc1 -std=c23 -triple x86_64-unknown-linux-gnu -Wno-unused-value -fclangir -emit-llvm %s -o %t-cir.ll
+// RUN: FileCheck --input-file=%t-cir.ll %s -check-prefix=LLVM
+// RUN: %clang_cc1 -std=c23 -triple x86_64-unknown-linux-gnu -Wno-unused-value -emit-llvm %s -o %t.ll
+// RUN: FileCheck --input-file=%t.ll %s -check-prefix=OGCG
+
+void binary_assign(void) {
+    bool b;
+    char c;
+    float f;
+    int i;
+
+    b = true;
+    c = 65;
+    f = 3.14f;
+    i = 42;
+}
+
+// CIR-LABEL: cir.func @binary_assign() {
+// CIR:         %[[B:.*]] = cir.alloca !cir.bool, !cir.ptr<!cir.bool>, ["b"]
+// CIR:         %[[C:.*]] = cir.alloca !s8i, !cir.ptr<!s8i>, ["c"]
+// CIR:         %[[F:.*]] = cir.alloca !cir.float, !cir.ptr<!cir.float>, ["f"]
+// CIR:         %[[I:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["i"]
+// CIR:         %[[TRUE:.*]] = cir.const #true
+// CIR:         cir.store %[[TRUE]], %[[B]] : !cir.bool, !cir.ptr<!cir.bool>
+// CIR:         %[[CHAR_INI_INIT:.*]] = cir.const #cir.int<65> : !s32i
+// CIR:         %[[CHAR_VAL:.*]] = cir.cast(integral, %[[CHAR_INI_INIT]] : !s32i), !s8i
+// CIR:         cir.store %[[CHAR_VAL]], %[[C]] : !s8i, !cir.ptr<!s8i>
+// CIR:         %[[FLOAT_VAL:.*]] = cir.const #cir.fp<3.140000e+00> : !cir.float
+// CIR:         cir.store %[[FLOAT_VAL]], %[[F]] : !cir.float, !cir.ptr<!cir.float>
+// CIR:         %[[INT_VAL:.*]] = cir.const #cir.int<42> : !s32i
+// CIR:         cir.store %[[INT_VAL]], %[[I]] : !s32i, !cir.ptr<!s32i>
+// CIR:         cir.return
+
+// LLVM-LABEL: define {{.*}}void @binary_assign() {
+// LLVM:         %[[B_PTR:.*]] = alloca i8
+// LLVM:         %[[C_PTR:.*]] = alloca i8
+// LLVM:         %[[F_PTR:.*]] = alloca float
+// LLVM:         %[[I_PTR:.*]] = alloca i32
+// LLVM:         store i8 1, ptr %[[B_PTR]]
+// LLVM:         store i8 65, ptr %[[C_PTR]]
+// LLVM:         store float 0x40091EB860000000, ptr %[[F_PTR]]
+// LLVM:         store i32 42, ptr %[[I_PTR]]
+// LLVM:         ret void
+
+// OGCG-LABEL: define {{.*}}void @binary_assign()
+// OGCG:         %[[B_PTR:.*]] = alloca i8
+// OGCG:         %[[C_PTR:.*]] = alloca i8
+// OGCG:         %[[F_PTR:.*]] = alloca float
+// OGCG:         %[[I_PTR:.*]] = alloca i32
+// OGCG:         store i8 1, ptr %[[B_PTR]]
+// OGCG:         store i8 65, ptr %[[C_PTR]]
+// OGCG:         store float 0x40091EB860000000, ptr %[[F_PTR]]
+// OGCG:         store i32 42, ptr %[[I_PTR]]
+// OGCG:         ret void
diff --git a/clang/test/CIR/CodeGen/comma.c b/clang/test/CIR/CodeGen/comma.c
new file mode 100644
index 0000000000000..208a496c26e44
--- /dev/null
+++ b/clang/test/CIR/CodeGen/comma.c
@@ -0,0 +1,53 @@
+// RUN: %clang_cc1 -std=c23 -triple x86_64-unknown-linux-gnu -fclangir -emit-cir %s -o %t.cir
+// RUN: FileCheck --input-file=%t.cir %s -check-prefix=CIR
+// RUN: %clang_cc1 -std=c23 -triple x86_64-unknown-linux-gnu -Wno-unused-value -fclangir -emit-llvm %s -o %t-cir.ll
+// RUN: FileCheck --input-file=%t-cir.ll %s -check-prefix=LLVM
+// RUN: %clang_cc1 -std=c23 -triple x86_64-unknown-linux-gnu -Wno-unused-value -emit-llvm %s -o %t.ll
+// RUN: FileCheck --input-file=%t.ll %s -check-prefix=OGCG
+
+void comma(void) {
+    bool b;
+    char c;
+    float f;
+    int i;
+
+    b = true, c = 65, f = 3.14f, i = 42;
+}
+
+// CIR-LABEL: cir.func @comma() {
+// CIR:         %[[B:.*]] = cir.alloca !cir.bool, !cir.ptr<!cir.bool>, ["b"]
+// CIR:         %[[C:.*]] = cir.alloca !s8i, !cir.ptr<!s8i>, ["c"]
+// CIR:         %[[F:.*]] = cir.alloca !cir.float, !cir.ptr<!cir.float>, ["f"]
+// CIR:         %[[I:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["i"]
+// CIR:         %[[TRUE:.*]] = cir.const #true
+// CIR:         cir.store %[[TRUE]], %[[B]] : !cir.bool, !cir.ptr<!cir.bool>
+// CIR:         %[[CHAR_INI_INIT:.*]] = cir.const #cir.int<65> : !s32i
+// CIR:         %[[CHAR_VAL:.*]] = cir.cast(integral, %[[CHAR_INI_INIT]] : !s32i), !s8i
+// CIR:         cir.store %[[CHAR_VAL]], %[[C]] : !s8i, !cir.ptr<!s8i>
+// CIR:         %[[FLOAT_VAL:.*]] = cir.const #cir.fp<3.140000e+00> : !cir.float
+// CIR:         cir.store %[[FLOAT_VAL]], %[[F]] : !cir.float, !cir.ptr<!cir.float>
+// CIR:         %[[INT_VAL:.*]] = cir.const #cir.int<42> : !s32i
+// CIR:         cir.store %[[INT_VAL]], %[[I]] : !s32i, !cir.ptr<!s32i>
+// CIR:         cir.return
+
+// LLVM-LABEL: define {{.*}}void @comma() {
+// LLVM:         %[[B_PTR:.*]] = alloca i8
+// LLVM:         %[[C_PTR:.*]] = alloca i8
+// LLVM:         %[[F_PTR:.*]] = alloca float
+// LLVM:         %[[I_PTR:.*]] = alloca i32
+// LLVM:         store i8 1, ptr %[[B_PTR]]
+// LLVM:         store i8 65, ptr %[[C_PTR]]
+// LLVM:         store float 0x40091EB860000000, ptr %[[F_PTR]]
+// LLVM:         store i32 42, ptr %[[I_PTR]]
+// LLVM:         ret void
+
+// OGCG-LABEL: define {{.*}}void @comma()
+// OGCG:         %[[B_PTR:.*]] = alloca i8
+// OGCG:         %[[C_PTR:.*]] = alloca i8
+// OGCG:         %[[F_PTR:.*]] = alloca float
+// OGCG:         %[[I_PTR:.*]] = alloca i32
+// OGCG:         store i8 1, ptr %[[B_PTR]]
+// OGCG:         store i8 65, ptr %[[C_PTR]]
+// OGCG:         store float 0x40091EB860000000, ptr %[[F_PTR]]
+// OGCG:         store i32 42, ptr %[[I_PTR]]
+// OGCG:         ret void
diff --git a/clang/test/CIR/IR/binassign.cir b/clang/test/CIR/IR/binassign.cir
new file mode 100644
index 0000000000000..24ed95d3c29c7
--- /dev/null
+++ b/clang/test/CIR/IR/binassign.cir
@@ -0,0 +1,45 @@
+// RUN: cir-opt %s | cir-opt | FileCheck %s
+
+!s32i = !cir.int<s, 32>
+!s8i = !cir.int<s, 8>
+#true = #cir.bool<true> : !cir.bool
+module {
+  cir.func @binary_assign() {
+    %0 = cir.alloca !cir.bool, !cir.ptr<!cir.bool>, ["b"] {alignment = 1 : i64}
+    %1 = cir.alloca !s8i, !cir.ptr<!s8i>, ["c"] {alignment = 1 : i64}
+    %2 = cir.alloca !cir.float, !cir.ptr<!cir.float>, ["f"] {alignment = 4 : i64}
+    %3 = cir.alloca !s32i, !cir.ptr<!s32i>, ["i"] {alignment = 4 : i64}
+    %4 = cir.const #true
+    cir.store %4, %0 : !cir.bool, !cir.ptr<!cir.bool>
+    %5 = cir.const #cir.int<65> : !s32i
+    %6 = cir.cast(integral, %5 : !s32i), !s8i
+    cir.store %6, %1 : !s8i, !cir.ptr<!s8i>
+    %7 = cir.const #cir.fp<3.140000e+00> : !cir.float
+    cir.store %7, %2 : !cir.float, !cir.ptr<!cir.float>
+    %8 = cir.const #cir.int<42> : !s32i
+    cir.store %8, %3 : !s32i, !cir.ptr<!s32i>
+    cir.return
+  }
+}
+
+// CHECK: !s32i = !cir.int<s, 32>
+// CHECK: !s8i = !cir.int<s, 8>
+// CHECK: #true = #cir.bool<true> : !cir.bool
+// CHECK: module {
+// CHECK:   cir.func @binary_assign() {
+// CHECK:     %0 = cir.alloca !cir.bool, !cir.ptr<!cir.bool>, ["b"] {alignment = 1 : i64}
+// CHECK:     %1 = cir.alloca !s8i, !cir.ptr<!s8i>, ["c"] {alignment = 1 : i64}
+// CHECK:     %2 = cir.alloca !cir.float, !cir.ptr<!cir.float>, ["f"] {alignment = 4 : i64}
+// CHECK:     %3 = cir.alloca !s32i, !cir.ptr<!s32i>, ["i"] {alignment = 4 : i64}
+// CHECK:     %4 = cir.const #true
+// CHECK:     cir.store %4, %0 : !cir.bool, !cir.ptr<!cir.bool>
+// CHECK:     %5 = cir.const #cir.int<65> : !s32i
+// CHECK:     %6 = cir.cast(integral, %5 : !s32i), !s8i
+// CHECK:     cir.store %6, %1 : !s8i, !cir.ptr<!s8i>
+// CHECK:     %7 = cir.const #cir.fp<3.140000e+00> : !cir.float
+// CHECK:     cir.store %7, %2 : !cir.float, !cir.ptr<!cir.float>
+// CHECK:     %8 = cir.const #cir.int<42> : !s32i
+// CHECK:     cir.store %8, %3 : !s32i, !cir.ptr<!s32i>
+// CHECK:     cir.return
+// CHECK:   }
+// CHECK: }

@mmha
Copy link
Contributor Author

mmha commented Apr 10, 2025

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little more testing on comma would be nice, else seems reasonable.

float f;
int i;

b = true, c = 65, f = 3.14f, i = 42;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this ends up being the comma operator (i Think it is?) we probably want to do some tests like:
b = 5.5, false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I did add another test for i = 100, 200; but we don't support ParenExprs, yet, so I can't really check that the result of that expression is 200 for now.


mlir::Value VisitBinComma(const BinaryOperator *e) {
cgf.emitIgnoredExpr(e->getLHS());
// NOTE: We don't need to EnsureInsertPoint() like LLVM codegen.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a difference in the way LLVM IR and MLIR are generated. You can create LLVM IR instructions in the void and insert them later. MLIR operations are always created in some region or block. The classic codegen calls EnsureInsertPoint to create a basic block to receive emitted instructions. It calls it all over the place. I'm not sure why we'd have a comment about it here. This is the only reference to EnsureInsertPoint in the incubator code.

Copy link
Contributor

@andykaylor andykaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. I have a few comments, but nothing that needs to be changed immediately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's odd that the result is returned in via argument here. Do you know why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't and you raise a good point. OGCG uses a llvm::Value ** here as an optional return value and it makes sense there because that results in fewer dead instruction emitted if the result is discarded but for CIR that doesn't apply.

I changed the signature.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's strange that this makes Objective C seem like the normal case. I would have expected something like if (objC) { switch().... I found a couple of other places do something like if (auto ownership = FQT.getObjCLifetime()) and then a switch like this. I wonder if that would result in better performance for the non-ObjectiveC path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don't know about the performance implications (I'd expect them to be insignificant) but I find your suggestion to be less readable. I still need a case Qualifiers::OCL_None: with an llvm_unreachable() to satisfy clang-tidy and it just adds more branches to the function to follow


mlir::Value VisitBinComma(const BinaryOperator *e) {
cgf.emitIgnoredExpr(e->getLHS());
// NOTE: We don't need to EnsureInsertPoint() like LLVM codegen.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a difference in the way LLVM IR and MLIR are generated. You can create LLVM IR instructions in the void and insert them later. MLIR operations are always created in some region or block. The classic codegen calls EnsureInsertPoint to create a basic block to receive emitted instructions. It calls it all over the place. I'm not sure why we'd have a comment about it here. This is the only reference to EnsureInsertPoint in the incubator code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is out of order alphabetically. A few other things are, so we need to do a cleanup pass anyway.

mmha added 3 commits April 11, 2025 16:17
This patch adds `VisitBinAssign` and `VisitBinComma` to the ClangIR `ScalarExprEmitter` to enable assignments and the comma operator.
@github-actions
Copy link

github-actions bot commented Apr 11, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@erichkeane erichkeane merged commit 566c30e into llvm:main Apr 11, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants