Skip to content

Conversation

@Andres-Salamanca
Copy link
Contributor

This PR fixes the access to bitfields inside a union.
Previously, we were using a getMemberOp to access the field, but because it is a union, getMemberOp would always use index 0.

For example, given:

typedef union {
  int x;
  int y : 4;
  int z : 8;
} demo;
!rec_demo = !cir.record<union "demo" {!s32i, !u8i, !u8i}>

In the case of:

d.y = 2;

It would generate:

cir.get_member %0[0] {name = "y"} : !cir.ptr<!rec_demo> -> !cir.ptr<!s32i>

with a return type of !s32i, when it should be !u8i.
the get_member verifier would detect that the return type does not match the y member.
To fix this, we now use bitcast to get the start of the union.

@llvmbot llvmbot added clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project labels Aug 19, 2025
@llvmbot
Copy link
Member

llvmbot commented Aug 19, 2025

@llvm/pr-subscribers-clang

Author: None (Andres-Salamanca)

Changes

This PR fixes the access to bitfields inside a union.
Previously, we were using a getMemberOp to access the field, but because it is a union, getMemberOp would always use index 0.

For example, given:

typedef union {
  int x;
  int y : 4;
  int z : 8;
} demo;
!rec_demo = !cir.record&lt;union "demo" {!s32i, !u8i, !u8i}&gt;

In the case of:

d.y = 2;

It would generate:

cir.get_member %0[0] {name = "y"} : !cir.ptr&lt;!rec_demo&gt; -&gt; !cir.ptr&lt;!s32i&gt;

with a return type of !s32i, when it should be !u8i.
the get_member verifier would detect that the return type does not match the y member.
To fix this, we now use bitcast to get the start of the union.


Full diff: https://github.com/llvm/llvm-project/pull/154398.diff

2 Files Affected:

  • (modified) clang/lib/CIR/CodeGen/CIRGenExpr.cpp (+3-1)
  • (modified) clang/test/CIR/CodeGen/bitfield-union.c (+41)
diff --git a/clang/lib/CIR/CodeGen/CIRGenExpr.cpp b/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
index 8bcca6f5d1803..16cf71cf4d56e 100644
--- a/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
@@ -372,9 +372,11 @@ Address CIRGenFunction::getAddrOfBitFieldStorage(LValue base,
                                                  unsigned index) {
   mlir::Location loc = getLoc(field->getLocation());
   cir::PointerType fieldPtr = cir::PointerType::get(fieldType);
+  auto rec = cast<cir::RecordType>(base.getAddress().getElementType());
+  if (index == 0 && rec.isUnion())
+    return base.getAddress();
   cir::GetMemberOp sea = getBuilder().createGetMember(
       loc, fieldPtr, base.getPointer(), field->getName(), index);
-  auto rec = cast<cir::RecordType>(base.getAddress().getElementType());
   CharUnits offset = CharUnits::fromQuantity(
       rec.getElementOffset(cgm.getDataLayout().layout, index));
   return Address(sea, base.getAlignment().alignmentAtOffset(offset));
diff --git a/clang/test/CIR/CodeGen/bitfield-union.c b/clang/test/CIR/CodeGen/bitfield-union.c
index b5d14540f488f..4928be410479c 100644
--- a/clang/test/CIR/CodeGen/bitfield-union.c
+++ b/clang/test/CIR/CodeGen/bitfield-union.c
@@ -28,3 +28,44 @@ typedef union {
 
 demo d;
 zero_bit z;
+
+int main() {
+    demo d;
+    d.x = 1;
+    d.y = 2;
+    d.z = 0;
+}
+
+// CIR: #bfi_y = #cir.bitfield_info<name = "y", storage_type = !u8i, size = 4, offset = 0, is_signed = true>
+// CIR: #bfi_z = #cir.bitfield_info<name = "z", storage_type = !u8i, size = 8, offset = 0, is_signed = true>
+
+// CIR:   cir.func no_proto dso_local @main
+// CIR:    [[ALLOC:%.*]] = cir.alloca !rec_demo, !cir.ptr<!rec_demo>, ["d"] {alignment = 4 : i64}
+// CIR:    [[ONE:%.*]] = cir.const #cir.int<1> : !s32i
+// CIR:    [[X:%.*]] = cir.get_member [[ALLOC]][0] {name = "x"} : !cir.ptr<!rec_demo> -> !cir.ptr<!s32i>
+// CIR:    cir.store align(4) [[ONE]], [[X]] : !s32i, !cir.ptr<!s32i>
+// CIR:    [[TWO:%.*]] = cir.const #cir.int<2> : !s32i
+// CIR:    [[BITCAST:%.*]] = cir.cast(bitcast, [[ALLOC]] : !cir.ptr<!rec_demo>), !cir.ptr<!u8i>
+// CIR:    [[SET:%.*]] = cir.set_bitfield align(4) (#bfi_y, [[BITCAST]] : !cir.ptr<!u8i>, [[TWO]] : !s32i) -> !s32i
+// CIR:    [[ZERO:%.*]] = cir.const #cir.int<0> : !s32i
+// CIR:    [[BITCAST2:%.*]] = cir.cast(bitcast, [[ALLOC]] : !cir.ptr<!rec_demo>), !cir.ptr<!u8i>
+// CIR:    [[SET2:%.*]] = cir.set_bitfield align(4) (#bfi_z, [[BITCAST2]] : !cir.ptr<!u8i>, [[ZERO]] : !s32i) -> !s32i
+// CIR:    cir.return
+
+// LLVM: define dso_local i32 @main
+// LLVM:   [[ALLOC:%.*]] = alloca %union.demo, i64 1, align 4
+// LLVM:   store i32 1, ptr [[ALLOC]], align 4
+// LLVM:   [[BFLOAD:%.*]] = load i8, ptr [[ALLOC]], align 4
+// LLVM:   [[CLEAR:%.*]] = and i8 [[BFLOAD]], -16
+// LLVM:   [[SET:%.*]] = or i8 [[CLEAR]], 2
+// LLVM:   store i8 [[SET]], ptr [[ALLOC]], align 4
+// LLVM:   store i8 0, ptr [[ALLOC]], align 4
+
+// OGCG: define dso_local i32 @main
+// OGCG:   [[ALLOC:%.*]] = alloca %union.demo, align 4
+// OGCG:   store i32 1, ptr [[ALLOC]], align 4
+// OGCG:   [[BFLOAD:%.*]] = load i8, ptr [[ALLOC]], align 4
+// OGCG:   [[CLEAR:%.*]] = and i8 [[BFLOAD]], -16
+// OGCG:   [[SET:%.*]] = or i8 [[CLEAR]], 2
+// OGCG:   store i8 [[SET]], ptr [[ALLOC]], align 4
+// OGCG:   store i8 0, ptr [[ALLOC]], align 4

@llvmbot
Copy link
Member

llvmbot commented Aug 19, 2025

@llvm/pr-subscribers-clangir

Author: None (Andres-Salamanca)

Changes

This PR fixes the access to bitfields inside a union.
Previously, we were using a getMemberOp to access the field, but because it is a union, getMemberOp would always use index 0.

For example, given:

typedef union {
  int x;
  int y : 4;
  int z : 8;
} demo;
!rec_demo = !cir.record&lt;union "demo" {!s32i, !u8i, !u8i}&gt;

In the case of:

d.y = 2;

It would generate:

cir.get_member %0[0] {name = "y"} : !cir.ptr&lt;!rec_demo&gt; -&gt; !cir.ptr&lt;!s32i&gt;

with a return type of !s32i, when it should be !u8i.
the get_member verifier would detect that the return type does not match the y member.
To fix this, we now use bitcast to get the start of the union.


Full diff: https://github.com/llvm/llvm-project/pull/154398.diff

2 Files Affected:

  • (modified) clang/lib/CIR/CodeGen/CIRGenExpr.cpp (+3-1)
  • (modified) clang/test/CIR/CodeGen/bitfield-union.c (+41)
diff --git a/clang/lib/CIR/CodeGen/CIRGenExpr.cpp b/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
index 8bcca6f5d1803..16cf71cf4d56e 100644
--- a/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
@@ -372,9 +372,11 @@ Address CIRGenFunction::getAddrOfBitFieldStorage(LValue base,
                                                  unsigned index) {
   mlir::Location loc = getLoc(field->getLocation());
   cir::PointerType fieldPtr = cir::PointerType::get(fieldType);
+  auto rec = cast<cir::RecordType>(base.getAddress().getElementType());
+  if (index == 0 && rec.isUnion())
+    return base.getAddress();
   cir::GetMemberOp sea = getBuilder().createGetMember(
       loc, fieldPtr, base.getPointer(), field->getName(), index);
-  auto rec = cast<cir::RecordType>(base.getAddress().getElementType());
   CharUnits offset = CharUnits::fromQuantity(
       rec.getElementOffset(cgm.getDataLayout().layout, index));
   return Address(sea, base.getAlignment().alignmentAtOffset(offset));
diff --git a/clang/test/CIR/CodeGen/bitfield-union.c b/clang/test/CIR/CodeGen/bitfield-union.c
index b5d14540f488f..4928be410479c 100644
--- a/clang/test/CIR/CodeGen/bitfield-union.c
+++ b/clang/test/CIR/CodeGen/bitfield-union.c
@@ -28,3 +28,44 @@ typedef union {
 
 demo d;
 zero_bit z;
+
+int main() {
+    demo d;
+    d.x = 1;
+    d.y = 2;
+    d.z = 0;
+}
+
+// CIR: #bfi_y = #cir.bitfield_info<name = "y", storage_type = !u8i, size = 4, offset = 0, is_signed = true>
+// CIR: #bfi_z = #cir.bitfield_info<name = "z", storage_type = !u8i, size = 8, offset = 0, is_signed = true>
+
+// CIR:   cir.func no_proto dso_local @main
+// CIR:    [[ALLOC:%.*]] = cir.alloca !rec_demo, !cir.ptr<!rec_demo>, ["d"] {alignment = 4 : i64}
+// CIR:    [[ONE:%.*]] = cir.const #cir.int<1> : !s32i
+// CIR:    [[X:%.*]] = cir.get_member [[ALLOC]][0] {name = "x"} : !cir.ptr<!rec_demo> -> !cir.ptr<!s32i>
+// CIR:    cir.store align(4) [[ONE]], [[X]] : !s32i, !cir.ptr<!s32i>
+// CIR:    [[TWO:%.*]] = cir.const #cir.int<2> : !s32i
+// CIR:    [[BITCAST:%.*]] = cir.cast(bitcast, [[ALLOC]] : !cir.ptr<!rec_demo>), !cir.ptr<!u8i>
+// CIR:    [[SET:%.*]] = cir.set_bitfield align(4) (#bfi_y, [[BITCAST]] : !cir.ptr<!u8i>, [[TWO]] : !s32i) -> !s32i
+// CIR:    [[ZERO:%.*]] = cir.const #cir.int<0> : !s32i
+// CIR:    [[BITCAST2:%.*]] = cir.cast(bitcast, [[ALLOC]] : !cir.ptr<!rec_demo>), !cir.ptr<!u8i>
+// CIR:    [[SET2:%.*]] = cir.set_bitfield align(4) (#bfi_z, [[BITCAST2]] : !cir.ptr<!u8i>, [[ZERO]] : !s32i) -> !s32i
+// CIR:    cir.return
+
+// LLVM: define dso_local i32 @main
+// LLVM:   [[ALLOC:%.*]] = alloca %union.demo, i64 1, align 4
+// LLVM:   store i32 1, ptr [[ALLOC]], align 4
+// LLVM:   [[BFLOAD:%.*]] = load i8, ptr [[ALLOC]], align 4
+// LLVM:   [[CLEAR:%.*]] = and i8 [[BFLOAD]], -16
+// LLVM:   [[SET:%.*]] = or i8 [[CLEAR]], 2
+// LLVM:   store i8 [[SET]], ptr [[ALLOC]], align 4
+// LLVM:   store i8 0, ptr [[ALLOC]], align 4
+
+// OGCG: define dso_local i32 @main
+// OGCG:   [[ALLOC:%.*]] = alloca %union.demo, align 4
+// OGCG:   store i32 1, ptr [[ALLOC]], align 4
+// OGCG:   [[BFLOAD:%.*]] = load i8, ptr [[ALLOC]], align 4
+// OGCG:   [[CLEAR:%.*]] = and i8 [[BFLOAD]], -16
+// OGCG:   [[SET:%.*]] = or i8 [[CLEAR]], 2
+// OGCG:   store i8 [[SET]], ptr [[ALLOC]], align 4
+// OGCG:   store i8 0, ptr [[ALLOC]], align 4

mlir::Location loc = getLoc(field->getLocation());
cir::PointerType fieldPtr = cir::PointerType::get(fieldType);
auto rec = cast<cir::RecordType>(base.getAddress().getElementType());
if (index == 0 && rec.isUnion())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is index ever non-zero for unions? I see that the incubator only checks if (index == 0) here and returns base.getAddress() in that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in the incubator we handled it that way, but in this discussion we decided to remove that:
#145971 (comment)
As for “Is index ever non-zero for unions?”, the answer is no we always get zero for bitfields in unions, because of here:

unsigned idx = layout.getCIRFieldNo(field);
Address addr = getAddrOfBitFieldStorage(base, field, info.storageType, idx);

// This maps a field to its index. For unions, the index is always 0.
fieldIdxMap[field->getCanonicalDecl()] = 0;

One thing to note is that when loading a non-bitfield, it does receive an index different from 0:
https://godbolt.org/z/vMTc74Go1

That index comes from here:

if (rec->isUnion())
fieldIndex = field->getFieldIndex();
else {

But in the end, we’re still using index 0 when calculating the offset:

unsigned idx = layout.getCIRFieldNo(field);
CharUnits offset = CharUnits::fromQuantity(
layout.getCIRType().getElementOffset(cgm.getDataLayout().layout, idx));
return Address(memberAddr, base.getAlignment().alignmentAtOffset(offset));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-      loc, fieldPtr, base.getPointer(), field->getName(), index);
+      loc, fieldPtr, base.getPointer(), field->getName(), rec.isUnion() ? field->getFieldIndex() : index);

I changed it to use field->getFieldIndex(), similar to how CIRGenFunction::emitAddrOfFieldStorage does.

demo d;
zero_bit z;

int main() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int main() {
void f() {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

@andykaylor andykaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! I like the new changes. I still wouldn't name the function 'main' in the test though.

@Andres-Salamanca Andres-Salamanca merged commit 577ee97 into llvm:main Aug 19, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants