Merged
77 changes: 77 additions & 0 deletions clang/include/clang/CIR/Dialect/IR/CIROps.td
@@ -3415,4 +3415,81 @@ def CIR_FAbsOp : CIR_UnaryFPToFPBuiltinOp<"fabs", "FAbsOp"> {
}];
}

//===----------------------------------------------------------------------===//
// Variadic Operations
//===----------------------------------------------------------------------===//

def CIR_VAStartOp : CIR_Op<"va.start"> {
let summary = "Starts a variable argument list";
let description = [{
The cir.va.start operation models the C/C++ va_start macro by
initializing a variable argument list at the given va_list storage
location.

The operand must be a pointer to the target's `va_list` representation.
This operation has no results and produces its effect by mutating the
storage referenced by the pointer operand.

Each `cir.va.start` must be paired with a corresponding `cir.va.end`
on the same logical `va_list` object along all control-flow paths. After
`cir.va.end`, the `va_list` must not be accessed unless reinitialized
with another `cir.va.start`.

Lowering typically maps this to the LLVM intrinsic `llvm.va_start`,
Contributor commented:

Typically? Are there cases where it doesn't?

Member commented:

I remember some issues with va_arg (where OG might not lower to the LLVM intrinsic - llvm/clangir#862), but I'm not sure about va.start.

Contributor Author commented:

My mind was at "What if we don't lower to LLVM?". I removed the "typically".

The LLVM intrinsic has a few problems: it's not a complete solution and sometimes generates worse code, which is why classic codegen only uses it as a default in its ABI lowering. See here or here.

In #153834 I added lowering to llvm.va_arg as a default in LowerToLLVM. Once we get proper ABI lowering in LoweringPrepare (or at a similar stage) upstream, a target can replace cir.va_arg with something else. But for now this is a good stopgap to enable things like scalar varargs on most platforms.

Member commented:

sounds good to me
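
For context, the "scalar varargs" case mentioned in this thread corresponds to C code along the following lines. This is only an illustrative sketch: `sum_ints` is a hypothetical name that does not appear in this PR, and the `va_arg` step depends on the separate lowering discussed in #153834 rather than on the `cir.va.start` and `cir.va.end` ops added here.

```c
#include <stdarg.h>

/* Hypothetical example function, not part of the PR.
 * Sums `count` int arguments that follow the named parameter. */
int sum_ints(int count, ...) {
    va_list args;
    int total = 0;

    va_start(args, count);           /* modeled by cir.va.start */
    for (int i = 0; i < count; ++i)
        total += va_arg(args, int);  /* scalar va_arg, see #153834 */
    va_end(args);                    /* modeled by cir.va.end */

    return total;
}
```

A call such as `sum_ints(3, 10, 20, 12)` reads three `int` values from the list and returns their sum.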

passing the appropriately decayed pointer to the underlying `va_list`
storage.

Example:

```mlir
// %args : !cir.ptr<!cir.array<!rec___va_list_tag x 1>>
%p = cir.cast(array_to_ptrdecay, %args
: !cir.ptr<!cir.array<!rec___va_list_tag x 1>>),
!cir.ptr<!rec___va_list_tag>
cir.va.start %p : !cir.ptr<!rec___va_list_tag>
```
}];
let arguments = (ins CIR_PointerType:$arg_list);
Contributor Author (@mmha, Aug 15, 2025) commented:


@bcardosolopes I assume `cir.va.start` drops the length info from the builtin because the LLVM intrinsic does, but I think it would be useful to retain it in CIR for static code analysis purposes. What's your opinion?

Member commented:

I'm down for retaining it!


let assemblyFormat = [{
$arg_list attr-dict `:` type(operands)
}];
}

def CIR_VAEndOp : CIR_Op<"va.end"> {
let summary = "Ends a variable argument list";
let description = [{
The `cir.va.end` operation models the C/C++ va_end macro by finalizing
and cleaning up a variable argument list previously initialized with
`cir.va.start`.

The operand must be a pointer to the target's `va_list` representation.
This operation has no results and produces its effect by mutating the
storage referenced by the pointer operand.

`cir.va.end` must only be called after a matching `cir.va.start` on the
same `va_list` along all control-flow paths. After `cir.va.end`, the
`va_list` is invalid and must not be accessed unless reinitialized.

Lowering typically maps this to the LLVM intrinsic `llvm.va_end`,
passing the appropriately decayed pointer to the underlying `va_list`
storage.

Example:
```mlir
// %args : !cir.ptr<!cir.array<!rec___va_list_tag x 1>>
%p = cir.cast(array_to_ptrdecay, %args
: !cir.ptr<!cir.array<!rec___va_list_tag x 1>>),
!cir.ptr<!rec___va_list_tag>
cir.va.end %p : !cir.ptr<!rec___va_list_tag>
```
}];

let arguments = (ins CIR_PointerType:$arg_list);

let assemblyFormat = [{
$arg_list attr-dict `:` type(operands)
}];
}

#endif // CLANG_CIR_DIALECT_IR_CIROPS_TD
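
The pairing rule in the op descriptions above mirrors the C requirement that every `va_start` be matched by a `va_end` in the same function. A minimal C sketch of one way to satisfy it on every path is to funnel all exits through a single `va_end`. The function name `first_non_null` is hypothetical and not part of this PR.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical example function, not part of the PR.
 * Returns the first non-NULL string among `count` variadic
 * `const char *` arguments, or NULL if there is none. */
const char *first_non_null(int count, ...) {
    va_list args;
    const char *result = NULL;

    va_start(args, count);
    for (int i = 0; i < count; ++i) {
        const char *s = va_arg(args, const char *);
        if (s != NULL) {
            result = s;
            break;  /* leave the loop, but do not return before va_end */
        }
    }
    va_end(args);   /* reached on both the "found" and "not found" paths */

    return result;
}
```

Returning directly from inside the loop would skip `va_end` on that path, which is the situation the `cir.va.start`/`cir.va.end` descriptions rule out.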
21 changes: 21 additions & 0 deletions clang/lib/CIR/CodeGen/CIRGenBuiltin.cpp
@@ -125,6 +125,18 @@ RValue CIRGenFunction::emitBuiltinExpr(const GlobalDecl &gd, unsigned builtinID,
default:
break;

// C stdarg builtins.
case Builtin::BI__builtin_stdarg_start:
case Builtin::BI__builtin_va_start:
case Builtin::BI__va_start:
case Builtin::BI__builtin_va_end: {
emitVAStartEnd(builtinID == Builtin::BI__va_start
? emitScalarExpr(e->getArg(0))
: emitVAListRef(e->getArg(0)).getPointer(),
builtinID != Builtin::BI__builtin_va_end);
return {};
}

case Builtin::BIfabs:
case Builtin::BIfabsf:
case Builtin::BIfabsl:
@@ -361,3 +373,12 @@ mlir::Value CIRGenFunction::emitCheckedArgForAssume(const Expr *e) {
"emitCheckedArgForAssume: sanitizers are NYI");
return {};
}

void CIRGenFunction::emitVAStartEnd(mlir::Value argValue, bool isStart) {
// LLVM codegen casts to *i8, no real gain on doing this for CIRGen this
// early, defer to LLVM lowering.
if (isStart)
cir::VAStartOp::create(builder, argValue.getLoc(), argValue);
else
cir::VAEndOp::create(builder, argValue.getLoc(), argValue);
}
7 changes: 2 additions & 5 deletions clang/lib/CIR/CodeGen/CIRGenExpr.cpp
@@ -90,11 +90,8 @@ Address CIRGenFunction::emitPointerWithAlignment(const Expr *expr,
} break;

// Array-to-pointer decay. TODO(cir): BaseInfo and TBAAInfo.
case CK_ArrayToPointerDecay: {
cgm.errorNYI(expr->getSourceRange(),
"emitPointerWithAlignment: array-to-pointer decay");
return Address::invalid();
}
case CK_ArrayToPointerDecay:
return emitArrayToPointerDecay(ce->getSubExpr());
Contributor commented:

Can you add the BaseInfo argument and insert a missing-feature assert here for TBAA?


case CK_UncheckedDerivedToBase:
case CK_DerivedToBase: {
6 changes: 6 additions & 0 deletions clang/lib/CIR/CodeGen/CIRGenFunction.cpp
@@ -1080,4 +1080,10 @@ void CIRGenFunction::emitVariablyModifiedType(QualType type) {
} while (type->isVariablyModifiedType());
}

Address CIRGenFunction::emitVAListRef(const Expr *e) {
if (getContext().getBuiltinVaListType()->isArrayType())
return emitPointerWithAlignment(e);
return emitLValue(e).getAddress();
}

} // namespace clang::CIRGen
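
The array check in `emitVAListRef` reflects an ordinary C rule: an array-typed object decays to a pointer to its first element when passed as an argument, whereas a non-array object has to be addressed explicitly. The sketch below illustrates that difference with stand-in types. Every `demo_` name is hypothetical; the struct only mirrors the `{ i32, i32, ptr, ptr }` layout checked in the x86_64 test in this PR, and real `va_list` layouts are target-specific.

```c
/* Stand-in for an array-typed va_list element (assumption: mirrors the
 * x86_64 layout from the test; not the real __va_list_tag). */
typedef struct demo_va_list_tag {
    unsigned int gp_offset;
    unsigned int fp_offset;
    void *overflow_arg_area;
    void *reg_save_area;
} demo_va_list_tag;

typedef demo_va_list_tag demo_array_va_list[1]; /* array-typed va_list */
typedef char *demo_pointer_va_list;             /* pointer-typed va_list */

/* Array case: the parameter type adjusts to `demo_va_list_tag *`, so the
 * callee mutates the caller's storage through the decayed pointer. */
void demo_start_array(demo_array_va_list ap) {
    ap->gp_offset = 8;
}

/* Non-array case: the object is passed by value, so the caller must hand
 * over its address for the callee to initialize it. */
void demo_start_pointer(demo_pointer_va_list *ap, char *first_arg) {
    *ap = first_arg;
}
```

In the first case the caller writes `demo_start_array(list)` and relies on decay; in the second it writes `demo_start_pointer(&list, ...)`. That is the same split `emitVAListRef` makes between `emitPointerWithAlignment` and `emitLValue(e).getAddress()`.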
13 changes: 13 additions & 0 deletions clang/lib/CIR/CodeGen/CIRGenFunction.h
@@ -1411,6 +1411,19 @@ class CIRGenFunction : public CIRGenTypeCache {
const clang::Stmt *thenS,
const clang::Stmt *elseS);

/// Build a "reference" to a va_list; this is either the address or the value
/// of the expression, depending on how va_list is defined.
Address emitVAListRef(const Expr *e);

/// Emits a CIR variable-argument operation, either
/// \c cir.va.start or \c cir.va.end.
///
/// \param argValue A reference to the \c va_list as emitted by either
/// \c emitVAListRef or \c emitMSVAListRef.
///
/// \param isStart If \c true, emits \c cir.va.start, otherwise \c cir.va.end.
void emitVAStartEnd(mlir::Value argValue, bool isStart);

/// ----------------------
/// CIR build helpers
/// -----------------
22 changes: 22 additions & 0 deletions clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
@@ -2336,6 +2336,8 @@ void ConvertCIRToLLVMPass::runOnOperation() {
CIRToLLVMTrapOpLowering,
CIRToLLVMUnaryOpLowering,
CIRToLLVMUnreachableOpLowering,
CIRToLLVMVAEndOpLowering,
CIRToLLVMVAStartOpLowering,
CIRToLLVMVecCmpOpLowering,
CIRToLLVMVecCreateOpLowering,
CIRToLLVMVecExtractOpLowering,
@@ -3035,6 +3037,26 @@ mlir::LogicalResult CIRToLLVMInlineAsmOpLowering::matchAndRewrite(
return mlir::success();
}

mlir::LogicalResult CIRToLLVMVAStartOpLowering::matchAndRewrite(
cir::VAStartOp op, OpAdaptor adaptor,
mlir::ConversionPatternRewriter &rewriter) const {
auto opaquePtr = mlir::LLVM::LLVMPointerType::get(getContext());
auto vaList = mlir::LLVM::BitcastOp::create(rewriter, op.getLoc(), opaquePtr,
adaptor.getArgList());
rewriter.replaceOpWithNewOp<mlir::LLVM::VaStartOp>(op, vaList);
return mlir::success();
}

mlir::LogicalResult CIRToLLVMVAEndOpLowering::matchAndRewrite(
cir::VAEndOp op, OpAdaptor adaptor,
mlir::ConversionPatternRewriter &rewriter) const {
auto opaquePtr = mlir::LLVM::LLVMPointerType::get(getContext());
auto vaList = mlir::LLVM::BitcastOp::create(rewriter, op.getLoc(), opaquePtr,
adaptor.getArgList());
rewriter.replaceOpWithNewOp<mlir::LLVM::VaEndOp>(op, vaList);
return mlir::success();
}

std::unique_ptr<mlir::Pass> createConvertCIRToLLVMPass() {
return std::make_unique<ConvertCIRToLLVMPass>();
}
20 changes: 20 additions & 0 deletions clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.h
@@ -684,6 +684,26 @@ class CIRToLLVMInlineAsmOpLowering
mlir::ConversionPatternRewriter &) const override;
};

class CIRToLLVMVAStartOpLowering
: public mlir::OpConversionPattern<cir::VAStartOp> {
public:
using mlir::OpConversionPattern<cir::VAStartOp>::OpConversionPattern;

mlir::LogicalResult
matchAndRewrite(cir::VAStartOp op, OpAdaptor,
mlir::ConversionPatternRewriter &) const override;
};

class CIRToLLVMVAEndOpLowering
: public mlir::OpConversionPattern<cir::VAEndOp> {
public:
using mlir::OpConversionPattern<cir::VAEndOp>::OpConversionPattern;

mlir::LogicalResult
matchAndRewrite(cir::VAEndOp op, OpAdaptor,
mlir::ConversionPatternRewriter &) const override;
};

} // namespace direct
} // namespace cir

48 changes: 48 additions & 0 deletions clang/test/CIR/CodeGen/var_arg.c
@@ -0,0 +1,48 @@
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -Wno-unused-value -fclangir -emit-cir %s -o %t.cir
// RUN: FileCheck --input-file=%t.cir %s -check-prefix=CIR
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -Wno-unused-value -fclangir -emit-llvm %s -o %t-cir.ll
// RUN: FileCheck --input-file=%t-cir.ll %s -check-prefix=LLVM
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -Wno-unused-value -emit-llvm %s -o %t.ll
// RUN: FileCheck --input-file=%t.ll %s -check-prefix=OGCG

void varargs(int count, ...) {
__builtin_va_list args;
__builtin_va_start(args, 12345);
Contributor commented:

Can you add test cases that use Builtin::BI__builtin_stdarg_start and Builtin::BI__va_start? (Assuming this one corresponds to Builtin::BI__builtin_va_start)

__builtin_va_end(args);
}

// CIR: !rec___va_list_tag = !cir.record<struct "__va_list_tag" {!u32i, !u32i, !cir.ptr<!void>, !cir.ptr<!void>}

// CIR: cir.func dso_local @varargs(%[[COUNT_ARG:.+]]: !s32i {{.*}}, ...) {{.*}}
// CIR: %[[COUNT:.+]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["count", init]
// CIR: %[[ARGS:.+]] = cir.alloca !cir.array<!rec___va_list_tag x 1>, !cir.ptr<!cir.array<!rec___va_list_tag x 1>>, ["args"]
// CIR: cir.store %[[COUNT_ARG]], %[[COUNT]] : !s32i, !cir.ptr<!s32i>
// CIR: %[[ARGS_DECAY1:.+]] = cir.cast(array_to_ptrdecay, %[[ARGS]] : !cir.ptr<!cir.array<!rec___va_list_tag x 1>>), !cir.ptr<!rec___va_list_tag>
// CIR: cir.va.start %[[ARGS_DECAY1]] : !cir.ptr<!rec___va_list_tag>
// CIR: %[[ARGS_DECAY2:.+]] = cir.cast(array_to_ptrdecay, %[[ARGS]] : !cir.ptr<!cir.array<!rec___va_list_tag x 1>>), !cir.ptr<!rec___va_list_tag>
// CIR: cir.va.end %[[ARGS_DECAY2]] : !cir.ptr<!rec___va_list_tag>
// CIR: cir.return

// LLVM: %struct.__va_list_tag = type { i32, i32, ptr, ptr }

// LLVM: define dso_local void @varargs(i32 %[[ARG0:.+]], ...)
// LLVM: %[[COUNT_ADDR:.+]] = alloca i32, i64 1
// LLVM: %[[ARGS:.+]] = alloca [1 x %struct.__va_list_tag], i64 1
// LLVM: store i32 %[[ARG0]], ptr %[[COUNT_ADDR]]
// LLVM: %[[GEP1:.+]] = getelementptr %struct.__va_list_tag, ptr %[[ARGS]], i32 0
// LLVM: call void @llvm.va_start.p0(ptr %[[GEP1]])
// LLVM: %[[GEP2:.+]] = getelementptr %struct.__va_list_tag, ptr %[[ARGS]], i32 0
// LLVM: call void @llvm.va_end.p0(ptr %[[GEP2]])
// LLVM: ret void

// OGCG: %struct.__va_list_tag = type { i32, i32, ptr, ptr }

// OGCG: define dso_local void @varargs(i32 noundef %[[COUNT:.+]], ...)
// OGCG: %[[COUNT_ADDR:.+]] = alloca i32
// OGCG: %[[ARGS:.+]] = alloca [1 x %struct.__va_list_tag]
// OGCG: store i32 %[[COUNT]], ptr %[[COUNT_ADDR]]
// OGCG: %[[ARRDECAY1:.+]] = getelementptr inbounds [1 x %struct.__va_list_tag], ptr %[[ARGS]], i64 0, i64 0
// OGCG: call void @llvm.va_start.p0(ptr %[[ARRDECAY1]])
// OGCG: %[[ARRDECAY2:.+]] = getelementptr inbounds [1 x %struct.__va_list_tag], ptr %[[ARGS]], i64 0, i64 0
// OGCG: call void @llvm.va_end.p0(ptr %[[ARRDECAY2]])
// OGCG: ret void