Skip to content
Merged
Show file tree
Hide file tree
Changes from 15 commits
Commits
Show all changes
18 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions clang/lib/Basic/Targets/SystemZ.h
Original file line number Diff line number Diff line change
Expand Up @@ -247,6 +247,8 @@ class LLVM_LIBRARY_VISIBILITY SystemZTargetInfo : public TargetInfo {
return RegNo < 4 ? 6 + RegNo : -1;
}

bool hasSjLjLowering() const override { return true; }

std::pair<unsigned, unsigned> hardwareInterferenceSizes() const override {
return std::make_pair(256, 256);
}
Expand Down
25 changes: 25 additions & 0 deletions clang/lib/CodeGen/CGBuiltin.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4619,6 +4619,31 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
// Buffer is a void**.
Address Buf = EmitPointerWithAlignment(E->getArg(0));

if (getTarget().getTriple().getArch() == llvm::Triple::systemz) {
// Call LLVM's EH setjmp, which is lightweight.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comment here should explain that we're not filling any fields of the jmp_buf here and leave it all to the back-end.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little concerned about modifying the semantics of the intrinsic like this. I mean, the semantics are inherently a bit platform-specific, but this contradicts the documentation in llvm/docs/ExceptionHandling.rst.

If we need different semantics here for some reason, I'd prefer to use an intrinsic with a different name, to avoid any confusion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The underlying problem is that current implementation and documentation of the builtin is somewhat inconsistent with itself. The documentation starts out with:

The buffer format and the overall functioning of this intrinsic is compatible with the GCC __builtin_setjmp implementation allowing code built with the clang and GCC to interoperate.

But in GCC the format of the buffer is completely target-dependent (and actually materially different between targets), while LLVM attempts to impose some constraints on certain fields. (These happen to be OK for the small subset of targets that currently implement these intrinsics in LLVM, but not all other GCC targets.)

Specifically, the LLVM documentation continues with

The front end places the frame pointer in the first word

Now, even in GCC every target places the frame pointer in the first word, but in GCC this means the contents of the target frame pointer register, if any is used. What clang puts into the first word, however, is the result of the frameaddress intrinsic - this value matches the frame pointer register contents on some but not all targets. On SystemZ these values differ, which would cause incompatibilties with GCC.

Finally, the clang front-end also writes the stack pointer into the third slot. Not only is this not even mentioned in the docs, it also does not match GCC behavior on all targets. While indeed all targets have the to save the stack pointer somewhere, they're not all using the third slot. On SystemZ we use the fourth slot.

It is a bit unclear to me what the purpose of writing those two slots in the front end is in the first place. It would seem more straightforward to just leave the writing of the buffer completely to the target back-end.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. That's pretty messy. See also 02269a6 and faa3abb . I think what happened is that the meaning of the intrinsic was adapted to suit sjlj lowering, and nobody really thought about trying to expose a stable interface to frontends.

I still think it's a bit of a hazard to adjust the semantics of the intrinsic at this point without changing the name. But there isn't any existing SystemZ bitcode to break, so it might be enough to just update the documentation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. It seems back in the day the eh.sjlj.setjmp intrinsic was used by both the front-end to implement __builtin_setjmp and by the middle-end (SjLjEHPrepare.cpp) for exception handling. Since these two users apparently require somewhat different information in the buffer, filling this in was left to the user.

However, these days those two users actually emit two different intrinsics. The middle-end now uses eh.sjlj.setup.dispatch (which is apparently only intended for internal usage; it's not documented at all), while only the front-end continues to use eh.sjlj.setjmp. See http://reviews.llvm.org/D9313.

I still think the most straightfoward way would be to have eh.sjlj.setjmp just be completely implemented in the back end, and simply do exactly what is needed to implement __builtin_setjmp on the platform, just like is done with longjmp. This could be done for all targets (probably the cleanest solution, but would require updating the existing targets and might introduce incompatibilities with existing bytecode on those targets), or just for selected targets (I guess this would then need to be documented? And should it be a positive or negative list of targets?).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I don't think there's any benefit to having the frontend to generate explicit stores. I just want to make sure whatever solution we choose, it doesn't silently break code generated by other frontends (or older versions of clang).

// We're not filling any fields of the jmp_buf here and leave
// it all to the back-end.
// Current LLVM implementation and documentation of the builtin is
// somewhat inconsistent with itself. The documentation starts out with:
// The buffer format and the overall functioning of this intrinsic is
// compatible with the GCC __builtin_setjmp implementation allowing code
// built with the clang and GCC to interoperate.
// But in GCC the format of the buffer is completely target-dependent,
// while LLVM attempts to impose some constraints on certain fields.
// 1. LLVM documentation continues with - The front end places the frame
// pointer in the first word - What clang puts into the first word,
// however, is the result of the frameaddress intrinsic - this value
// matches the frame pointer register contents on some but not all
// targets. On SystemZ these values differ, which would cause
// incompatibilties with GCC.
// 2. Clang front-end also writes the stack pointer into the third
// slot. Not only is this not even mentioned in the docs, it also does
// not match GCC behavior on all targets. On SystemZ we use the fourth
// slot.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anoopkg6 , we shouldn't add a long comment here why the documentation is wrong - instead, the patch should actually fix the documentation. This particular bit would be in docs/ExceptionHandling.rst. I would change the paragraph that currently reads:

The single parameter is a pointer to a five word buffer in which the calling
context is saved. The front end places the frame pointer in the first word, and
the target implementation of this intrinsic should place the destination address
for a `llvm.eh.sjlj.longjmp`_ in the second word. The following three words are
available for use in a target-specific manner.

to something like:

The single parameter is a pointer to a five word buffer in which the calling
context is saved.   The format and contents of the buffer are target-specific.
On certain targets, the front end places the frame pointer in the first word
and the stack pointer in the third word, while the target implementation of
this intrinsic fills in the remaining words.  On other targets, saving the
calling context to the buffer is left completely to the target implementation.

Then can we simply put a short notice in this file, along the lines of
"On this target, the back end fills in the context buffer completely."

@efriedma-quic would this be OK with you?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should specifically call out which targets are which, for reference. No point in being overly generic. Otherwise I guess that's fine.

It's worth pointing out that if the backend completely fills the buffer, it doesn't really matter if the frontend stores to the buffer before calling setjmp.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should specifically call out which targets are which, for reference. No point in being overly generic.

Fair enough.

It's worth pointing out that if the backend completely fills the buffer, it doesn't really matter if the frontend stores to the buffer before calling setjmp.

Understood. It would still seem cleaner to not emit those stores if the back-end is going to overwrite them anyway ...

Function *F = CGM.getIntrinsic(Intrinsic::eh_sjlj_setjmp);
return RValue::get(Builder.CreateCall(F, Buf.emitRawPointer(*this)));
}

// Store the frame pointer to the setjmp buffer.
Value *FrameAddr = Builder.CreateCall(
CGM.getIntrinsic(Intrinsic::frameaddress, AllocaInt8PtrTy),
Expand Down
25 changes: 25 additions & 0 deletions clang/test/CodeGen/SystemZ/builtin-setjmp-logjmp-backchain.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need a -backchain version of the test case here - what's tested here is completely identical as far as the front-end goes. Just the other test should be fine.

// RUN: %clang -mbackchain --target=s390x-linux -S -emit-llvm -o - %s | FileCheck %s

void *buf[20];
// CHECK-LABEL: define dso_local void @foo(
// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
// CHECK-NEXT: [[ENTRY:.*:]]
// CHECK-NEXT: [[TMP0:%.*]] = call i32 @llvm.eh.sjlj.setjmp(ptr @buf)
// CHECK-NEXT: ret void
//
void foo()
{
__builtin_setjmp (buf);
}

// CHECK-LABEL: define dso_local void @foo1(
// CHECK-SAME: ) #[[ATTR0]] {
// CHECK-NEXT: [[ENTRY:.*:]]
// CHECK-NEXT: call void @llvm.eh.sjlj.longjmp(ptr @buf)
// CHECK-NEXT: unreachable
//
void foo1()
{
__builtin_longjmp (buf, 1);
}
25 changes: 25 additions & 0 deletions clang/test/CodeGen/SystemZ/builtin-setjmp-logjmp.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
// RUN: %clang --target=s390x-linux -S -emit-llvm -o - %s | FileCheck %s

void *buf[20];
// CHECK-LABEL: define dso_local void @foo(
// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
// CHECK-NEXT: [[ENTRY:.*:]]
// CHECK-NEXT: [[TMP0:%.*]] = call i32 @llvm.eh.sjlj.setjmp(ptr @buf)
// CHECK-NEXT: ret void
//
void foo()
{
__builtin_setjmp (buf);
}

// CHECK-LABEL: define dso_local void @foo1(
// CHECK-SAME: ) #[[ATTR0]] {
// CHECK-NEXT: [[ENTRY:.*:]]
// CHECK-NEXT: call void @llvm.eh.sjlj.longjmp(ptr @buf)
// CHECK-NEXT: unreachable
//
void foo1()
{
__builtin_longjmp (buf, 1);
}
5 changes: 5 additions & 0 deletions llvm/lib/Target/SystemZ/SystemZAsmPrinter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -688,6 +688,11 @@ void SystemZAsmPrinter::emitInstruction(const MachineInstr *MI) {
return;
}

// EH_SjLj_Setup is a dummy terminator instruction of size 0.
// It is used to handle the clobber register for builtin setjmp.
case SystemZ::EH_SjLj_Setup:
return;

default:
Lower.lower(MI, LoweredMI);
break;
Expand Down
251 changes: 251 additions & 0 deletions llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -751,6 +751,11 @@ SystemZTargetLowering::SystemZTargetLowering(const TargetMachine &TM,
setOperationAction(ISD::INTRINSIC_W_CHAIN, MVT::Other, Custom);
setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::Other, Custom);

// We're not using SJLJ for exception handling, but they're implemented
// solely to support use of __builtin_setjmp / __builtin_longjmp.
setOperationAction(ISD::EH_SJLJ_SETJMP, MVT::i32, Custom);
setOperationAction(ISD::EH_SJLJ_LONGJMP, MVT::Other, Custom);

// We want to use MVC in preference to even a single load/store pair.
MaxStoresPerMemcpy = Subtarget.hasVector() ? 2 : 0;
MaxStoresPerMemcpyOptSize = 0;
Expand Down Expand Up @@ -940,6 +945,240 @@ bool SystemZTargetLowering::isFPImmLegal(const APFloat &Imm, EVT VT,
return SystemZVectorConstantInfo(Imm).isVectorConstantLegal(Subtarget);
}

MachineBasicBlock *
SystemZTargetLowering::emitEHSjLjSetJmp(MachineInstr &MI,
MachineBasicBlock *MBB) const {
DebugLoc DL = MI.getDebugLoc();
const TargetInstrInfo *TII = Subtarget.getInstrInfo();
const SystemZRegisterInfo *TRI = Subtarget.getRegisterInfo();

MachineFunction *MF = MBB->getParent();
MachineRegisterInfo &MRI = MF->getRegInfo();

const BasicBlock *BB = MBB->getBasicBlock();
MachineFunction::iterator I = ++MBB->getIterator();

Register DstReg = MI.getOperand(0).getReg();
const TargetRegisterClass *RC = MRI.getRegClass(DstReg);
assert(TRI->isTypeLegalForClass(*RC, MVT::i32) && "Invalid destination!");
Register mainDstReg = MRI.createVirtualRegister(RC);
Register restoreDstReg = MRI.createVirtualRegister(RC);

MVT PVT = getPointerTy(MF->getDataLayout());
assert((PVT == MVT::i64 || PVT == MVT::i32) && "Invalid Pointer Size!");
// For v = setjmp(buf), we generate.
// Algorithm:
//
// ---------
// | thisMBB |
// ---------
// |
// ------------------------
// | |
// ---------- ---------------
// | mainMBB | | restoreMBB |
// | v = 0 | | v = 1 |
// ---------- ---------------
// | |
// -------------------------
// |
// -----------------------------
// | sinkMBB |
// | phi(v_mainMBB,v_restoreMBB) |
// -----------------------------
// thisMBB:
// buf[0] = Frame Pointer if hasFP.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use FPOffset here to be consistent with the code below.

// buf[LabelOffset] = restoreMBB <-- takes address of restoreMBB.
// buf[BCOffset] = Backchain value if building with -mbackchain.
// buf[SPOffset] = Stack Pointer.
// buf[LPOffset] = We never write this slot with R13, gcc stores R13 always.
// SjLjSetup restoreMBB
// mainMBB:
// v_main = 0
// sinkMBB:
// v = phi(v_main, v_restore)
// restoreMBB:
// v_restore = 1

MachineBasicBlock *thisMBB = MBB;
MachineBasicBlock *mainMBB = MF->CreateMachineBasicBlock(BB);
MachineBasicBlock *sinkMBB = MF->CreateMachineBasicBlock(BB);
MachineBasicBlock *restoreMBB = MF->CreateMachineBasicBlock(BB);

MF->insert(I, mainMBB);
MF->insert(I, sinkMBB);
MF->push_back(restoreMBB);
restoreMBB->setMachineBlockAddressTaken();

MachineInstrBuilder MIB;

// Transfer the remainder of BB and its successor edges to sinkMBB.
sinkMBB->splice(sinkMBB->begin(), MBB,
std::next(MachineBasicBlock::iterator(MI)), MBB->end());
sinkMBB->transferSuccessorsAndUpdatePHIs(MBB);

// thisMBB:
const int64_t FPOffset = 0; // Slot 1.
const int64_t LabelOffset = 1 * PVT.getStoreSize(); // Slot 2.
const int64_t BCOffset = 2 * PVT.getStoreSize(); // Slot 3.
const int64_t SPOffset = 3 * PVT.getStoreSize(); // Slot 4.

// Buf address.
Register BufReg = MI.getOperand(1).getReg();

const TargetRegisterClass *PtrRC = getRegClassFor(PVT);
unsigned LabelReg = MRI.createVirtualRegister(PtrRC);

// Prepare IP for longjmp.
BuildMI(*thisMBB, MI, DL, TII->get(SystemZ::LARL), LabelReg)
.addMBB(restoreMBB);
// Store IP for return from jmp, slot 2, offset = 1.
BuildMI(*thisMBB, MI, DL, TII->get(SystemZ::STG))
.addReg(LabelReg)
.addReg(BufReg)
.addImm(LabelOffset)
.addReg(0);

auto *SpecialRegs = Subtarget.getSpecialRegisters();
bool HasFP = Subtarget.getFrameLowering()->hasFP(*MF);
if (HasFP) {
BuildMI(*thisMBB, MI, DL, TII->get(SystemZ::STG))
.addReg(SpecialRegs->getFramePointerRegister())
.addReg(BufReg)
.addImm(FPOffset)
.addReg(0);
}

// Store SP.
BuildMI(*thisMBB, MI, DL, TII->get(SystemZ::STG))
.addReg(SpecialRegs->getStackPointerRegister())
.addReg(BufReg)
.addImm(SPOffset)
.addReg(0);

// Slot 3(Offset = 2) Backchain value (if building with -mbackchain).
bool BackChain = MF->getSubtarget<SystemZSubtarget>().hasBackChain();
if (BackChain) {
Register BCReg = MRI.createVirtualRegister(RC);
auto *TFL = Subtarget.getFrameLowering<SystemZFrameLowering>();
MIB = BuildMI(*thisMBB, MI, DL, TII->get(SystemZ::LG), BCReg)
.addReg(SpecialRegs->getStackPointerRegister())
.addImm(TFL->getBackchainOffset(*MF))
.addReg(0);

BuildMI(*thisMBB, MI, DL, TII->get(SystemZ::STG))
.addReg(BCReg)
.addReg(BufReg)
.addImm(BCOffset)
.addReg(0);
}

// Setup.
MIB = BuildMI(*thisMBB, MI, DL, TII->get(SystemZ::EH_SjLj_Setup))
.addMBB(restoreMBB);

const SystemZRegisterInfo *RegInfo = Subtarget.getRegisterInfo();
MIB.addRegMask(RegInfo->getNoPreservedMask());

thisMBB->addSuccessor(mainMBB);
thisMBB->addSuccessor(restoreMBB);

// mainMBB:
BuildMI(mainMBB, DL, TII->get(SystemZ::LHI), mainDstReg).addImm(0);
mainMBB->addSuccessor(sinkMBB);

// sinkMBB:
BuildMI(*sinkMBB, sinkMBB->begin(), DL, TII->get(SystemZ::PHI), DstReg)
.addReg(mainDstReg)
.addMBB(mainMBB)
.addReg(restoreDstReg)
.addMBB(restoreMBB);

// restoreMBB.
BuildMI(restoreMBB, DL, TII->get(SystemZ::LHI), restoreDstReg).addImm(1);
BuildMI(restoreMBB, DL, TII->get(SystemZ::J)).addMBB(sinkMBB);
restoreMBB->addSuccessor(sinkMBB);

MI.eraseFromParent();

return sinkMBB;
}

MachineBasicBlock *
SystemZTargetLowering::emitEHSjLjLongJmp(MachineInstr &MI,
MachineBasicBlock *MBB) const {

DebugLoc DL = MI.getDebugLoc();
const TargetInstrInfo *TII = Subtarget.getInstrInfo();

MachineFunction *MF = MBB->getParent();
MachineRegisterInfo &MRI = MF->getRegInfo();

MVT PVT = getPointerTy(MF->getDataLayout());
assert((PVT == MVT::i64 || PVT == MVT::i32) && "Invalid Pointer Size!");
Register BufReg = MI.getOperand(0).getReg();
const TargetRegisterClass *RC = MRI.getRegClass(BufReg);
auto *SpecialRegs = Subtarget.getSpecialRegisters();

Register Tmp = MRI.createVirtualRegister(RC);
Register BCReg = MRI.createVirtualRegister(RC);

MachineInstrBuilder MIB;

const int64_t FPOffset = 0;
const int64_t LabelOffset = 1 * PVT.getStoreSize();
const int64_t BCOffset = 2 * PVT.getStoreSize();
const int64_t SPOffset = 3 * PVT.getStoreSize();
const int64_t LPOffset = 4 * PVT.getStoreSize();

MIB = BuildMI(*MBB, MI, DL, TII->get(SystemZ::LG), Tmp)
.addReg(BufReg)
.addImm(LabelOffset)
.addReg(0);

MIB = BuildMI(*MBB, MI, DL, TII->get(SystemZ::LG),
SpecialRegs->getFramePointerRegister())
.addReg(BufReg)
.addImm(FPOffset)
.addReg(0);

// We are restoring R13 even though we never stored in setjmp from llvm,
// as gcc always stores R13 in builtin_setjmp. We could have mixed code
// gcc setjmp and llvm longjmp.
MIB = BuildMI(*MBB, MI, DL, TII->get(SystemZ::LG), SystemZ::R13D)
.addReg(BufReg)
.addImm(LPOffset)
.addReg(0);

bool BackChain = MF->getSubtarget<SystemZSubtarget>().hasBackChain();
if (BackChain) {
MIB = BuildMI(*MBB, MI, DL, TII->get(SystemZ::LG), BCReg)
.addReg(BufReg)
.addImm(BCOffset)
.addReg(0);
}

MIB = BuildMI(*MBB, MI, DL, TII->get(SystemZ::LG),
SpecialRegs->getStackPointerRegister())
.addReg(BufReg)
.addImm(SPOffset)
.addReg(0);

if (BackChain) {
auto *TFL = Subtarget.getFrameLowering<SystemZFrameLowering>();
BuildMI(*MBB, MI, DL, TII->get(SystemZ::STG))
.addReg(BCReg)
.addReg(SpecialRegs->getStackPointerRegister())
.addImm(TFL->getBackchainOffset(*MF))
.addReg(0);
}

MIB = BuildMI(*MBB, MI, DL, TII->get(SystemZ::BR)).addReg(Tmp);

MI.eraseFromParent();
return MBB;
}

/// Returns true if stack probing through inline assembly is requested.
bool SystemZTargetLowering::hasInlineStackProbe(const MachineFunction &MF) const {
// If the function specifically requests inline stack probes, emit them.
Expand Down Expand Up @@ -6292,6 +6531,14 @@ SDValue SystemZTargetLowering::LowerOperation(SDValue Op,
return lowerGET_ROUNDING(Op, DAG);
case ISD::READCYCLECOUNTER:
return lowerREADCYCLECOUNTER(Op, DAG);
case ISD::EH_SJLJ_SETJMP:
case ISD::EH_SJLJ_LONGJMP:
// These operations action are Legal now, not Custom. The reason we need
// to keep it here is that common code treats these Pseudos as Custom,
// and expands them using EmitInstrWithCustomInserter in FinalizeISel.cpp
// after ISel.
return Op;

default:
llvm_unreachable("Unexpected node to lower");
}
Expand Down Expand Up @@ -9724,6 +9971,10 @@ MachineBasicBlock *SystemZTargetLowering::EmitInstrWithCustomInserter(

case SystemZ::PROBED_ALLOCA:
return emitProbedAlloca(MI, MBB);
case SystemZ::EH_SjLj_SetJmp:
return emitEHSjLjSetJmp(MI, MBB);
case SystemZ::EH_SjLj_LongJmp:
return emitEHSjLjLongJmp(MI, MBB);

case TargetOpcode::STACKMAP:
case TargetOpcode::PATCHPOINT:
Expand Down
6 changes: 6 additions & 0 deletions llvm/lib/Target/SystemZ/SystemZISelLowering.h
Original file line number Diff line number Diff line change
Expand Up @@ -476,6 +476,12 @@ class SystemZTargetLowering : public TargetLowering {
// LD, and having the full constant in memory enables reg/mem opcodes.
return VT != MVT::f64;
}
MachineBasicBlock *emitEHSjLjSetJmp(MachineInstr &MI,
MachineBasicBlock *MBB) const;

MachineBasicBlock *emitEHSjLjLongJmp(MachineInstr &MI,
MachineBasicBlock *MBB) const;

bool hasInlineStackProbe(const MachineFunction &MF) const override;
AtomicExpansionKind shouldCastAtomicLoadInIR(LoadInst *LI) const override;
AtomicExpansionKind shouldCastAtomicStoreInIR(StoreInst *SI) const override;
Expand Down
Loading
Loading