11 changes: 11 additions & 0 deletions llvm/include/llvm/ADT/PointerUnion.h
@@ -259,6 +259,17 @@ struct CastInfo<To, const PointerUnion<PTs...>>
CastInfo<To, PointerUnion<PTs...>>> {
};

+// The default implementation of isPresent() for nullable types returns true
+// if the active member is not the first one, even if its value is nullptr.
+// Override the default behavior to return false for all possible null values.
+template <typename... PTs>
+struct ValueIsPresent<PointerUnion<PTs...>,
+                      std::enable_if_t<IsNullable<PointerUnion<PTs...>>>> {
Contributor

Is this enable_if needed? Isn't PointerUnion always nullable because it satisfies std::is_constructible_v<PointerUnion<PTs...>, std::nullptr_t>?

Contributor Author
@s-barannikov, Jan 8, 2025

I agree that this looks suspicious, but it is the only way I could get it to compile without errors.
I would appreciate it if someone could explain why this is necessary and how it can be simplified.

https://godbolt.org/z/x6dKhr5x1

Member

I think the underlying issue is that without this, the main partial specialization from here would be as good a match as this one:

static inline bool isPresent(const T &t) { return t != T(nullptr); }

IE I don't think the exact predicate used in this enable_if matters as long as it evaluates to void at the end in both L622 and here. In the godbolt link, enable_if<true> doesn't work because it appears in a non-deduced context.
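
To make the setup concrete, here is a minimal sketch of the partial specializations under discussion. `FakeUnion` and the `Which` member are hypothetical stand-ins introduced only for illustration; `ValueIsPresent` and `IsNullable` borrow their names from the thread, but their bodies here are simplified assumptions, not the real LLVM definitions. The union-specific specialization repeats the generic specialization's `enable_if_t` predicate, which is the form the PR author reports as compiling.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a pointer union; constructible from nullptr.
template <typename... PTs> struct FakeUnion {
  FakeUnion() = default;
  FakeUnion(std::nullptr_t) {}
};

// Simplified assumption of what "nullable" means in this sketch.
template <typename T>
constexpr bool IsNullable =
    std::is_pointer_v<T> || std::is_constructible_v<T, std::nullptr_t>;

// Primary template. `Which` only records which definition was selected.
template <typename T, typename Enable = void> struct ValueIsPresent {
  static constexpr int Which = 0;
};

// Generic partial specialization for every nullable type.
template <typename T>
struct ValueIsPresent<T, std::enable_if_t<IsNullable<T>>> {
  static constexpr int Which = 1;
};

// Union-specific partial specialization. It spells out the same predicate
// as the generic one, so the second template argument has the same form in
// both and the first argument decides which is more specialized.
template <typename... PTs>
struct ValueIsPresent<FakeUnion<PTs...>,
                      std::enable_if_t<IsNullable<FakeUnion<PTs...>>>> {
  static constexpr int Which = 2;
};

static_assert(ValueIsPresent<int>::Which == 0);
static_assert(ValueIsPresent<int *>::Which == 1);
static_assert(ValueIsPresent<FakeUnion<int *, float *>>::Which == 2);
```

According to the thread, dropping the `enable_if_t` argument from the union-specific specialization makes the generic one an equally good match; this sketch only shows the accepted form and does not reproduce the failing one.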

Member

Another way to handle this would be to change the main ValueIsPresent for nullable types to cast to bool instead, or introduce some traits for nullable types that would tell you how to check for null values that you can specialize for PointerUnion.

Contributor Author

> IE I don't think the exact predicate used in this enable_if matters as long as it evaluates to void at the end both L622 and here. In the godbolt link, enable_if<true> doesn't work because it appears in non-deduced context.

I tried

template <typename... PTs>
struct ValueIsPresent<PointerUnion<PTs...>, std::void_t<PTs...>>

which always evaluates to void and introduces(?) deduced context, but the error persists. Shouldn't this work?

> Another way to handle this would be to change the main ValueIsPresent for nullable types to cast to bool instead, or introduce some traits for nullable types that would tell you how to check for null values that you can specialize for PointerUnion.

The existing implementation implicitly assumes that if a type is constructible from nullptr_t, then operator== exists and works the way that's expected in this particular use case. Apparently, this doesn't work for PointerUnion.

Casting to bool works for PointerUnion, but may in theory not work for other types. I'll make this change because it is simpler, but I think the best solution would be to stop making implicit assumptions about existence / behavior of operator== / operator bool() and require the clients to explicitly provide specializations if the behavior should diverge from the default.
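
The difference between the two null checks can be illustrated with a toy tagged union. This is a sketch under stated assumptions: `ToyUnion`, `presentByComparison`, and `presentByBoolCast` are hypothetical names, and the class only imitates the behavior described in this PR (equality that also compares the active-member tag, and a bool conversion that looks at the pointer alone). It is not the real `PointerUnion`.

```cpp
#include <cstddef>

// Hypothetical two-member tagged union; NOT llvm::PointerUnion.
class ToyUnion {
  const void *Ptr = nullptr;
  int Tag = 0; // 0: holds int *, 1: holds float *

public:
  constexpr ToyUnion() = default;
  constexpr ToyUnion(std::nullptr_t) {} // null pointer, first member active
  constexpr ToyUnion(int *P) : Ptr(P), Tag(0) {}
  constexpr ToyUnion(float *P) : Ptr(P), Tag(1) {}

  // Equality compares the tag together with the pointer value.
  friend constexpr bool operator==(ToyUnion L, ToyUnion R) {
    return L.Tag == R.Tag && L.Ptr == R.Ptr;
  }
  friend constexpr bool operator!=(ToyUnion L, ToyUnion R) {
    return !(L == R);
  }

  // The bool conversion ignores the tag and only tests the pointer.
  constexpr explicit operator bool() const { return Ptr != nullptr; }
};

// Check in the style of `t != T(nullptr)`.
constexpr bool presentByComparison(ToyUnion U) {
  return U != ToyUnion(nullptr);
}

// Check in the style of `static_cast<bool>(V)`.
constexpr bool presentByBoolCast(ToyUnion U) { return static_cast<bool>(U); }

// A null float * has tag 1, so it compares unequal to ToyUnion(nullptr).
static_assert(presentByComparison(ToyUnion(static_cast<float *>(nullptr))));
static_assert(!presentByBoolCast(ToyUnion(static_cast<float *>(nullptr))));
```

In this toy model the comparison-based check reports a null `float *` as present because its tag differs from the tag of `ToyUnion(nullptr)`, while the bool-based check reports it as absent for either member.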

+  using Union = PointerUnion<PTs...>;
+  static bool isPresent(const Union &V) { return static_cast<bool>(V); }
+  static decltype(auto) unwrapValue(Union &V) { return V; }
+};

// Teach SmallPtrSet that PointerUnion is "basically a pointer", that has
// # low bits available = min(PT1bits,PT2bits)-1.
template <typename ...PTs>
4 changes: 2 additions & 2 deletions llvm/lib/CodeGen/RegisterBankInfo.cpp
@@ -134,10 +134,10 @@ const TargetRegisterClass *RegisterBankInfo::constrainGenericRegister(

// If the register already has a class, fallback to MRI::constrainRegClass.
auto &RegClassOrBank = MRI.getRegClassOrRegBank(Reg);
-if (isa<const TargetRegisterClass *>(RegClassOrBank))
+if (isa_and_present<const TargetRegisterClass *>(RegClassOrBank))
Contributor

I don't think either of these two instances should ever encounter a register without a class or bank set. Is this papering over a different bug?

Contributor Author

The input gMIR to the instruction selector shouldn't contain registers without a class/bank.
Such registers are created during instruction selection if an imported SelectionDAG pattern contains several instructions in the "destination DAG" of the pattern:

def : GCNPat <
  (UniformUnaryFrag<fabs> (v2f16 SReg_32:$src)),
  (S_AND_B32 SReg_32:$src, (S_MOV_B32 (i32 0x7fff7fff)))
>;

This is what -gen-global-isel generates for this pattern:

        GIR_MakeTempReg, /*TempRegID*/0, /*TypeID*/GILLT_s1,
        GIR_BuildMI, /*InsnID*/1, /*Opcode*/GIMT_Encode2(AMDGPU::S_MOV_B32),
        GIR_AddTempRegister, /*InsnID*/1, /*TempRegID*/0, /*TempRegFlags*/GIMT_Encode2(RegState::Define),
        ...
        GIR_ConstrainSelectedInstOperands, /*InsnID*/1,
        GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(AMDGPU::S_AND_B32),
        ...
        GIR_AddSimpleTempRegister, /*InsnID*/0, /*TempRegID*/0,
        GIR_RootConstrainSelectedInstOperands,

GIR_MakeTempReg creates a register without class/bank for the result of the S_MOV_B32. The register gets its class when the GIR_ConstrainSelectedInstOperands action executes: it calls this function, which calls MRI.setRegClass() at the end.

I don't know if this should be considered a bug. If it should, I can try to address it separately (probably in #121270).


(Unrelated to this PR.) Note that the type of the temporary register is s1; it is chosen arbitrarily.

Contributor

> GIR_MakeTempReg creates a register without class/bank for the result of the S_MOV_B32.

As I mentioned in the other PR, this is broken. In no context should an incomplete virtual register be used by an instruction.

Contributor Author

It appears there is another context in which the class/bank may not be set: 7e1f66d

Contributor

After regbankselect / in the selection pass, there must be a class or bank set. The null/null case is only valid before that, when a generic vreg must have a type.

return MRI.constrainRegClass(Reg, &RC);

-const RegisterBank *RB = cast<const RegisterBank *>(RegClassOrBank);
+const auto *RB = dyn_cast_if_present<const RegisterBank *>(RegClassOrBank);
// Otherwise, all we can do is ensure the bank covers the class, and set it.
if (RB && !RB->covers(RC))
return nullptr;
4 changes: 2 additions & 2 deletions llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
@@ -3708,10 +3708,10 @@ const TargetRegisterClass *
SIRegisterInfo::getConstrainedRegClassForOperand(const MachineOperand &MO,
const MachineRegisterInfo &MRI) const {
const RegClassOrRegBank &RCOrRB = MRI.getRegClassOrRegBank(MO.getReg());
-if (const RegisterBank *RB = dyn_cast<const RegisterBank *>(RCOrRB))
+if (const auto *RB = dyn_cast_if_present<const RegisterBank *>(RCOrRB))
return getRegClassForTypeOnBank(MRI.getType(MO.getReg()), *RB);

-if (const auto *RC = dyn_cast<const TargetRegisterClass *>(RCOrRB))
+if (const auto *RC = dyn_cast_if_present<const TargetRegisterClass *>(RCOrRB))
return getAllocatableClass(RC);

return nullptr;
5 changes: 5 additions & 0 deletions llvm/unittests/ADT/PointerUnionTest.cpp
@@ -208,6 +208,11 @@ TEST_F(PointerUnionTest, NewCastInfra) {
EXPECT_FALSE(isa<float *>(d4null));
EXPECT_FALSE(isa<long long *>(d4null));

+EXPECT_FALSE(isa_and_present<int *>(i4null));
+EXPECT_FALSE(isa_and_present<float *>(f4null));
+EXPECT_FALSE(isa_and_present<long long *>(l4null));
+EXPECT_FALSE(isa_and_present<double *>(d4null));
Contributor Author

Previously, only the first isa_and_present returned false.


// test cast<>
EXPECT_EQ(cast<float *>(a), &f);
EXPECT_EQ(cast<int *>(b), &i);