Skip to content
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions llvm/lib/IR/Verifier.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3636,6 +3636,18 @@ void Verifier::visitCallBase(CallBase &Call) {
Check(isCallableCC(Call.getCallingConv()),
"calling convention does not permit calls", Call);

// Find the actual CC of the callee from the Module.
CallingConv::ID CalleeCC = Call.getParent()->getParent()->getParent()
->getFunction(Call.getCalledFunction()->getName())->getCallingConv();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very complicated way to spell Callee->getCallingConv() -- but note that Callee may be null here for indirect calls -- which is the cause of your four thousand or so test failures.

Copy link
Contributor Author

@jofrn jofrn Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, it is, but we can have a call of the function without its CC specified. If its definition has a kernel CC, then the callsite may be treated as requiring the definition when it does not have it. If the Verifier checks from the module, then we will not error out in a location that requires the callsite to have the CC at its definition; we will error in the Verifier instead.

// Verify that a kernel does not call another kernel.
if (CalleeCC == CallingConv::AMDGPU_KERNEL ||
CalleeCC == CallingConv::SPIR_KERNEL) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spir_kernel case not tested, nor call sites

CallingConv::ID CallerCC = Call.getParent()->getParent()->getCallingConv();
Check(CallerCC != CallingConv::AMDGPU_KERNEL &&
CallerCC != CallingConv::SPIR_KERNEL,
"a kernel may not call a kernel", Call.getParent()->getParent());
}

// Disallow passing/returning values with alignment higher than we can
// represent.
// FIXME: Consider making DataLayout cap the alignment, so this isn't
Expand Down
9 changes: 9 additions & 0 deletions llvm/test/Verifier/AMDGPU/kernel-recursivecall.ll
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
; RUN: not llvm-as %s -o /dev/null 2>&1 | FileCheck %s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-disable-output is canonical way to disable output, also this does not require AMDGPU


define amdgpu_kernel void @kernel(ptr addrspace(1) %out, i32 %n) {
entry:
; CHECK: a kernel may not call a kernel
; CHECK-NEXT: ptr @kernel
call void @kernel(ptr addrspace(1) %out, i32 %n)
Copy link
Contributor

@shiltian shiltian Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This IR is malformed in the first place because it doesn't use the right CC for the callee.

Copy link
Contributor Author

@jofrn jofrn Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It ends up getting the same error as your other one "Mark entry as invalid" if we do not have the CC match though, so it would be better to have this error in the Verifier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should deal with this in codegen. From an IR perspective, nothing is structurally wrong. Codegen should still be able to set up a call to the kernel as-if it were a callable function

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it'd be a good idea. It will break a lot of assumptions and conventions we have for kernel function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it doesn't. This is still a UB call that can be treated as a no-op

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this rely on Codegen retrieving the CC from the module as well?

Not sure what you mean by this, but no. The calling convention is always known at the callsite and is absolute. The fact that the call target happens to be uncallable doesn't matter for the purposes of the call. The resource analysis looking for call sites should just ignore this call, it isn't real. It's impossible and can be treated as unreachable code

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The calling convention is always known at the callsite and is absolute.

That's why I said this IR is ill-formed in the first place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IR is not ill-formed, it exhibits statically knowable undefined behavior

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which is that the callee's CC doesn't match the call site's, and it should be a separate issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there is no verifier issue to solve here. The verifier is working as intended now

ret void
}
Loading