Skip to content

Conversation

@djtodoro
Copy link
Collaborator

@djtodoro djtodoro commented Oct 3, 2025

Properly handle clang::musttail attribute on MIPS backend.

It fixes: #161193

Properly handle clang::musttail attribute on MIPS backend.
@llvmbot
Copy link
Member

llvmbot commented Oct 3, 2025

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-mips

Author: Djordje Todorovic (djtodoro)

Changes

Properly handle clang::musttail attribute on MIPS backend.

It fixes: #161193


Full diff: https://github.com/llvm/llvm-project/pull/161860.diff

2 Files Affected:

  • (modified) llvm/lib/Target/Mips/MipsISelLowering.cpp (+4-3)
  • (added) llvm/test/CodeGen/Mips/musttail.ll (+38)
diff --git a/llvm/lib/Target/Mips/MipsISelLowering.cpp b/llvm/lib/Target/Mips/MipsISelLowering.cpp
index 881ba8e2f9eff..d79db127f01a2 100644
--- a/llvm/lib/Target/Mips/MipsISelLowering.cpp
+++ b/llvm/lib/Target/Mips/MipsISelLowering.cpp
@@ -3407,7 +3407,8 @@ MipsTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
   // Check if it's really possible to do a tail call. Restrict it to functions
   // that are part of this compilation unit.
   bool InternalLinkage = false;
-  if (IsTailCall) {
+  bool IsMustTail = CLI.CB && CLI.CB->isMustTailCall();
+  if (IsTailCall && !IsMustTail) {
     IsTailCall = isEligibleForTailCallOptimization(
         CCInfo, StackSize, *MF.getInfo<MipsFunctionInfo>());
     if (GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee)) {
@@ -3416,9 +3417,9 @@ MipsTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
                      G->getGlobal()->hasPrivateLinkage() ||
                      G->getGlobal()->hasHiddenVisibility() ||
                      G->getGlobal()->hasProtectedVisibility());
-     }
+    }
   }
-  if (!IsTailCall && CLI.CB && CLI.CB->isMustTailCall())
+  if (!IsTailCall && IsMustTail)
     report_fatal_error("failed to perform tail call elimination on a call "
                        "site marked musttail");
 
diff --git a/llvm/test/CodeGen/Mips/musttail.ll b/llvm/test/CodeGen/Mips/musttail.ll
new file mode 100644
index 0000000000000..d5f457f6eb665
--- /dev/null
+++ b/llvm/test/CodeGen/Mips/musttail.ll
@@ -0,0 +1,38 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=mips-unknown-linux-gnu < %s | FileCheck %s --check-prefix=MIPS32
+; RUN: llc -mtriple=mips64-unknown-linux-gnu < %s | FileCheck %s --check-prefix=MIPS64
+
+; Test musttail support for MIPS
+
+declare void @external_func()
+
+; Test basic musttail with external function
+define void @test_musttail_external() {
+; MIPS32-LABEL: test_musttail_external:
+; MIPS32:       # %bb.0:
+; MIPS32-NEXT:    j external_func
+; MIPS32-NEXT:    nop
+;
+; MIPS64-LABEL: test_musttail_external:
+; MIPS64:       # %bb.0:
+; MIPS64-NEXT:    j external_func
+; MIPS64-NEXT:    nop
+  musttail call void @external_func()
+  ret void
+}
+
+declare i32 @callee_args(i32 %a, i32 %b, i32 %c)
+
+define i32 @test_musttail_args(i32 %x, i32 %y, i32 %z) {
+; MIPS32-LABEL: test_musttail_args:
+; MIPS32:       # %bb.0:
+; MIPS32-NEXT:    j callee_args
+; MIPS32-NEXT:    nop
+;
+; MIPS64-LABEL: test_musttail_args:
+; MIPS64:       # %bb.0:
+; MIPS64-NEXT:    j callee_args
+; MIPS64-NEXT:    nop
+  %ret = musttail call i32 @callee_args(i32 %x, i32 %y, i32 %z)
+  ret i32 %ret
+}

Copy link
Contributor

@kraj kraj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh don't we still need to check it's eligible?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, that thing confuses me in the case of musttail. On ARM, we do check it. Lets do it for MIPS as well.

Copy link
Collaborator Author

@djtodoro djtodoro Oct 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrtc27 yep, it does make sense to still check it. I incorporated in the new commit (basically - do not check for the optional mips flag for enabling tail call optimizations in case of musttial, and I also enabled tail call opts for dso_local functions (1dcb9110612ab) ). Thanks for the review!

Copy link
Collaborator

@jrtc27 jrtc27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

@djtodoro
Copy link
Collaborator Author

djtodoro commented Oct 7, 2025

I checked the issues listed above, all cases from those still pass with the new patch.

@github-actions
Copy link

github-actions bot commented Oct 7, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@kraj
Copy link
Contributor

kraj commented Oct 8, 2025

Please squash changes into initial commit

@djtodoro
Copy link
Collaborator Author

djtodoro commented Oct 8, 2025

Please squash changes into initial commit

Per "Contributing to LLVM" Policy, we do not do force-push (and squash), we rather use incremental changes to address comments during review process. And, once the pull request is approved, we select the “Squash and merge” button.

Copy link
Contributor

@kraj kraj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my checks pass with this series as well.

@kraj
Copy link
Contributor

kraj commented Oct 8, 2025

Please squash changes into initial commit

Per "Contributing to LLVM" Policy, we do not do force-push (and squash), we rather use incremental changes to address comments during review process. And, once the pull request is approved, we select the “Squash and merge” button.

right, squashing is what I meant.

@djtodoro
Copy link
Collaborator Author

I will wait for @jrtc27 to have another look. :)

@brad0
Copy link
Contributor

brad0 commented Oct 10, 2025

@jrtc27 @yingopq Ping.

@brad0 brad0 requested a review from efriedma-quic October 16, 2025 20:45
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to avoid using a default argument here, if there aren't too many callers.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hasLocalLinkage overlaps with some of the other checks here.

Where does this list of checks come from? What are we trying to avoid here? I don't see any obvious reason why you can't tail-call an external function. If there is an issue, there should be a testcase with an explanation.

Copy link
Collaborator Author

@djtodoro djtodoro Oct 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @efriedma-quic.

hasLocalLinkage overlaps with some of the other checks here.

I agree, will address it.

Where does this list of checks come from? What are we trying to avoid here? I don't see any obvious reason why you can't tail-call an external function. If there is an issue, there should be a testcase with an explanation.

I was not sure as well, but after some investigation: The problem caused it was around $gp register. In PIC mode, calling external functions via tail call can cause issues with $gp register handling: https://reviews.llvm.org/D24763

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the problem here similar to #98859? i.e. tail-call optimization conflicts with optimizations for setting up $gp?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, seems very similar

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I'd like to see a little more test coverage. Particularly for calling a non-local function, and for more complicated arguments lists that involve memory.

@brad0 brad0 requested a review from efriedma-quic October 17, 2025 16:43
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clean up this InternalLinkage variable? The remaining usage doesn't seem like it's connected to this code.

Is musttail of a function pointer okay?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clean up this InternalLinkage variable? The remaining usage doesn't seem like it's connected to this code.

yep, it makes sense

Is musttail of a function pointer okay?

yes, I will add a test case for it

@brad0 brad0 requested a review from efriedma-quic October 19, 2025 04:40
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's a ABI problem, we need to reject musttail calls which would trigger that problem. A miscompile is a lot worse than a backend error.

If there's not an ABI problem, I'm not sure why we'd need these checks.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well. I think that what we did in case of PPC is more appropriate fix for this.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Oct 29, 2025
@ameisen
Copy link

ameisen commented Oct 30, 2025

This PR doesn't fix (if it was intended to) sibling call optimizations.

It does change the code generation for them, however - but it maintains all of the stack adjustments as well as the fp/sp moves.

__attribute__((noinline))
int foo0()
{
	volatile int a;
	return a += 1;
}

int bar0()
{
	return foo0();
}

with tail calls disabled (-std=gnu++23 -O3 -march=mips32r6 -fno-PIC -mllvm -mips-tail-calls=0):

	addiu	$sp, $sp, -24
	sw	$ra, 20($sp)                    # 4-byte Folded Spill
	sw	$fp, 16($sp)                    # 4-byte Folded Spill
	move	$fp, $sp
	jal	_Z4foo0v
	nop
	move	$sp, $fp
	lw	$fp, 16($sp)                    # 4-byte Folded Reload
	lw	$ra, 20($sp)                    # 4-byte Folded Reload
	addiu	$sp, $sp, 24
	jrc	$ra

with tail calls enabled (-std=gnu++23 -O3 -march=mips32r6 -fno-PIC -mllvm -mips-tail-calls=1):

	addiu	$sp, $sp, -8
	sw	$ra, 4($sp)                     # 4-byte Folded Spill
	sw	$fp, 0($sp)                     # 4-byte Folded Spill
	move	$fp, $sp
	move	$sp, $fp
	lw	$fp, 0($sp)                     # 4-byte Folded Reload
	lw	$ra, 4($sp)                     # 4-byte Folded Reload
	j	_Z4foo0v
	addiu	$sp, $sp, 8

What is expected:

	j	_Z4foo0v
	nop

or

	bc	_Z4foo0v

So, it's moving the j to the end, but it's not cleaning up the stack changes that were around the jump. Everything in here but the j could be removed (with a nop, but the j should be replaced with a bc anyways), but this effectively is generating a rather messed up function.

A significant amount of logic for handling this issue is present in LowerCall for PPC and for AArch64 (and the rest), but is missing in the MIPS version.

@djtodoro
Copy link
Collaborator Author

This PR doesn't fix (if it was intended to) sibling call optimizations.

It does change the code generation for them, however - but it maintains all of the stack adjustments as well as the fp/sp moves.

__attribute__((noinline))
int foo0()
{
	volatile int a;
	return a += 1;
}

int bar0()
{
	return foo0();
}

with tail calls disabled (-std=gnu++23 -O3 -march=mips32r6 -fno-PIC -mllvm -mips-tail-calls=0):

	addiu	$sp, $sp, -24
	sw	$ra, 20($sp)                    # 4-byte Folded Spill
	sw	$fp, 16($sp)                    # 4-byte Folded Spill
	move	$fp, $sp
	jal	_Z4foo0v
	nop
	move	$sp, $fp
	lw	$fp, 16($sp)                    # 4-byte Folded Reload
	lw	$ra, 20($sp)                    # 4-byte Folded Reload
	addiu	$sp, $sp, 24
	jrc	$ra

with tail calls enabled (-std=gnu++23 -O3 -march=mips32r6 -fno-PIC -mllvm -mips-tail-calls=1):

	addiu	$sp, $sp, -8
	sw	$ra, 4($sp)                     # 4-byte Folded Spill
	sw	$fp, 0($sp)                     # 4-byte Folded Spill
	move	$fp, $sp
	move	$sp, $fp
	lw	$fp, 0($sp)                     # 4-byte Folded Reload
	lw	$ra, 4($sp)                     # 4-byte Folded Reload
	j	_Z4foo0v
	addiu	$sp, $sp, 8

What is expected:

	j	_Z4foo0v
	nop

or

	bc	_Z4foo0v

So, it's moving the j to the end, but it's not cleaning up the stack changes that were around the jump. Everything in here but the j could be removed (with a nop, but the j should be replaced with a bc anyways), but this effectively is generating a rather messed up function.

A significant amount of logic for handling this issue is present in LowerCall for PPC and for AArch64 (and the rest), but is missing in the MIPS version.

Hey @ameisen, thanks a lot. Yes, I agree, it is missing for MIPS case in the ISel Lowering, but I think this should be a separate github issue, and separate PR should be created for this. Can you please create an issue with the content from this comment? Thanks in advance.

@ameisen
Copy link

ameisen commented Oct 31, 2025

This PR doesn't fix (if it was intended to) sibling call optimizations.
It does change the code generation for them, however - but it maintains all of the stack adjustments as well as the fp/sp moves.

__attribute__((noinline))
int foo0()
{
	volatile int a;
	return a += 1;
}

int bar0()
{
	return foo0();
}

with tail calls disabled (-std=gnu++23 -O3 -march=mips32r6 -fno-PIC -mllvm -mips-tail-calls=0):

	addiu	$sp, $sp, -24
	sw	$ra, 20($sp)                    # 4-byte Folded Spill
	sw	$fp, 16($sp)                    # 4-byte Folded Spill
	move	$fp, $sp
	jal	_Z4foo0v
	nop
	move	$sp, $fp
	lw	$fp, 16($sp)                    # 4-byte Folded Reload
	lw	$ra, 20($sp)                    # 4-byte Folded Reload
	addiu	$sp, $sp, 24
	jrc	$ra

with tail calls enabled (-std=gnu++23 -O3 -march=mips32r6 -fno-PIC -mllvm -mips-tail-calls=1):

	addiu	$sp, $sp, -8
	sw	$ra, 4($sp)                     # 4-byte Folded Spill
	sw	$fp, 0($sp)                     # 4-byte Folded Spill
	move	$fp, $sp
	move	$sp, $fp
	lw	$fp, 0($sp)                     # 4-byte Folded Reload
	lw	$ra, 4($sp)                     # 4-byte Folded Reload
	j	_Z4foo0v
	addiu	$sp, $sp, 8

What is expected:

	j	_Z4foo0v
	nop

or

	bc	_Z4foo0v

So, it's moving the j to the end, but it's not cleaning up the stack changes that were around the jump. Everything in here but the j could be removed (with a nop, but the j should be replaced with a bc anyways), but this effectively is generating a rather messed up function.
A significant amount of logic for handling this issue is present in LowerCall for PPC and for AArch64 (and the rest), but is missing in the MIPS version.

Hey @ameisen, thanks a lot. Yes, I agree, it is missing for MIPS case in the ISel Lowering, but I think this should be a separate github issue, and separate PR should be created for this. Can you please create an issue with the content from this comment? Thanks in advance.

I'm trying to figure out how to word the issue, as without your PR clang doesn't even turn it into a tail call at all. Since sibling optimizations are based upon tail call optimizations (they're a subset of them) it complicates an issue as I'm not sure how to write an issue dependent upon a PR.

I created this issue for now: #165906.

@ameisen
Copy link

ameisen commented Oct 31, 2025

As a separate problem, with that simple code sample, if I add the musttail attribute, with your PR I am getting this error:

'musttail' attribute for this call is impossible because calls outside the current linkage unit cannot be tail called on MIPS

If I make foo0 static it works, but that shouldn't be necessary in this situation. I'm also confused as to why it's partially performing the tail call optimization without musttail, but musttail fails.

When it's static, though, it's just inlining it instead, ignoring the noinline attribute.

--

Edit: The conditions that you use to check for the validity of musttail differ from from the conditions used in LowerCall to determine is a tail-call is possible. You are stricter in what the linkage of the function must be.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link

@ameisen ameisen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review related to the error that I encountered.

if (!FD->isDefined()) {
CGM.addUndefinedGlobalForTailCall({FD, Loc});
} else {
llvm::GlobalValue::LinkageTypes Linkage =
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conditions used here differ from the conditions used in MipsISelLowering. This causes it to be inconsistent and results in the error I provided in conversations.

Locally, I used this which still isn't correct but seems to work more consistently:

bool HasLocalLinkage =
  llvm::GlobalValue::isLocalLinkage(Linkage) || llvm::GlobalValue::isPrivateLinkage(Linkage);
if (!HasLocalLinkage || llvm::GlobalValue::isWeakForLinker(Linkage))

getDiags().Report(I.second, diag::err_mips_impossible_musttail) << 1;
continue;
}
llvm::GlobalValue::LinkageTypes Linkage =
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in CGCall.cpp, the conditions used here differ from the conditions used in MipsISelLowering.

@brad0
Copy link
Contributor

brad0 commented Nov 18, 2025

@djtodoro

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:MIPS clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[[clang::musttail]] triggers backend crash in llc on MIPS

7 participants