@dpaoliello
Contributor

This was discovered while looking at the codegen for x64 when Control Flow Guard is enabled.

When using SelectionDAG, LLVM would generate the following sequence for a CF guarded indirect call:

	leaq	target_func(%rip), %rax
	rex64 jmpq	*__guard_dispatch_icall_fptr(%rip) # TAILCALL

However, when Fast ISel was used, the following was generated:

	leaq	target_func(%rip), %rax
	movq	__guard_dispatch_icall_fptr(%rip), %rcx
	rex64 jmpq	*%rcx                   # TAILCALL

This was happening despite Fast ISel aborting and falling back to SelectionDAG.

The root cause is a special case in SelectionDAGISel: when Fast ISel aborts while lowering a CallInst, SelectionDAGISel tries to lower that instruction as its own basic block. For a CF Guard call, this means it lowers an indirect call to __guard_dispatch_icall_fptr without seeing that the function pointer was loaded in the preceding (and bundled) instruction.

The fix is to skip the special case when a CallInst has operand bundles, so that SelectionDAG lowers the call and its bundled operands together.
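
To illustrate the control flow described above, here is a toy model of the selection loop. It is only a sketch: `ToyInst`, `Path`, `selectBlock`, and the `SkipBundledCalls` flag are hypothetical names invented for this example and are not LLVM APIs. It assumes the loop walks the block from the last instruction to the first and hands everything not yet selected to SelectionDAG once it breaks out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR instruction; not an LLVM class.
struct ToyInst {
  std::string Name;
  bool IsCall;            // true for call instructions
  bool HasOperandBundles; // true if the call carries operand bundles
  bool FastISelHandles;   // true if Fast ISel can select this instruction
};

// Which selector ends up handling an instruction in this toy model.
enum class Path { FastISel, DAGCallAlone, DAGWithRestOfBlock };

// Walks the block from the last instruction to the first.
// SkipBundledCalls models the new check: when set, a failing call that has
// operand bundles no longer takes the "select the call on its own" path.
std::vector<Path> selectBlock(const std::vector<ToyInst> &Block,
                              bool SkipBundledCalls) {
  // Anything the loop never reaches is selected together by SelectionDAG.
  std::vector<Path> Result(Block.size(), Path::DAGWithRestOfBlock);
  for (std::size_t I = Block.size(); I-- > 0;) {
    const ToyInst &Inst = Block[I];
    if (Inst.FastISelHandles) {
      Result[I] = Path::FastISel;
      continue;
    }
    // Special case: a failing call is selected alone, then Fast ISel resumes
    // with the instructions that precede it.
    if (Inst.IsCall && !(SkipBundledCalls && Inst.HasOperandBundles)) {
      Result[I] = Path::DAGCallAlone;
      continue;
    }
    // Give up on Fast ISel: instructions 0..I go to SelectionDAG together.
    break;
  }
  return Result;
}
```

With a two-instruction block (a load that Fast ISel can handle, followed by a bundled call that it cannot), the model without the check selects the call alone and leaves the load to Fast ISel. With the check, both go to SelectionDAG together, which is what allows the load to be folded into the jump.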

@dpaoliello requested a review from nikic on October 10, 2025 17:39
@llvmbot added the backend:X86 and llvm:SelectionDAG labels on Oct 10, 2025
@llvmbot
Member

llvmbot commented Oct 10, 2025

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-llvm-selectiondag

Author: Daniel Paoliello (dpaoliello)

Full diff: https://github.com/llvm/llvm-project/pull/162895.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (+8)
  • (modified) llvm/test/CodeGen/X86/cfguard-checks.ll (+6-3)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index 175753f08d2b1..58539bd4cea3a 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -1825,6 +1825,14 @@ void SelectionDAGISel::SelectAllBasicBlocks(const Function &Fn) {
 
           reportFastISelFailure(*MF, *ORE, R, EnableFastISelAbort > 2);
 
+          // If the call has operand bundles, then it's best if they are handled
+          // together with the call instead of selecting the call as its own
+          // block.
+          if (cast<CallInst>(Inst)->hasOperandBundles()) {
+            NumFastIselFailures += NumFastIselRemaining;
+            break;
+          }
+
           if (!Inst->getType()->isVoidTy() && !Inst->getType()->isTokenTy() &&
               !Inst->use_empty()) {
             Register &R = FuncInfo->ValueMap[Inst];
diff --git a/llvm/test/CodeGen/X86/cfguard-checks.ll b/llvm/test/CodeGen/X86/cfguard-checks.ll
index a727bbbfdcbe3..3a2de718e8a1b 100644
--- a/llvm/test/CodeGen/X86/cfguard-checks.ll
+++ b/llvm/test/CodeGen/X86/cfguard-checks.ll
@@ -1,7 +1,9 @@
 ; RUN: llc < %s -mtriple=i686-pc-windows-msvc | FileCheck %s -check-prefix=X86
-; RUN: llc < %s -mtriple=x86_64-pc-windows-msvc | FileCheck %s -check-prefixes=X64,X64_MSVC
+; RUN: llc < %s -mtriple=x86_64-pc-windows-msvc | FileCheck %s -check-prefixes=X64,X64_MSVC,X64_SELDAG
+; RUN: llc < %s --fast-isel -mtriple=x86_64-pc-windows-msvc | FileCheck %s -check-prefixes=X64,X64_MSVC,X64_FISEL
 ; RUN: llc < %s -mtriple=i686-w64-windows-gnu | FileCheck %s -check-prefixes=X86,X86_MINGW
-; RUN: llc < %s -mtriple=x86_64-w64-windows-gnu | FileCheck %s -check-prefixes=X64,X64_MINGW
+; RUN: llc < %s -mtriple=x86_64-w64-windows-gnu | FileCheck %s -check-prefixes=X64,X64_MINGW,X64_SELDAG
+; RUN: llc < %s --fast-isel -mtriple=x86_64-w64-windows-gnu | FileCheck %s -check-prefixes=X64,X64_MINGW,X64_FISEL
 ; Control Flow Guard is currently only available on Windows
 
 ; Test that Control Flow Guard checks are correctly added when required.
@@ -27,7 +29,8 @@ entry:
   ; X64-LABEL: func_guard_nocf
   ; X64:       leaq	target_func(%rip), %rax
   ; X64-NOT: __guard_dispatch_icall_fptr
-  ; X64:       callq	*%rax
+  ; X64_SELDAG: callq	*%rax
+  ; X64_FISEL: callq	*32(%rsp)
 }
 attributes #0 = { "guard_nocf" }
 


reportFastISelFailure(*MF, *ORE, R, EnableFastISelAbort > 2);

// If the call has operand bundles, then it's best if they are handled
Contributor

I don't understand the flow here. This seems pretty far removed from where the problem is?

Contributor Author

The root cause of the problem is that SelectionDAG is designed to lower a CallInst and its bundled ops together. For CF Guard, that means both the indirect call and the address-of GlobalValue that are generated, which can then be lowered as a call through the address of a GlobalValue instead of reading the value and then calling through a register.

I'm not sure why SelectionDAGISel has this special case here (I assume it tries to lower calls via SelectionDAG while letting FastISel handle the rest of the block), but it breaks that assumption and causes suboptimal codegen.
