
Conversation

@brandtbucher
Contributor

CPython's experimental JIT compiler supports i686-pc-windows-msvc and uses Clang to generate machine code templates at interpreter build time. preserve_none and musttail are used for efficient dispatch between these templates, but preserve_none isn't available for this target, leading to worse performance.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:X86 clang:frontend Language frontend issues, e.g. anything involving "Sema" llvm:ir labels Jul 22, 2025
@llvmbot
Member

llvmbot commented Jul 22, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-x86

Author: Brandt Bucher (brandtbucher)

Changes

CPython's experimental JIT compiler supports i686-pc-windows-msvc and uses Clang to generate machine code templates at interpreter build time. preserve_none and musttail are used for efficient dispatch between these templates, but preserve_none isn't available for this target, leading to worse performance.


Patch is 28.42 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/150106.diff

10 Files Affected:

  • (modified) clang/include/clang/Basic/AttrDocs.td (+3-1)
  • (modified) clang/lib/Basic/Targets/X86.h (+1)
  • (modified) clang/test/Sema/preserve-none-call-conv.c (+1)
  • (modified) llvm/docs/LangRef.rst (+1-1)
  • (modified) llvm/lib/Target/X86/X86CallingConv.td (+8)
  • (modified) llvm/lib/Target/X86/X86RegisterInfo.cpp (+2-2)
  • (modified) llvm/test/CodeGen/X86/preserve_none_swift.ll (+1)
  • (modified) llvm/test/CodeGen/X86/preserve_nonecc_call.ll (+379-110)
  • (modified) llvm/test/CodeGen/X86/preserve_nonecc_call_win.ll (+59-17)
  • (modified) llvm/test/CodeGen/X86/preserve_nonecc_musttail.ll (+1)
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index fefdaba7f8bf5..bebd8cc16893e 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -6433,13 +6433,15 @@ experimental at this time.
 def PreserveNoneDocs : Documentation {
   let Category = DocCatCallingConvs;
   let Content = [{
-On X86-64 and AArch64 targets, this attribute changes the calling convention of a function.
+On X86, X86-64, and AArch64 targets, this attribute changes the calling convention of a function.
 The ``preserve_none`` calling convention tries to preserve as few general
 registers as possible. So all general registers are caller saved registers. It
 also uses more general registers to pass arguments. This attribute doesn't
 impact floating-point registers. ``preserve_none``'s ABI is still unstable, and
 may be changed in the future.
 
+- On X86, only ESP and EBP are preserved by the callee. Registers EDI, ESI, EDX,
+  ECX, and EAX now can be used to pass function arguments.
 - On X86-64, only RSP and RBP are preserved by the callee.
   Registers R12, R13, R14, R15, RDI, RSI, RDX, RCX, R8, R9, R11, and RAX now can
   be used to pass function arguments. Floating-point registers (XMMs/YMMs) still
diff --git a/clang/lib/Basic/Targets/X86.h b/clang/lib/Basic/Targets/X86.h
index ebc59c92f4c24..1d36f56b65a55 100644
--- a/clang/lib/Basic/Targets/X86.h
+++ b/clang/lib/Basic/Targets/X86.h
@@ -406,6 +406,7 @@ class LLVM_LIBRARY_VISIBILITY X86TargetInfo : public TargetInfo {
     case CC_X86RegCall:
     case CC_C:
     case CC_PreserveMost:
+    case CC_PreserveNone:
     case CC_Swift:
     case CC_X86Pascal:
     case CC_IntelOclBicc:
diff --git a/clang/test/Sema/preserve-none-call-conv.c b/clang/test/Sema/preserve-none-call-conv.c
index fc9463726e3f5..6b6c9957c2ba4 100644
--- a/clang/test/Sema/preserve-none-call-conv.c
+++ b/clang/test/Sema/preserve-none-call-conv.c
@@ -1,5 +1,6 @@
 // RUN: %clang_cc1 %s -fsyntax-only -triple x86_64-unknown-unknown -verify
 // RUN: %clang_cc1 %s -fsyntax-only -triple aarch64-unknown-unknown -verify
+// RUN: %clang_cc1 %s -fsyntax-only -triple i686-unknown-unknown -verify
 
 typedef void typedef_fun_t(int);
 
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 822e761444db7..fc9e37f1bfbc0 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -463,7 +463,7 @@ added in the future:
     registers to pass arguments. This attribute doesn't impact non-general
     purpose registers (e.g. floating point registers, on X86 XMMs/YMMs).
     Non-general purpose registers still follow the standard C calling
-    convention. Currently it is for x86_64 and AArch64 only.
+    convention. Currently it is for x86, x86_64, and AArch64 only.
 "``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
     Clang generates an access function to access C++-style Thread Local Storage
     (TLS). The access function generally has an entry block, an exit block and an
diff --git a/llvm/lib/Target/X86/X86CallingConv.td b/llvm/lib/Target/X86/X86CallingConv.td
index f020e0b55141c..6a8599a6c7c17 100644
--- a/llvm/lib/Target/X86/X86CallingConv.td
+++ b/llvm/lib/Target/X86/X86CallingConv.td
@@ -1051,6 +1051,12 @@ def CC_X86_64_Preserve_None : CallingConv<[
   CCDelegateTo<CC_X86_64_C>
 ]>;
 
+def CC_X86_32_Preserve_None : CallingConv<[
+  // 32-bit variant of CC_X86_64_Preserve_None, above.
+  CCIfType<[i32], CCAssignToReg<[EDI, ESI, EDX, ECX, EAX]>>,
+  CCDelegateTo<CC_X86_32_C>
+]>;
+
 //===----------------------------------------------------------------------===//
 // X86 Root Argument Calling Conventions
 //===----------------------------------------------------------------------===//
@@ -1072,6 +1078,7 @@ def CC_X86_32 : CallingConv<[
   CCIfCC<"CallingConv::X86_RegCall",
     CCIfSubtarget<"isTargetWin32()", CCIfRegCallv4<CCDelegateTo<CC_X86_32_RegCallv4_Win>>>>,
   CCIfCC<"CallingConv::X86_RegCall", CCDelegateTo<CC_X86_32_RegCall>>,
+  CCIfCC<"CallingConv::PreserveNone", CCDelegateTo<CC_X86_32_Preserve_None>>,
 
   // Otherwise, drop to normal X86-32 CC
   CCDelegateTo<CC_X86_32_C>
@@ -1187,6 +1194,7 @@ def CSR_64_AllRegs_AVX512 : CalleeSavedRegs<(sub (add CSR_64_MostRegs, RAX,
                                                       (sequence "K%u", 0, 7)),
                                                  (sequence "XMM%u", 0, 15))>;
 def CSR_64_NoneRegs    : CalleeSavedRegs<(add RBP)>;
+def CSR_32_NoneRegs    : CalleeSavedRegs<(add EBP)>;
 
 // Standard C + YMM6-15
 def CSR_Win64_Intel_OCL_BI_AVX : CalleeSavedRegs<(add RBX, RBP, RDI, RSI, R12,
diff --git a/llvm/lib/Target/X86/X86RegisterInfo.cpp b/llvm/lib/Target/X86/X86RegisterInfo.cpp
index 83b11eede829e..facad368c66d9 100644
--- a/llvm/lib/Target/X86/X86RegisterInfo.cpp
+++ b/llvm/lib/Target/X86/X86RegisterInfo.cpp
@@ -316,7 +316,7 @@ X86RegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
       return CSR_64_RT_AllRegs_AVX_SaveList;
     return CSR_64_RT_AllRegs_SaveList;
   case CallingConv::PreserveNone:
-    return CSR_64_NoneRegs_SaveList;
+    return Is64Bit ? CSR_64_NoneRegs_SaveList : CSR_32_NoneRegs_SaveList;
   case CallingConv::CXX_FAST_TLS:
     if (Is64Bit)
       return MF->getInfo<X86MachineFunctionInfo>()->isSplitCSR() ?
@@ -444,7 +444,7 @@ X86RegisterInfo::getCallPreservedMask(const MachineFunction &MF,
       return CSR_64_RT_AllRegs_AVX_RegMask;
     return CSR_64_RT_AllRegs_RegMask;
   case CallingConv::PreserveNone:
-    return CSR_64_NoneRegs_RegMask;
+    return Is64Bit ? CSR_64_NoneRegs_RegMask : CSR_32_NoneRegs_RegMask;
   case CallingConv::CXX_FAST_TLS:
     if (Is64Bit)
       return CSR_64_TLS_Darwin_RegMask;
diff --git a/llvm/test/CodeGen/X86/preserve_none_swift.ll b/llvm/test/CodeGen/X86/preserve_none_swift.ll
index 9a1c15190c6a2..bc64ee3b54f60 100644
--- a/llvm/test/CodeGen/X86/preserve_none_swift.ll
+++ b/llvm/test/CodeGen/X86/preserve_none_swift.ll
@@ -1,4 +1,5 @@
 ; RUN: not llc -mtriple=x86_64 %s -o - 2>&1 | FileCheck %s
+; RUN: not llc -mtriple=i686 %s -o - 2>&1 | FileCheck %s
 
 ; Swift attributes should not be used with preserve_none.
 
diff --git a/llvm/test/CodeGen/X86/preserve_nonecc_call.ll b/llvm/test/CodeGen/X86/preserve_nonecc_call.ll
index 500ebb139811a..c3044b42be35b 100644
--- a/llvm/test/CodeGen/X86/preserve_nonecc_call.ll
+++ b/llvm/test/CodeGen/X86/preserve_nonecc_call.ll
@@ -1,5 +1,6 @@
-; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
-; RUN: llc -mtriple=x86_64-unknown-unknown -mcpu=corei7 < %s | FileCheck %s
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=x86_64-unknown-unknown -mcpu=corei7 < %s | FileCheck %s --check-prefixes=X64
+; RUN: llc -mtriple=i686-unknown-unknown < %s | FileCheck %s --check-prefixes=X86
 
 ; This test checks various function call behaviors between preserve_none and
 ; normal calling conventions.
@@ -10,36 +11,57 @@ declare preserve_nonecc void @callee(ptr)
 ; of incompatible calling convention. Callee saved registers are saved/restored
 ; around the call.
 define void @caller1(ptr %a) {
-; CHECK-LABEL: caller1:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    pushq %r15
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    pushq %r14
-; CHECK-NEXT:    .cfi_def_cfa_offset 24
-; CHECK-NEXT:    pushq %r13
-; CHECK-NEXT:    .cfi_def_cfa_offset 32
-; CHECK-NEXT:    pushq %r12
-; CHECK-NEXT:    .cfi_def_cfa_offset 40
-; CHECK-NEXT:    pushq %rbx
-; CHECK-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-NEXT:    .cfi_offset %rbx, -48
-; CHECK-NEXT:    .cfi_offset %r12, -40
-; CHECK-NEXT:    .cfi_offset %r13, -32
-; CHECK-NEXT:    .cfi_offset %r14, -24
-; CHECK-NEXT:    .cfi_offset %r15, -16
-; CHECK-NEXT:    movq %rdi, %r12
-; CHECK-NEXT:    callq callee@PLT
-; CHECK-NEXT:    popq %rbx
-; CHECK-NEXT:    .cfi_def_cfa_offset 40
-; CHECK-NEXT:    popq %r12
-; CHECK-NEXT:    .cfi_def_cfa_offset 32
-; CHECK-NEXT:    popq %r13
-; CHECK-NEXT:    .cfi_def_cfa_offset 24
-; CHECK-NEXT:    popq %r14
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    popq %r15
-; CHECK-NEXT:    .cfi_def_cfa_offset 8
-; CHECK-NEXT:    retq
+; X64-LABEL: caller1:
+; X64:       # %bb.0:
+; X64-NEXT:    pushq %r15
+; X64-NEXT:    .cfi_def_cfa_offset 16
+; X64-NEXT:    pushq %r14
+; X64-NEXT:    .cfi_def_cfa_offset 24
+; X64-NEXT:    pushq %r13
+; X64-NEXT:    .cfi_def_cfa_offset 32
+; X64-NEXT:    pushq %r12
+; X64-NEXT:    .cfi_def_cfa_offset 40
+; X64-NEXT:    pushq %rbx
+; X64-NEXT:    .cfi_def_cfa_offset 48
+; X64-NEXT:    .cfi_offset %rbx, -48
+; X64-NEXT:    .cfi_offset %r12, -40
+; X64-NEXT:    .cfi_offset %r13, -32
+; X64-NEXT:    .cfi_offset %r14, -24
+; X64-NEXT:    .cfi_offset %r15, -16
+; X64-NEXT:    movq %rdi, %r12
+; X64-NEXT:    callq callee@PLT
+; X64-NEXT:    popq %rbx
+; X64-NEXT:    .cfi_def_cfa_offset 40
+; X64-NEXT:    popq %r12
+; X64-NEXT:    .cfi_def_cfa_offset 32
+; X64-NEXT:    popq %r13
+; X64-NEXT:    .cfi_def_cfa_offset 24
+; X64-NEXT:    popq %r14
+; X64-NEXT:    .cfi_def_cfa_offset 16
+; X64-NEXT:    popq %r15
+; X64-NEXT:    .cfi_def_cfa_offset 8
+; X64-NEXT:    retq
+;
+; X86-LABEL: caller1:
+; X86:       # %bb.0:
+; X86-NEXT:    pushl %ebx
+; X86-NEXT:    .cfi_def_cfa_offset 8
+; X86-NEXT:    pushl %edi
+; X86-NEXT:    .cfi_def_cfa_offset 12
+; X86-NEXT:    pushl %esi
+; X86-NEXT:    .cfi_def_cfa_offset 16
+; X86-NEXT:    .cfi_offset %esi, -16
+; X86-NEXT:    .cfi_offset %edi, -12
+; X86-NEXT:    .cfi_offset %ebx, -8
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %edi
+; X86-NEXT:    calll callee@PLT
+; X86-NEXT:    popl %esi
+; X86-NEXT:    .cfi_def_cfa_offset 12
+; X86-NEXT:    popl %edi
+; X86-NEXT:    .cfi_def_cfa_offset 8
+; X86-NEXT:    popl %ebx
+; X86-NEXT:    .cfi_def_cfa_offset 4
+; X86-NEXT:    retl
   tail call preserve_nonecc void @callee(ptr %a)
   ret void
 }
@@ -48,98 +70,345 @@ define void @caller1(ptr %a) {
 ; The tail call is preserved. No registers are saved/restored around the call.
 ; Actually a simple jmp instruction is generated.
 define preserve_nonecc void @caller2(ptr %a) {
-; CHECK-LABEL: caller2:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    jmp callee@PLT # TAILCALL
+; X64-LABEL: caller2:
+; X64:       # %bb.0:
+; X64-NEXT:    jmp callee@PLT # TAILCALL
+;
+; X86-LABEL: caller2:
+; X86:       # %bb.0:
+; X86-NEXT:    jmp callee@PLT # TAILCALL
   tail call preserve_nonecc void @callee(ptr %a)
   ret void
 }
 
 ; Preserve_none function can use more registers to pass parameters.
-declare preserve_nonecc i64 @callee_with_many_param2(i64 %a1, i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11)
-define preserve_nonecc i64 @callee_with_many_param(i64 %a1, i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11, i64 %a12) {
-; CHECK-LABEL: callee_with_many_param:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    pushq %rax
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    movq %r13, %r12
-; CHECK-NEXT:    movq %r14, %r13
-; CHECK-NEXT:    movq %r15, %r14
-; CHECK-NEXT:    movq %rdi, %r15
-; CHECK-NEXT:    movq %rsi, %rdi
-; CHECK-NEXT:    movq %rdx, %rsi
-; CHECK-NEXT:    movq %rcx, %rdx
-; CHECK-NEXT:    movq %r8, %rcx
-; CHECK-NEXT:    movq %r9, %r8
-; CHECK-NEXT:    movq %r11, %r9
-; CHECK-NEXT:    movq %rax, %r11
-; CHECK-NEXT:    callq callee_with_many_param2@PLT
-; CHECK-NEXT:    popq %rcx
-; CHECK-NEXT:    .cfi_def_cfa_offset 8
-; CHECK-NEXT:    retq
-  %ret = call preserve_nonecc i64 @callee_with_many_param2(i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11, i64 %a12)
+declare preserve_nonecc i64 @callee_with_11_params(i64 %a1, i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11)
+define preserve_nonecc i64 @callee_with_12_params(i64 %a1, i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11, i64 %a12) {
+; X64-LABEL: callee_with_12_params:
+; X64:       # %bb.0:
+; X64-NEXT:    pushq %rax
+; X64-NEXT:    .cfi_def_cfa_offset 16
+; X64-NEXT:    movq %r13, %r12
+; X64-NEXT:    movq %r14, %r13
+; X64-NEXT:    movq %r15, %r14
+; X64-NEXT:    movq %rdi, %r15
+; X64-NEXT:    movq %rsi, %rdi
+; X64-NEXT:    movq %rdx, %rsi
+; X64-NEXT:    movq %rcx, %rdx
+; X64-NEXT:    movq %r8, %rcx
+; X64-NEXT:    movq %r9, %r8
+; X64-NEXT:    movq %r11, %r9
+; X64-NEXT:    movq %rax, %r11
+; X64-NEXT:    callq callee_with_11_params@PLT
+; X64-NEXT:    popq %rcx
+; X64-NEXT:    .cfi_def_cfa_offset 8
+; X64-NEXT:    retq
+;
+; X86-LABEL: callee_with_12_params:
+; X86:       # %bb.0:
+; X86-NEXT:    pushl %ebp
+; X86-NEXT:    .cfi_def_cfa_offset 8
+; X86-NEXT:    .cfi_offset %ebp, -8
+; X86-NEXT:    movl %edx, %edi
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %ebx
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %ebp
+; X86-NEXT:    movl %ecx, %esi
+; X86-NEXT:    movl %eax, %edx
+; X86-NEXT:    movl %ebx, %ecx
+; X86-NEXT:    movl %ebp, %eax
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    calll callee_with_11_params@PLT
+; X86-NEXT:    addl $68, %esp
+; X86-NEXT:    .cfi_adjust_cfa_offset -68
+; X86-NEXT:    popl %ebp
+; X86-NEXT:    .cfi_def_cfa_offset 4
+; X86-NEXT:    retl
+  %ret = call preserve_nonecc i64 @callee_with_11_params(i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11, i64 %a12)
   ret i64 %ret
 }
 
 define i64 @caller3() {
-; CHECK-LABEL: caller3:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    pushq %r15
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    pushq %r14
-; CHECK-NEXT:    .cfi_def_cfa_offset 24
-; CHECK-NEXT:    pushq %r13
-; CHECK-NEXT:    .cfi_def_cfa_offset 32
-; CHECK-NEXT:    pushq %r12
-; CHECK-NEXT:    .cfi_def_cfa_offset 40
-; CHECK-NEXT:    pushq %rbx
-; CHECK-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-NEXT:    .cfi_offset %rbx, -48
-; CHECK-NEXT:    .cfi_offset %r12, -40
-; CHECK-NEXT:    .cfi_offset %r13, -32
-; CHECK-NEXT:    .cfi_offset %r14, -24
-; CHECK-NEXT:    .cfi_offset %r15, -16
-; CHECK-NEXT:    movl $1, %r12d
-; CHECK-NEXT:    movl $2, %r13d
-; CHECK-NEXT:    movl $3, %r14d
-; CHECK-NEXT:    movl $4, %r15d
-; CHECK-NEXT:    movl $5, %edi
-; CHECK-NEXT:    movl $6, %esi
-; CHECK-NEXT:    movl $7, %edx
-; CHECK-NEXT:    movl $8, %ecx
-; CHECK-NEXT:    movl $9, %r8d
-; CHECK-NEXT:    movl $10, %r9d
-; CHECK-NEXT:    movl $11, %r11d
-; CHECK-NEXT:    movl $12, %eax
-; CHECK-NEXT:    callq callee_with_many_param@PLT
-; CHECK-NEXT:    popq %rbx
-; CHECK-NEXT:    .cfi_def_cfa_offset 40
-; CHECK-NEXT:    popq %r12
-; CHECK-NEXT:    .cfi_def_cfa_offset 32
-; CHECK-NEXT:    popq %r13
-; CHECK-NEXT:    .cfi_def_cfa_offset 24
-; CHECK-NEXT:    popq %r14
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    popq %r15
-; CHECK-NEXT:    .cfi_def_cfa_offset 8
-; CHECK-NEXT:    retq
-  %ret = call preserve_nonecc i64 @callee_with_many_param(i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9, i64 10, i64 11, i64 12)
+; X64-LABEL: caller3:
+; X64:       # %bb.0:
+; X64-NEXT:    pushq %r15
+; X64-NEXT:    .cfi_def_cfa_offset 16
+; X64-NEXT:    pushq %r14
+; X64-NEXT:    .cfi_def_cfa_offset 24
+; X64-NEXT:    pushq %r13
+; X64-NEXT:    .cfi_def_cfa_offset 32
+; X64-NEXT:    pushq %r12
+; X64-NEXT:    .cfi_def_cfa_offset 40
+; X64-NEXT:    pushq %rbx
+; X64-NEXT:    .cfi_def_cfa_offset 48
+; X64-NEXT:    .cfi_offset %rbx, -48
+; X64-NEXT:    .cfi_offset %r12, -40
+; X64-NEXT:    .cfi_offset %r13, -32
+; X64-NEXT:    .cfi_offset %r14, -24
+; X64-NEXT:    .cfi_offset %r15, -16
+; X64-NEXT:    movl $1, %r12d
+; X64-NEXT:    movl $2, %r13d
+; X64-NEXT:    movl $3, %r14d
+; X64-NEXT:    movl $4, %r15d
+; X64-NEXT:    movl $5, %edi
+; X64-NEXT:    movl $6, %esi
+; X64-NEXT:    movl $7, %edx
+; X64-NEXT:    movl $8, %ecx
+; X64-NEXT:    movl $9, %r8d
+; X64-NEXT:    movl $10, %r9d
+; X64-NEXT:    movl $11, %r11d
+; X64-NEXT:    movl $12, %eax
+; X64-NEXT:    callq callee_with_12_params@PLT
+; X64-NEXT:    popq %rbx
+; X64-NEXT:    .cfi_def_cfa_offset 40
+; X64-NEXT:    popq %r12
+; X64-NEXT:    .cfi_def_cfa_offset 32
+; X64-NEXT:    popq %r13
+; X64-NEXT:    .cfi_def_cfa_offset 24
+; X64-NEXT:    popq %r14
+; X64-NEXT:    .cfi_def_cfa_offset 16
+; X64-NEXT:    popq %r15
+; X64-NEXT:    .cfi_def_cfa_offset 8
+; X64-NEXT:    retq
+;
+; X86-LABEL: caller3:
+; X86:       # %bb.0:
+; X86-NEXT:    pushl %ebx
+; X86-NEXT:    .cfi_def_cfa_offset 8
+; X86-NEXT:    pushl %edi
+; X86-NEXT:    .cfi_def_cfa_offset 12
+; X86-NEXT:    pushl %esi
+; X86-NEXT:    .cfi_def_cfa_offset 16
+; X86-NEXT:    .cfi_offset %esi, -16
+; X86-NEXT:    .cfi_offset %edi, -12
+; X86-NEXT:    .cfi_offset %ebx, -8
+; X86-NEXT:    movl $1, %edi
+; X86-NEXT:    xorl %esi, %esi
+; X86-NEXT:    movl $2, %edx
+; X86-NEXT:    xorl %ecx, %ecx
+; X86-NEXT:    movl $3, %eax
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $12
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $11
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $10
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $9
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $8
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $7
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $6
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $5
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $4
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    calll callee_with_12_params@PLT
+; X86-NEXT:    addl $76, %esp
+; X86-NEXT:    .cfi_adjust_cfa_offset -76
+; X86-NEXT:    popl %esi
+; X86-NEXT:    .cfi_def_cfa_offset 12
+; X86-NEXT:    popl %edi
+; X86-NEXT:    .cfi_def_cfa_offset 8
+; X86-NEXT:    popl %ebx
+; X86-NEXT:    .cfi_def_cfa_offset 4
+; X86-NEXT:    retl
+  %ret = call preserve_nonecc i64 @callee_with_12_params(i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9, i64 10, i64 11, i64 12)
   ret i64 %ret
 }
 
+declare preserve_nonecc i32 @callee_with_4_params(i32 %a1, i32 %a2, i32 %a3, i32 %a4)
+define preserve_nonecc i32 @callee_with_5_params(i32 %a1, i32 %a2, i32 %a3, i32 %a4, i32 %a5) {
+; X64-LABEL: callee_with_5_params:
+; X64:       # %bb.0:
+; X64-NEXT:    pushq %rax
+; X64-NEXT:    .cfi_def_cfa_offset 16
+; X64-NEXT:    movl %r13d, %r12d
+; X64-NEXT:    movl %r14d, %r13d
+; X64-NEXT:    movl %r15d, %r14d
+; X64-NEXT:    movl %edi, %r15d
+; X64-NEXT:    callq callee_with_4_params@PLT
+; X64-NEXT:    popq %rcx
...
[truncated]

@llvmbot
Member

llvmbot commented Jul 22, 2025

@llvm/pr-subscribers-llvm-ir

-declare preserve_nonecc i64 @callee_with_many_param2(i64 %a1, i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11)
-define preserve_nonecc i64 @callee_with_many_param(i64 %a1, i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11, i64 %a12) {
-; CHECK-LABEL: callee_with_many_param:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    pushq %rax
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    movq %r13, %r12
-; CHECK-NEXT:    movq %r14, %r13
-; CHECK-NEXT:    movq %r15, %r14
-; CHECK-NEXT:    movq %rdi, %r15
-; CHECK-NEXT:    movq %rsi, %rdi
-; CHECK-NEXT:    movq %rdx, %rsi
-; CHECK-NEXT:    movq %rcx, %rdx
-; CHECK-NEXT:    movq %r8, %rcx
-; CHECK-NEXT:    movq %r9, %r8
-; CHECK-NEXT:    movq %r11, %r9
-; CHECK-NEXT:    movq %rax, %r11
-; CHECK-NEXT:    callq callee_with_many_param2@PLT
-; CHECK-NEXT:    popq %rcx
-; CHECK-NEXT:    .cfi_def_cfa_offset 8
-; CHECK-NEXT:    retq
-  %ret = call preserve_nonecc i64 @callee_with_many_param2(i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11, i64 %a12)
+declare preserve_nonecc i64 @callee_with_11_params(i64 %a1, i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11)
+define preserve_nonecc i64 @callee_with_12_params(i64 %a1, i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11, i64 %a12) {
+; X64-LABEL: callee_with_12_params:
+; X64:       # %bb.0:
+; X64-NEXT:    pushq %rax
+; X64-NEXT:    .cfi_def_cfa_offset 16
+; X64-NEXT:    movq %r13, %r12
+; X64-NEXT:    movq %r14, %r13
+; X64-NEXT:    movq %r15, %r14
+; X64-NEXT:    movq %rdi, %r15
+; X64-NEXT:    movq %rsi, %rdi
+; X64-NEXT:    movq %rdx, %rsi
+; X64-NEXT:    movq %rcx, %rdx
+; X64-NEXT:    movq %r8, %rcx
+; X64-NEXT:    movq %r9, %r8
+; X64-NEXT:    movq %r11, %r9
+; X64-NEXT:    movq %rax, %r11
+; X64-NEXT:    callq callee_with_11_params@PLT
+; X64-NEXT:    popq %rcx
+; X64-NEXT:    .cfi_def_cfa_offset 8
+; X64-NEXT:    retq
+;
+; X86-LABEL: callee_with_12_params:
+; X86:       # %bb.0:
+; X86-NEXT:    pushl %ebp
+; X86-NEXT:    .cfi_def_cfa_offset 8
+; X86-NEXT:    .cfi_offset %ebp, -8
+; X86-NEXT:    movl %edx, %edi
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %ebx
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %ebp
+; X86-NEXT:    movl %ecx, %esi
+; X86-NEXT:    movl %eax, %edx
+; X86-NEXT:    movl %ebx, %ecx
+; X86-NEXT:    movl %ebp, %eax
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl {{[0-9]+}}(%esp)
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    calll callee_with_11_params@PLT
+; X86-NEXT:    addl $68, %esp
+; X86-NEXT:    .cfi_adjust_cfa_offset -68
+; X86-NEXT:    popl %ebp
+; X86-NEXT:    .cfi_def_cfa_offset 4
+; X86-NEXT:    retl
+  %ret = call preserve_nonecc i64 @callee_with_11_params(i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11, i64 %a12)
   ret i64 %ret
 }
 
 define i64 @caller3() {
-; CHECK-LABEL: caller3:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    pushq %r15
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    pushq %r14
-; CHECK-NEXT:    .cfi_def_cfa_offset 24
-; CHECK-NEXT:    pushq %r13
-; CHECK-NEXT:    .cfi_def_cfa_offset 32
-; CHECK-NEXT:    pushq %r12
-; CHECK-NEXT:    .cfi_def_cfa_offset 40
-; CHECK-NEXT:    pushq %rbx
-; CHECK-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-NEXT:    .cfi_offset %rbx, -48
-; CHECK-NEXT:    .cfi_offset %r12, -40
-; CHECK-NEXT:    .cfi_offset %r13, -32
-; CHECK-NEXT:    .cfi_offset %r14, -24
-; CHECK-NEXT:    .cfi_offset %r15, -16
-; CHECK-NEXT:    movl $1, %r12d
-; CHECK-NEXT:    movl $2, %r13d
-; CHECK-NEXT:    movl $3, %r14d
-; CHECK-NEXT:    movl $4, %r15d
-; CHECK-NEXT:    movl $5, %edi
-; CHECK-NEXT:    movl $6, %esi
-; CHECK-NEXT:    movl $7, %edx
-; CHECK-NEXT:    movl $8, %ecx
-; CHECK-NEXT:    movl $9, %r8d
-; CHECK-NEXT:    movl $10, %r9d
-; CHECK-NEXT:    movl $11, %r11d
-; CHECK-NEXT:    movl $12, %eax
-; CHECK-NEXT:    callq callee_with_many_param@PLT
-; CHECK-NEXT:    popq %rbx
-; CHECK-NEXT:    .cfi_def_cfa_offset 40
-; CHECK-NEXT:    popq %r12
-; CHECK-NEXT:    .cfi_def_cfa_offset 32
-; CHECK-NEXT:    popq %r13
-; CHECK-NEXT:    .cfi_def_cfa_offset 24
-; CHECK-NEXT:    popq %r14
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    popq %r15
-; CHECK-NEXT:    .cfi_def_cfa_offset 8
-; CHECK-NEXT:    retq
-  %ret = call preserve_nonecc i64 @callee_with_many_param(i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9, i64 10, i64 11, i64 12)
+; X64-LABEL: caller3:
+; X64:       # %bb.0:
+; X64-NEXT:    pushq %r15
+; X64-NEXT:    .cfi_def_cfa_offset 16
+; X64-NEXT:    pushq %r14
+; X64-NEXT:    .cfi_def_cfa_offset 24
+; X64-NEXT:    pushq %r13
+; X64-NEXT:    .cfi_def_cfa_offset 32
+; X64-NEXT:    pushq %r12
+; X64-NEXT:    .cfi_def_cfa_offset 40
+; X64-NEXT:    pushq %rbx
+; X64-NEXT:    .cfi_def_cfa_offset 48
+; X64-NEXT:    .cfi_offset %rbx, -48
+; X64-NEXT:    .cfi_offset %r12, -40
+; X64-NEXT:    .cfi_offset %r13, -32
+; X64-NEXT:    .cfi_offset %r14, -24
+; X64-NEXT:    .cfi_offset %r15, -16
+; X64-NEXT:    movl $1, %r12d
+; X64-NEXT:    movl $2, %r13d
+; X64-NEXT:    movl $3, %r14d
+; X64-NEXT:    movl $4, %r15d
+; X64-NEXT:    movl $5, %edi
+; X64-NEXT:    movl $6, %esi
+; X64-NEXT:    movl $7, %edx
+; X64-NEXT:    movl $8, %ecx
+; X64-NEXT:    movl $9, %r8d
+; X64-NEXT:    movl $10, %r9d
+; X64-NEXT:    movl $11, %r11d
+; X64-NEXT:    movl $12, %eax
+; X64-NEXT:    callq callee_with_12_params@PLT
+; X64-NEXT:    popq %rbx
+; X64-NEXT:    .cfi_def_cfa_offset 40
+; X64-NEXT:    popq %r12
+; X64-NEXT:    .cfi_def_cfa_offset 32
+; X64-NEXT:    popq %r13
+; X64-NEXT:    .cfi_def_cfa_offset 24
+; X64-NEXT:    popq %r14
+; X64-NEXT:    .cfi_def_cfa_offset 16
+; X64-NEXT:    popq %r15
+; X64-NEXT:    .cfi_def_cfa_offset 8
+; X64-NEXT:    retq
+;
+; X86-LABEL: caller3:
+; X86:       # %bb.0:
+; X86-NEXT:    pushl %ebx
+; X86-NEXT:    .cfi_def_cfa_offset 8
+; X86-NEXT:    pushl %edi
+; X86-NEXT:    .cfi_def_cfa_offset 12
+; X86-NEXT:    pushl %esi
+; X86-NEXT:    .cfi_def_cfa_offset 16
+; X86-NEXT:    .cfi_offset %esi, -16
+; X86-NEXT:    .cfi_offset %edi, -12
+; X86-NEXT:    .cfi_offset %ebx, -8
+; X86-NEXT:    movl $1, %edi
+; X86-NEXT:    xorl %esi, %esi
+; X86-NEXT:    movl $2, %edx
+; X86-NEXT:    xorl %ecx, %ecx
+; X86-NEXT:    movl $3, %eax
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $12
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $11
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $10
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $9
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $8
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $7
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $6
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $5
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $4
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    pushl $0
+; X86-NEXT:    .cfi_adjust_cfa_offset 4
+; X86-NEXT:    calll callee_with_12_params@PLT
+; X86-NEXT:    addl $76, %esp
+; X86-NEXT:    .cfi_adjust_cfa_offset -76
+; X86-NEXT:    popl %esi
+; X86-NEXT:    .cfi_def_cfa_offset 12
+; X86-NEXT:    popl %edi
+; X86-NEXT:    .cfi_def_cfa_offset 8
+; X86-NEXT:    popl %ebx
+; X86-NEXT:    .cfi_def_cfa_offset 4
+; X86-NEXT:    retl
+  %ret = call preserve_nonecc i64 @callee_with_12_params(i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9, i64 10, i64 11, i64 12)
   ret i64 %ret
 }
 
+declare preserve_nonecc i32 @callee_with_4_params(i32 %a1, i32 %a2, i32 %a3, i32 %a4)
+define preserve_nonecc i32 @callee_with_5_params(i32 %a1, i32 %a2, i32 %a3, i32 %a4, i32 %a5) {
+; X64-LABEL: callee_with_5_params:
+; X64:       # %bb.0:
+; X64-NEXT:    pushq %rax
+; X64-NEXT:    .cfi_def_cfa_offset 16
+; X64-NEXT:    movl %r13d, %r12d
+; X64-NEXT:    movl %r14d, %r13d
+; X64-NEXT:    movl %r15d, %r14d
+; X64-NEXT:    movl %edi, %r15d
+; X64-NEXT:    callq callee_with_4_params@PLT
+; X64-NEXT:    popq %rcx
...
[truncated]

@efriedma-quic (Collaborator) left a comment:

How does this interact with functions that need a base pointer (due to, for example, having both stack realignment and an alloca)? The X86ArgumentStackSlotRebase pass exists, but I don't think you're triggering it.

@MaskRay MaskRay requested review from RKSimon and phoebewang July 23, 2025 02:08
@brandtbucher (Contributor, Author) commented:

Thanks @efriedma-quic, I missed that the base pointer is ESI (not EBX) on 32-bit targets. I forgot about the "nest" parameter (ECX), and the GOT pointer (EBX) too.

How's this for the new parameter list?

diff --git a/llvm/lib/Target/X86/X86CallingConv.td b/llvm/lib/Target/X86/X86CallingConv.td
index 6a8599a6c7c1..32eedcb9ca79 100644
--- a/llvm/lib/Target/X86/X86CallingConv.td
+++ b/llvm/lib/Target/X86/X86CallingConv.td
@@ -1052,8 +1052,12 @@ def CC_X86_64_Preserve_None : CallingConv<[
 ]>;
 
 def CC_X86_32_Preserve_None : CallingConv<[
-  // 32-bit variant of CC_X86_64_Preserve_None, above.
-  CCIfType<[i32], CCAssignToReg<[EDI, ESI, EDX, ECX, EAX]>>,
+  // 32-bit variant of CC_X86_64_Preserve_None, above. Use everything except:
+  //   - EBP        frame pointer
+  //   - ECX        'nest' parameter
+  //   - ESI        base pointer
+  //   - EBX        GOT pointer for PLT calls
+  CCIfType<[i32], CCAssignToReg<[EDI, EDX, EAX]>>,
   CCDelegateTo<CC_X86_32_C>
 ]>;

@efriedma-quic (Collaborator) commented:

For "nest", we can probably forbid combining it with the preserve_none calling convention, as long as we can detect it and error out. There's no reason anyone would combine the two.

For the base pointer, you also need to worry about the callee-save register list: I don't think we have code to properly save/restore the base pointer if it gets clobbered by a call.

@brandtbucher (Contributor, Author) commented:

Okay. Both of those concerns already apply to the existing 64-bit flavor, right? Would you prefer to see them addressed here, or in a dedicated follow-up?

For "nest", we can forbid combining it with the preserves_none calling convention, probably, as long as we can detect it and error out. There's no reason anyone would combine the two.

Makes sense. Where do you think the best place to do this is?

For the base pointer, you also need to worry about the callee-save register list: I don't think we have code to properly save/restore the base pointer if it gets clobbered by a call.

Okay, this is just adding them to the existing CSR_*_NoneRegs, correct?

@brandtbucher (Contributor, Author) commented Jul 23, 2025:

How about we add the base pointers to both callee-save lists here and leave the nest parameter as a future improvement for another PR? The calling convention is ABI-unstable, so we can always tweak it later.

@efriedma-quic (Collaborator) commented:

Okay, this is just adding them to the existing CSR_*_NoneRegs, correct?

Yes.

Makes sense. Where do you think the best place to do this is?

I think similar sorts of diagnostics are in llvm/lib/Target/X86/X86ISelLoweringCall.cpp . (Grep for errorUnsupported.)

How about we add the base pointers to both callee-save lists here and leave the nest parameter as a future improvement for another PR? The calling convention is ABI-unstable, so we can always tweak it later.

That's probably fine.

@efriedma-quic efriedma-quic requested review from jyknight and rnk July 24, 2025 23:51
@efriedma-quic (Collaborator) left a comment:

LGTM, but I'd like a second set of eyes on this.

@brandtbucher (Contributor, Author) commented:

LGTM, but I'd like a second set of eyes on this.

Maybe @weiguozhi, the author of 64-bit preserve_none?

@weiguozhi (Contributor) commented:

For the base pointer, you also need to worry about the callee-save register list: I don't think we have code to properly save/restore the base pointer if it gets clobbered by a call.

Function X86FrameLowering::spillFPBP can do this for clobbered base pointer and frame pointer.

@brandtbucher (Contributor, Author) commented:

Function X86FrameLowering::spillFPBP can do this for clobbered base pointer and frame pointer.

So the RBX/ESI base pointers can be removed from the callee-saved-regs lists?

diff --git a/llvm/lib/Target/X86/X86CallingConv.td b/llvm/lib/Target/X86/X86CallingConv.td
index 9e5aaeb44334..32eedcb9ca79 100644
--- a/llvm/lib/Target/X86/X86CallingConv.td
+++ b/llvm/lib/Target/X86/X86CallingConv.td
@@ -1197,8 +1197,8 @@ def CSR_64_AllRegs_AVX512 : CalleeSavedRegs<(sub (add CSR_64_MostRegs, RAX,
                                                       (sequence "ZMM%u", 0, 31),
                                                       (sequence "K%u", 0, 7)),
                                                  (sequence "XMM%u", 0, 15))>;
-def CSR_64_NoneRegs    : CalleeSavedRegs<(add RBP, RBX)>;
-def CSR_32_NoneRegs    : CalleeSavedRegs<(add EBP, ESI)>;
+def CSR_64_NoneRegs    : CalleeSavedRegs<(add RBP)>;
+def CSR_32_NoneRegs    : CalleeSavedRegs<(add EBP)>;
 
 // Standard C + YMM6-15
 def CSR_Win64_Intel_OCL_BI_AVX : CalleeSavedRegs<(add RBX, RBP, RDI, RSI, R12,

@weiguozhi (Contributor) commented:

Function X86FrameLowering::spillFPBP can do this for clobbered base pointer and frame pointer.

So the RBX/ESI base pointers can be removed from the callee-saved-regs lists?

Yes, I think so.

@efriedma-quic (Collaborator) commented:

If you can write a testcase that shows we properly spill the base pointer when necessary, fine. (I didn't realize there was already code to deal with this sort of thing.)

@brandtbucher (Contributor, Author) commented:

Looks like it is indeed being spilled, even if it's not in the callee-saved regs:

define i8 @caller_with_base_pointer(i32 %n) alignstack(32) {
; X64-LABEL: caller_with_base_pointer:
; X64:       # %bb.0:
; X64-NEXT:    pushq %rbp
; X64-NEXT:    .cfi_def_cfa_offset 16
; X64-NEXT:    .cfi_offset %rbp, -16
; X64-NEXT:    movq %rsp, %rbp
; X64-NEXT:    .cfi_def_cfa_register %rbp
; X64-NEXT:    pushq %r15
; X64-NEXT:    pushq %r14
; X64-NEXT:    pushq %r13
; X64-NEXT:    pushq %r12
; X64-NEXT:    pushq %rbx
; X64-NEXT:    andq $-32, %rsp
; X64-NEXT:    subq $64, %rsp
; X64-NEXT:    movq %rsp, %rbx
; X64-NEXT:    .cfi_offset %rbx, -56
; X64-NEXT:    .cfi_offset %r12, -48
; X64-NEXT:    .cfi_offset %r13, -40
; X64-NEXT:    .cfi_offset %r14, -32
; X64-NEXT:    .cfi_offset %r15, -24
; X64-NEXT:    movq %rsp, %r12
; X64-NEXT:    movq %r12, 32(%rbx) # 8-byte Spill
; X64-NEXT:    movl %edi, %eax
; X64-NEXT:    addq $15, %rax
; X64-NEXT:    andq $-16, %rax
; X64-NEXT:    subq %rax, %r12
; X64-NEXT:    movq %r12, %rsp
; X64-NEXT:    negq %rax
; X64-NEXT:    movq %rax, 24(%rbx) # 8-byte Spill
; X64-NEXT:    pushq %rbx
; X64-NEXT:    pushq %rax
; X64-NEXT:    callq callee@PLT
; X64-NEXT:    addq $8, %rsp
; X64-NEXT:    popq %rbx
; X64-NEXT:    movq 32(%rbx), %rax # 8-byte Reload
; X64-NEXT:    movq 24(%rbx), %rcx # 8-byte Reload
; X64-NEXT:    movzbl (%rax,%rcx), %eax
; X64-NEXT:    leaq -40(%rbp), %rsp
; X64-NEXT:    popq %rbx
; X64-NEXT:    popq %r12
; X64-NEXT:    popq %r13
; X64-NEXT:    popq %r14
; X64-NEXT:    popq %r15
; X64-NEXT:    popq %rbp
; X64-NEXT:    .cfi_def_cfa %rsp, 8
; X64-NEXT:    retq
;
; X86-LABEL: caller_with_base_pointer:
; X86:       # %bb.0:
; X86-NEXT:    pushl %ebp
; X86-NEXT:    .cfi_def_cfa_offset 8
; X86-NEXT:    .cfi_offset %ebp, -8
; X86-NEXT:    movl %esp, %ebp
; X86-NEXT:    .cfi_def_cfa_register %ebp
; X86-NEXT:    pushl %ebx
; X86-NEXT:    pushl %edi
; X86-NEXT:    pushl %esi
; X86-NEXT:    andl $-32, %esp
; X86-NEXT:    subl $32, %esp
; X86-NEXT:    movl %esp, %esi
; X86-NEXT:    .cfi_offset %esi, -20
; X86-NEXT:    .cfi_offset %edi, -16
; X86-NEXT:    .cfi_offset %ebx, -12
; X86-NEXT:    movl 8(%ebp), %eax
; X86-NEXT:    movl %esp, %edi
; X86-NEXT:    movl %edi, 8(%esi) # 4-byte Spill
; X86-NEXT:    addl $3, %eax
; X86-NEXT:    andl $-4, %eax
; X86-NEXT:    subl %eax, %edi
; X86-NEXT:    movl %edi, %esp
; X86-NEXT:    negl %eax
; X86-NEXT:    movl %eax, 4(%esi) # 4-byte Spill
; X86-NEXT:    pushl %esi
; X86-NEXT:    calll callee@PLT
; X86-NEXT:    popl %esi
; X86-NEXT:    movl 8(%esi), %eax # 4-byte Reload
; X86-NEXT:    movl 4(%esi), %ecx # 4-byte Reload
; X86-NEXT:    movzbl (%eax,%ecx), %eax
; X86-NEXT:    leal -12(%ebp), %esp
; X86-NEXT:    popl %esi
; X86-NEXT:    popl %edi
; X86-NEXT:    popl %ebx
; X86-NEXT:    popl %ebp
; X86-NEXT:    .cfi_def_cfa %esp, 4
; X86-NEXT:    retl
  %a = alloca i8, i32 %n
  call preserve_nonecc void @callee(ptr %a)
  %r = load i8, ptr %a
  ret i8 %r
}

I'll add this test.

