Implement preserve_none for 32-bit x86
#150106
@llvm/pr-subscribers-clang @llvm/pr-subscribers-backend-x86

Author: Brandt Bucher (brandtbucher)

Changes

CPython's experimental JIT compiler supports

Patch is 28.42 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/150106.diff

10 Files Affected:
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index fefdaba7f8bf5..bebd8cc16893e 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -6433,13 +6433,15 @@ experimental at this time.
def PreserveNoneDocs : Documentation {
let Category = DocCatCallingConvs;
let Content = [{
-On X86-64 and AArch64 targets, this attribute changes the calling convention of a function.
+On X86, X86-64, and AArch64 targets, this attribute changes the calling convention of a function.
The ``preserve_none`` calling convention tries to preserve as few general
registers as possible. So all general registers are caller saved registers. It
also uses more general registers to pass arguments. This attribute doesn't
impact floating-point registers. ``preserve_none``'s ABI is still unstable, and
may be changed in the future.
+- On X86, only ESP and EBP are preserved by the callee. Registers EDI, ESI, EDX,
+ ECX, and EAX now can be used to pass function arguments.
- On X86-64, only RSP and RBP are preserved by the callee.
Registers R12, R13, R14, R15, RDI, RSI, RDX, RCX, R8, R9, R11, and RAX now can
be used to pass function arguments. Floating-point registers (XMMs/YMMs) still
diff --git a/clang/lib/Basic/Targets/X86.h b/clang/lib/Basic/Targets/X86.h
index ebc59c92f4c24..1d36f56b65a55 100644
--- a/clang/lib/Basic/Targets/X86.h
+++ b/clang/lib/Basic/Targets/X86.h
@@ -406,6 +406,7 @@ class LLVM_LIBRARY_VISIBILITY X86TargetInfo : public TargetInfo {
case CC_X86RegCall:
case CC_C:
case CC_PreserveMost:
+ case CC_PreserveNone:
case CC_Swift:
case CC_X86Pascal:
case CC_IntelOclBicc:
diff --git a/clang/test/Sema/preserve-none-call-conv.c b/clang/test/Sema/preserve-none-call-conv.c
index fc9463726e3f5..6b6c9957c2ba4 100644
--- a/clang/test/Sema/preserve-none-call-conv.c
+++ b/clang/test/Sema/preserve-none-call-conv.c
@@ -1,5 +1,6 @@
// RUN: %clang_cc1 %s -fsyntax-only -triple x86_64-unknown-unknown -verify
// RUN: %clang_cc1 %s -fsyntax-only -triple aarch64-unknown-unknown -verify
+// RUN: %clang_cc1 %s -fsyntax-only -triple i686-unknown-unknown -verify
typedef void typedef_fun_t(int);
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 822e761444db7..fc9e37f1bfbc0 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -463,7 +463,7 @@ added in the future:
registers to pass arguments. This attribute doesn't impact non-general
purpose registers (e.g. floating point registers, on X86 XMMs/YMMs).
Non-general purpose registers still follow the standard C calling
- convention. Currently it is for x86_64 and AArch64 only.
+ convention. Currently it is for x86, x86_64, and AArch64 only.
"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
Clang generates an access function to access C++-style Thread Local Storage
(TLS). The access function generally has an entry block, an exit block and an
diff --git a/llvm/lib/Target/X86/X86CallingConv.td b/llvm/lib/Target/X86/X86CallingConv.td
index f020e0b55141c..6a8599a6c7c17 100644
--- a/llvm/lib/Target/X86/X86CallingConv.td
+++ b/llvm/lib/Target/X86/X86CallingConv.td
@@ -1051,6 +1051,12 @@ def CC_X86_64_Preserve_None : CallingConv<[
CCDelegateTo<CC_X86_64_C>
]>;
+def CC_X86_32_Preserve_None : CallingConv<[
+ // 32-bit variant of CC_X86_64_Preserve_None, above.
+ CCIfType<[i32], CCAssignToReg<[EDI, ESI, EDX, ECX, EAX]>>,
+ CCDelegateTo<CC_X86_32_C>
+]>;
+
//===----------------------------------------------------------------------===//
// X86 Root Argument Calling Conventions
//===----------------------------------------------------------------------===//
@@ -1072,6 +1078,7 @@ def CC_X86_32 : CallingConv<[
CCIfCC<"CallingConv::X86_RegCall",
CCIfSubtarget<"isTargetWin32()", CCIfRegCallv4<CCDelegateTo<CC_X86_32_RegCallv4_Win>>>>,
CCIfCC<"CallingConv::X86_RegCall", CCDelegateTo<CC_X86_32_RegCall>>,
+ CCIfCC<"CallingConv::PreserveNone", CCDelegateTo<CC_X86_32_Preserve_None>>,
// Otherwise, drop to normal X86-32 CC
CCDelegateTo<CC_X86_32_C>
@@ -1187,6 +1194,7 @@ def CSR_64_AllRegs_AVX512 : CalleeSavedRegs<(sub (add CSR_64_MostRegs, RAX,
(sequence "K%u", 0, 7)),
(sequence "XMM%u", 0, 15))>;
def CSR_64_NoneRegs : CalleeSavedRegs<(add RBP)>;
+def CSR_32_NoneRegs : CalleeSavedRegs<(add EBP)>;
// Standard C + YMM6-15
def CSR_Win64_Intel_OCL_BI_AVX : CalleeSavedRegs<(add RBX, RBP, RDI, RSI, R12,
diff --git a/llvm/lib/Target/X86/X86RegisterInfo.cpp b/llvm/lib/Target/X86/X86RegisterInfo.cpp
index 83b11eede829e..facad368c66d9 100644
--- a/llvm/lib/Target/X86/X86RegisterInfo.cpp
+++ b/llvm/lib/Target/X86/X86RegisterInfo.cpp
@@ -316,7 +316,7 @@ X86RegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
return CSR_64_RT_AllRegs_AVX_SaveList;
return CSR_64_RT_AllRegs_SaveList;
case CallingConv::PreserveNone:
- return CSR_64_NoneRegs_SaveList;
+ return Is64Bit ? CSR_64_NoneRegs_SaveList : CSR_32_NoneRegs_SaveList;
case CallingConv::CXX_FAST_TLS:
if (Is64Bit)
return MF->getInfo<X86MachineFunctionInfo>()->isSplitCSR() ?
@@ -444,7 +444,7 @@ X86RegisterInfo::getCallPreservedMask(const MachineFunction &MF,
return CSR_64_RT_AllRegs_AVX_RegMask;
return CSR_64_RT_AllRegs_RegMask;
case CallingConv::PreserveNone:
- return CSR_64_NoneRegs_RegMask;
+ return Is64Bit ? CSR_64_NoneRegs_RegMask : CSR_32_NoneRegs_RegMask;
case CallingConv::CXX_FAST_TLS:
if (Is64Bit)
return CSR_64_TLS_Darwin_RegMask;
diff --git a/llvm/test/CodeGen/X86/preserve_none_swift.ll b/llvm/test/CodeGen/X86/preserve_none_swift.ll
index 9a1c15190c6a2..bc64ee3b54f60 100644
--- a/llvm/test/CodeGen/X86/preserve_none_swift.ll
+++ b/llvm/test/CodeGen/X86/preserve_none_swift.ll
@@ -1,4 +1,5 @@
; RUN: not llc -mtriple=x86_64 %s -o - 2>&1 | FileCheck %s
+; RUN: not llc -mtriple=i686 %s -o - 2>&1 | FileCheck %s
; Swift attributes should not be used with preserve_none.
diff --git a/llvm/test/CodeGen/X86/preserve_nonecc_call.ll b/llvm/test/CodeGen/X86/preserve_nonecc_call.ll
index 500ebb139811a..c3044b42be35b 100644
--- a/llvm/test/CodeGen/X86/preserve_nonecc_call.ll
+++ b/llvm/test/CodeGen/X86/preserve_nonecc_call.ll
@@ -1,5 +1,6 @@
-; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
-; RUN: llc -mtriple=x86_64-unknown-unknown -mcpu=corei7 < %s | FileCheck %s
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=x86_64-unknown-unknown -mcpu=corei7 < %s | FileCheck %s --check-prefixes=X64
+; RUN: llc -mtriple=i686-unknown-unknown < %s | FileCheck %s --check-prefixes=X86
; This test checks various function call behaviors between preserve_none and
; normal calling conventions.
@@ -10,36 +11,57 @@ declare preserve_nonecc void @callee(ptr)
; of incompatible calling convention. Callee saved registers are saved/restored
; around the call.
define void @caller1(ptr %a) {
-; CHECK-LABEL: caller1:
-; CHECK: # %bb.0:
-; CHECK-NEXT: pushq %r15
-; CHECK-NEXT: .cfi_def_cfa_offset 16
-; CHECK-NEXT: pushq %r14
-; CHECK-NEXT: .cfi_def_cfa_offset 24
-; CHECK-NEXT: pushq %r13
-; CHECK-NEXT: .cfi_def_cfa_offset 32
-; CHECK-NEXT: pushq %r12
-; CHECK-NEXT: .cfi_def_cfa_offset 40
-; CHECK-NEXT: pushq %rbx
-; CHECK-NEXT: .cfi_def_cfa_offset 48
-; CHECK-NEXT: .cfi_offset %rbx, -48
-; CHECK-NEXT: .cfi_offset %r12, -40
-; CHECK-NEXT: .cfi_offset %r13, -32
-; CHECK-NEXT: .cfi_offset %r14, -24
-; CHECK-NEXT: .cfi_offset %r15, -16
-; CHECK-NEXT: movq %rdi, %r12
-; CHECK-NEXT: callq callee@PLT
-; CHECK-NEXT: popq %rbx
-; CHECK-NEXT: .cfi_def_cfa_offset 40
-; CHECK-NEXT: popq %r12
-; CHECK-NEXT: .cfi_def_cfa_offset 32
-; CHECK-NEXT: popq %r13
-; CHECK-NEXT: .cfi_def_cfa_offset 24
-; CHECK-NEXT: popq %r14
-; CHECK-NEXT: .cfi_def_cfa_offset 16
-; CHECK-NEXT: popq %r15
-; CHECK-NEXT: .cfi_def_cfa_offset 8
-; CHECK-NEXT: retq
+; X64-LABEL: caller1:
+; X64: # %bb.0:
+; X64-NEXT: pushq %r15
+; X64-NEXT: .cfi_def_cfa_offset 16
+; X64-NEXT: pushq %r14
+; X64-NEXT: .cfi_def_cfa_offset 24
+; X64-NEXT: pushq %r13
+; X64-NEXT: .cfi_def_cfa_offset 32
+; X64-NEXT: pushq %r12
+; X64-NEXT: .cfi_def_cfa_offset 40
+; X64-NEXT: pushq %rbx
+; X64-NEXT: .cfi_def_cfa_offset 48
+; X64-NEXT: .cfi_offset %rbx, -48
+; X64-NEXT: .cfi_offset %r12, -40
+; X64-NEXT: .cfi_offset %r13, -32
+; X64-NEXT: .cfi_offset %r14, -24
+; X64-NEXT: .cfi_offset %r15, -16
+; X64-NEXT: movq %rdi, %r12
+; X64-NEXT: callq callee@PLT
+; X64-NEXT: popq %rbx
+; X64-NEXT: .cfi_def_cfa_offset 40
+; X64-NEXT: popq %r12
+; X64-NEXT: .cfi_def_cfa_offset 32
+; X64-NEXT: popq %r13
+; X64-NEXT: .cfi_def_cfa_offset 24
+; X64-NEXT: popq %r14
+; X64-NEXT: .cfi_def_cfa_offset 16
+; X64-NEXT: popq %r15
+; X64-NEXT: .cfi_def_cfa_offset 8
+; X64-NEXT: retq
+;
+; X86-LABEL: caller1:
+; X86: # %bb.0:
+; X86-NEXT: pushl %ebx
+; X86-NEXT: .cfi_def_cfa_offset 8
+; X86-NEXT: pushl %edi
+; X86-NEXT: .cfi_def_cfa_offset 12
+; X86-NEXT: pushl %esi
+; X86-NEXT: .cfi_def_cfa_offset 16
+; X86-NEXT: .cfi_offset %esi, -16
+; X86-NEXT: .cfi_offset %edi, -12
+; X86-NEXT: .cfi_offset %ebx, -8
+; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
+; X86-NEXT: calll callee@PLT
+; X86-NEXT: popl %esi
+; X86-NEXT: .cfi_def_cfa_offset 12
+; X86-NEXT: popl %edi
+; X86-NEXT: .cfi_def_cfa_offset 8
+; X86-NEXT: popl %ebx
+; X86-NEXT: .cfi_def_cfa_offset 4
+; X86-NEXT: retl
tail call preserve_nonecc void @callee(ptr %a)
ret void
}
@@ -48,98 +70,345 @@ define void @caller1(ptr %a) {
; The tail call is preserved. No registers are saved/restored around the call.
; Actually a simple jmp instruction is generated.
define preserve_nonecc void @caller2(ptr %a) {
-; CHECK-LABEL: caller2:
-; CHECK: # %bb.0:
-; CHECK-NEXT: jmp callee@PLT # TAILCALL
+; X64-LABEL: caller2:
+; X64: # %bb.0:
+; X64-NEXT: jmp callee@PLT # TAILCALL
+;
+; X86-LABEL: caller2:
+; X86: # %bb.0:
+; X86-NEXT: jmp callee@PLT # TAILCALL
tail call preserve_nonecc void @callee(ptr %a)
ret void
}
; Preserve_none function can use more registers to pass parameters.
-declare preserve_nonecc i64 @callee_with_many_param2(i64 %a1, i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11)
-define preserve_nonecc i64 @callee_with_many_param(i64 %a1, i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11, i64 %a12) {
-; CHECK-LABEL: callee_with_many_param:
-; CHECK: # %bb.0:
-; CHECK-NEXT: pushq %rax
-; CHECK-NEXT: .cfi_def_cfa_offset 16
-; CHECK-NEXT: movq %r13, %r12
-; CHECK-NEXT: movq %r14, %r13
-; CHECK-NEXT: movq %r15, %r14
-; CHECK-NEXT: movq %rdi, %r15
-; CHECK-NEXT: movq %rsi, %rdi
-; CHECK-NEXT: movq %rdx, %rsi
-; CHECK-NEXT: movq %rcx, %rdx
-; CHECK-NEXT: movq %r8, %rcx
-; CHECK-NEXT: movq %r9, %r8
-; CHECK-NEXT: movq %r11, %r9
-; CHECK-NEXT: movq %rax, %r11
-; CHECK-NEXT: callq callee_with_many_param2@PLT
-; CHECK-NEXT: popq %rcx
-; CHECK-NEXT: .cfi_def_cfa_offset 8
-; CHECK-NEXT: retq
- %ret = call preserve_nonecc i64 @callee_with_many_param2(i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11, i64 %a12)
+declare preserve_nonecc i64 @callee_with_11_params(i64 %a1, i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11)
+define preserve_nonecc i64 @callee_with_12_params(i64 %a1, i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11, i64 %a12) {
+; X64-LABEL: callee_with_12_params:
+; X64: # %bb.0:
+; X64-NEXT: pushq %rax
+; X64-NEXT: .cfi_def_cfa_offset 16
+; X64-NEXT: movq %r13, %r12
+; X64-NEXT: movq %r14, %r13
+; X64-NEXT: movq %r15, %r14
+; X64-NEXT: movq %rdi, %r15
+; X64-NEXT: movq %rsi, %rdi
+; X64-NEXT: movq %rdx, %rsi
+; X64-NEXT: movq %rcx, %rdx
+; X64-NEXT: movq %r8, %rcx
+; X64-NEXT: movq %r9, %r8
+; X64-NEXT: movq %r11, %r9
+; X64-NEXT: movq %rax, %r11
+; X64-NEXT: callq callee_with_11_params@PLT
+; X64-NEXT: popq %rcx
+; X64-NEXT: .cfi_def_cfa_offset 8
+; X64-NEXT: retq
+;
+; X86-LABEL: callee_with_12_params:
+; X86: # %bb.0:
+; X86-NEXT: pushl %ebp
+; X86-NEXT: .cfi_def_cfa_offset 8
+; X86-NEXT: .cfi_offset %ebp, -8
+; X86-NEXT: movl %edx, %edi
+; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx
+; X86-NEXT: movl {{[0-9]+}}(%esp), %ebp
+; X86-NEXT: movl %ecx, %esi
+; X86-NEXT: movl %eax, %edx
+; X86-NEXT: movl %ebx, %ecx
+; X86-NEXT: movl %ebp, %eax
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl {{[0-9]+}}(%esp)
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: calll callee_with_11_params@PLT
+; X86-NEXT: addl $68, %esp
+; X86-NEXT: .cfi_adjust_cfa_offset -68
+; X86-NEXT: popl %ebp
+; X86-NEXT: .cfi_def_cfa_offset 4
+; X86-NEXT: retl
+ %ret = call preserve_nonecc i64 @callee_with_11_params(i64 %a2, i64 %a3, i64 %a4, i64 %a5, i64 %a6, i64 %a7, i64 %a8, i64 %a9, i64 %a10, i64 %a11, i64 %a12)
ret i64 %ret
}
define i64 @caller3() {
-; CHECK-LABEL: caller3:
-; CHECK: # %bb.0:
-; CHECK-NEXT: pushq %r15
-; CHECK-NEXT: .cfi_def_cfa_offset 16
-; CHECK-NEXT: pushq %r14
-; CHECK-NEXT: .cfi_def_cfa_offset 24
-; CHECK-NEXT: pushq %r13
-; CHECK-NEXT: .cfi_def_cfa_offset 32
-; CHECK-NEXT: pushq %r12
-; CHECK-NEXT: .cfi_def_cfa_offset 40
-; CHECK-NEXT: pushq %rbx
-; CHECK-NEXT: .cfi_def_cfa_offset 48
-; CHECK-NEXT: .cfi_offset %rbx, -48
-; CHECK-NEXT: .cfi_offset %r12, -40
-; CHECK-NEXT: .cfi_offset %r13, -32
-; CHECK-NEXT: .cfi_offset %r14, -24
-; CHECK-NEXT: .cfi_offset %r15, -16
-; CHECK-NEXT: movl $1, %r12d
-; CHECK-NEXT: movl $2, %r13d
-; CHECK-NEXT: movl $3, %r14d
-; CHECK-NEXT: movl $4, %r15d
-; CHECK-NEXT: movl $5, %edi
-; CHECK-NEXT: movl $6, %esi
-; CHECK-NEXT: movl $7, %edx
-; CHECK-NEXT: movl $8, %ecx
-; CHECK-NEXT: movl $9, %r8d
-; CHECK-NEXT: movl $10, %r9d
-; CHECK-NEXT: movl $11, %r11d
-; CHECK-NEXT: movl $12, %eax
-; CHECK-NEXT: callq callee_with_many_param@PLT
-; CHECK-NEXT: popq %rbx
-; CHECK-NEXT: .cfi_def_cfa_offset 40
-; CHECK-NEXT: popq %r12
-; CHECK-NEXT: .cfi_def_cfa_offset 32
-; CHECK-NEXT: popq %r13
-; CHECK-NEXT: .cfi_def_cfa_offset 24
-; CHECK-NEXT: popq %r14
-; CHECK-NEXT: .cfi_def_cfa_offset 16
-; CHECK-NEXT: popq %r15
-; CHECK-NEXT: .cfi_def_cfa_offset 8
-; CHECK-NEXT: retq
- %ret = call preserve_nonecc i64 @callee_with_many_param(i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9, i64 10, i64 11, i64 12)
+; X64-LABEL: caller3:
+; X64: # %bb.0:
+; X64-NEXT: pushq %r15
+; X64-NEXT: .cfi_def_cfa_offset 16
+; X64-NEXT: pushq %r14
+; X64-NEXT: .cfi_def_cfa_offset 24
+; X64-NEXT: pushq %r13
+; X64-NEXT: .cfi_def_cfa_offset 32
+; X64-NEXT: pushq %r12
+; X64-NEXT: .cfi_def_cfa_offset 40
+; X64-NEXT: pushq %rbx
+; X64-NEXT: .cfi_def_cfa_offset 48
+; X64-NEXT: .cfi_offset %rbx, -48
+; X64-NEXT: .cfi_offset %r12, -40
+; X64-NEXT: .cfi_offset %r13, -32
+; X64-NEXT: .cfi_offset %r14, -24
+; X64-NEXT: .cfi_offset %r15, -16
+; X64-NEXT: movl $1, %r12d
+; X64-NEXT: movl $2, %r13d
+; X64-NEXT: movl $3, %r14d
+; X64-NEXT: movl $4, %r15d
+; X64-NEXT: movl $5, %edi
+; X64-NEXT: movl $6, %esi
+; X64-NEXT: movl $7, %edx
+; X64-NEXT: movl $8, %ecx
+; X64-NEXT: movl $9, %r8d
+; X64-NEXT: movl $10, %r9d
+; X64-NEXT: movl $11, %r11d
+; X64-NEXT: movl $12, %eax
+; X64-NEXT: callq callee_with_12_params@PLT
+; X64-NEXT: popq %rbx
+; X64-NEXT: .cfi_def_cfa_offset 40
+; X64-NEXT: popq %r12
+; X64-NEXT: .cfi_def_cfa_offset 32
+; X64-NEXT: popq %r13
+; X64-NEXT: .cfi_def_cfa_offset 24
+; X64-NEXT: popq %r14
+; X64-NEXT: .cfi_def_cfa_offset 16
+; X64-NEXT: popq %r15
+; X64-NEXT: .cfi_def_cfa_offset 8
+; X64-NEXT: retq
+;
+; X86-LABEL: caller3:
+; X86: # %bb.0:
+; X86-NEXT: pushl %ebx
+; X86-NEXT: .cfi_def_cfa_offset 8
+; X86-NEXT: pushl %edi
+; X86-NEXT: .cfi_def_cfa_offset 12
+; X86-NEXT: pushl %esi
+; X86-NEXT: .cfi_def_cfa_offset 16
+; X86-NEXT: .cfi_offset %esi, -16
+; X86-NEXT: .cfi_offset %edi, -12
+; X86-NEXT: .cfi_offset %ebx, -8
+; X86-NEXT: movl $1, %edi
+; X86-NEXT: xorl %esi, %esi
+; X86-NEXT: movl $2, %edx
+; X86-NEXT: xorl %ecx, %ecx
+; X86-NEXT: movl $3, %eax
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $12
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $11
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $10
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $9
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $8
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $7
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $6
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $5
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $4
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: calll callee_with_12_params@PLT
+; X86-NEXT: addl $76, %esp
+; X86-NEXT: .cfi_adjust_cfa_offset -76
+; X86-NEXT: popl %esi
+; X86-NEXT: .cfi_def_cfa_offset 12
+; X86-NEXT: popl %edi
+; X86-NEXT: .cfi_def_cfa_offset 8
+; X86-NEXT: popl %ebx
+; X86-NEXT: .cfi_def_cfa_offset 4
+; X86-NEXT: retl
+ %ret = call preserve_nonecc i64 @callee_with_12_params(i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9, i64 10, i64 11, i64 12)
ret i64 %ret
}
+declare preserve_nonecc i32 @callee_with_4_params(i32 %a1, i32 %a2, i32 %a3, i32 %a4)
+define preserve_nonecc i32 @callee_with_5_params(i32 %a1, i32 %a2, i32 %a3, i32 %a4, i32 %a5) {
+; X64-LABEL: callee_with_5_params:
+; X64: # %bb.0:
+; X64-NEXT: pushq %rax
+; X64-NEXT: .cfi_def_cfa_offset 16
+; X64-NEXT: movl %r13d, %r12d
+; X64-NEXT: movl %r14d, %r13d
+; X64-NEXT: movl %r15d, %r14d
+; X64-NEXT: movl %edi, %r15d
+; X64-NEXT: callq callee_with_4_params@PLT
+; X64-NEXT: popq %rcx
...
[truncated]
ret i64 %ret
}
define i64 @caller3() {
-; CHECK-LABEL: caller3:
-; CHECK: # %bb.0:
-; CHECK-NEXT: pushq %r15
-; CHECK-NEXT: .cfi_def_cfa_offset 16
-; CHECK-NEXT: pushq %r14
-; CHECK-NEXT: .cfi_def_cfa_offset 24
-; CHECK-NEXT: pushq %r13
-; CHECK-NEXT: .cfi_def_cfa_offset 32
-; CHECK-NEXT: pushq %r12
-; CHECK-NEXT: .cfi_def_cfa_offset 40
-; CHECK-NEXT: pushq %rbx
-; CHECK-NEXT: .cfi_def_cfa_offset 48
-; CHECK-NEXT: .cfi_offset %rbx, -48
-; CHECK-NEXT: .cfi_offset %r12, -40
-; CHECK-NEXT: .cfi_offset %r13, -32
-; CHECK-NEXT: .cfi_offset %r14, -24
-; CHECK-NEXT: .cfi_offset %r15, -16
-; CHECK-NEXT: movl $1, %r12d
-; CHECK-NEXT: movl $2, %r13d
-; CHECK-NEXT: movl $3, %r14d
-; CHECK-NEXT: movl $4, %r15d
-; CHECK-NEXT: movl $5, %edi
-; CHECK-NEXT: movl $6, %esi
-; CHECK-NEXT: movl $7, %edx
-; CHECK-NEXT: movl $8, %ecx
-; CHECK-NEXT: movl $9, %r8d
-; CHECK-NEXT: movl $10, %r9d
-; CHECK-NEXT: movl $11, %r11d
-; CHECK-NEXT: movl $12, %eax
-; CHECK-NEXT: callq callee_with_many_param@PLT
-; CHECK-NEXT: popq %rbx
-; CHECK-NEXT: .cfi_def_cfa_offset 40
-; CHECK-NEXT: popq %r12
-; CHECK-NEXT: .cfi_def_cfa_offset 32
-; CHECK-NEXT: popq %r13
-; CHECK-NEXT: .cfi_def_cfa_offset 24
-; CHECK-NEXT: popq %r14
-; CHECK-NEXT: .cfi_def_cfa_offset 16
-; CHECK-NEXT: popq %r15
-; CHECK-NEXT: .cfi_def_cfa_offset 8
-; CHECK-NEXT: retq
- %ret = call preserve_nonecc i64 @callee_with_many_param(i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9, i64 10, i64 11, i64 12)
+; X64-LABEL: caller3:
+; X64: # %bb.0:
+; X64-NEXT: pushq %r15
+; X64-NEXT: .cfi_def_cfa_offset 16
+; X64-NEXT: pushq %r14
+; X64-NEXT: .cfi_def_cfa_offset 24
+; X64-NEXT: pushq %r13
+; X64-NEXT: .cfi_def_cfa_offset 32
+; X64-NEXT: pushq %r12
+; X64-NEXT: .cfi_def_cfa_offset 40
+; X64-NEXT: pushq %rbx
+; X64-NEXT: .cfi_def_cfa_offset 48
+; X64-NEXT: .cfi_offset %rbx, -48
+; X64-NEXT: .cfi_offset %r12, -40
+; X64-NEXT: .cfi_offset %r13, -32
+; X64-NEXT: .cfi_offset %r14, -24
+; X64-NEXT: .cfi_offset %r15, -16
+; X64-NEXT: movl $1, %r12d
+; X64-NEXT: movl $2, %r13d
+; X64-NEXT: movl $3, %r14d
+; X64-NEXT: movl $4, %r15d
+; X64-NEXT: movl $5, %edi
+; X64-NEXT: movl $6, %esi
+; X64-NEXT: movl $7, %edx
+; X64-NEXT: movl $8, %ecx
+; X64-NEXT: movl $9, %r8d
+; X64-NEXT: movl $10, %r9d
+; X64-NEXT: movl $11, %r11d
+; X64-NEXT: movl $12, %eax
+; X64-NEXT: callq callee_with_12_params@PLT
+; X64-NEXT: popq %rbx
+; X64-NEXT: .cfi_def_cfa_offset 40
+; X64-NEXT: popq %r12
+; X64-NEXT: .cfi_def_cfa_offset 32
+; X64-NEXT: popq %r13
+; X64-NEXT: .cfi_def_cfa_offset 24
+; X64-NEXT: popq %r14
+; X64-NEXT: .cfi_def_cfa_offset 16
+; X64-NEXT: popq %r15
+; X64-NEXT: .cfi_def_cfa_offset 8
+; X64-NEXT: retq
+;
+; X86-LABEL: caller3:
+; X86: # %bb.0:
+; X86-NEXT: pushl %ebx
+; X86-NEXT: .cfi_def_cfa_offset 8
+; X86-NEXT: pushl %edi
+; X86-NEXT: .cfi_def_cfa_offset 12
+; X86-NEXT: pushl %esi
+; X86-NEXT: .cfi_def_cfa_offset 16
+; X86-NEXT: .cfi_offset %esi, -16
+; X86-NEXT: .cfi_offset %edi, -12
+; X86-NEXT: .cfi_offset %ebx, -8
+; X86-NEXT: movl $1, %edi
+; X86-NEXT: xorl %esi, %esi
+; X86-NEXT: movl $2, %edx
+; X86-NEXT: xorl %ecx, %ecx
+; X86-NEXT: movl $3, %eax
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $12
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $11
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $10
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $9
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $8
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $7
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $6
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $5
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $4
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: pushl $0
+; X86-NEXT: .cfi_adjust_cfa_offset 4
+; X86-NEXT: calll callee_with_12_params@PLT
+; X86-NEXT: addl $76, %esp
+; X86-NEXT: .cfi_adjust_cfa_offset -76
+; X86-NEXT: popl %esi
+; X86-NEXT: .cfi_def_cfa_offset 12
+; X86-NEXT: popl %edi
+; X86-NEXT: .cfi_def_cfa_offset 8
+; X86-NEXT: popl %ebx
+; X86-NEXT: .cfi_def_cfa_offset 4
+; X86-NEXT: retl
+ %ret = call preserve_nonecc i64 @callee_with_12_params(i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9, i64 10, i64 11, i64 12)
ret i64 %ret
}
+declare preserve_nonecc i32 @callee_with_4_params(i32 %a1, i32 %a2, i32 %a3, i32 %a4)
+define preserve_nonecc i32 @callee_with_5_params(i32 %a1, i32 %a2, i32 %a3, i32 %a4, i32 %a5) {
+; X64-LABEL: callee_with_5_params:
+; X64: # %bb.0:
+; X64-NEXT: pushq %rax
+; X64-NEXT: .cfi_def_cfa_offset 16
+; X64-NEXT: movl %r13d, %r12d
+; X64-NEXT: movl %r14d, %r13d
+; X64-NEXT: movl %r15d, %r14d
+; X64-NEXT: movl %edi, %r15d
+; X64-NEXT: callq callee_with_4_params@PLT
+; X64-NEXT: popq %rcx
...
[truncated]
|
efriedma-quic
left a comment
How does this interact with functions that need a base pointer (due to, for example, having both stack realignment and an alloca)? The X86ArgumentStackSlotRebase pass exists, but I don't think you're triggering it.
|
Thanks @efriedma-quic, I missed the base pointer. How's this for the new parameter list?
diff --git a/llvm/lib/Target/X86/X86CallingConv.td b/llvm/lib/Target/X86/X86CallingConv.td
index 6a8599a6c7c1..32eedcb9ca79 100644
--- a/llvm/lib/Target/X86/X86CallingConv.td
+++ b/llvm/lib/Target/X86/X86CallingConv.td
@@ -1052,8 +1052,12 @@ def CC_X86_64_Preserve_None : CallingConv<[
]>;
def CC_X86_32_Preserve_None : CallingConv<[
- // 32-bit variant of CC_X86_64_Preserve_None, above.
- CCIfType<[i32], CCAssignToReg<[EDI, ESI, EDX, ECX, EAX]>>,
+ // 32-bit variant of CC_X86_64_Preserve_None, above. Use everything except:
+ // - EBP frame pointer
+ // - ECX 'nest' parameter
+ // - ESI base pointer
+ // - EBX GOT pointer for PLT calls
+ CCIfType<[i32], CCAssignToReg<[EDI, EDX, EAX]>>,
CCDelegateTo<CC_X86_32_C>
]>; |
|
For "nest", we can forbid combining it with the preserve_none calling convention, probably, as long as we can detect it and error out. There's no reason anyone would combine the two. For the base pointer, you also need to worry about the callee-save register list: I don't think we have code to properly save/restore the base pointer if it gets clobbered by a call. |
|
Okay. Both of those concerns already apply to the existing 64-bit flavor, right? Would you prefer to see them addressed here, or in a dedicated follow-up?
Makes sense. Where do you think the best place to do this is?
Okay, this is just adding them to the existing callee-save lists, then. |
|
How about we add the base pointers to both callee-save lists here and leave the nest parameter as a future improvement for another PR? The calling convention is ABI-unstable, so we can always tweak it later. |
Yes.
I think similar sorts of diagnostics are in llvm/lib/Target/X86/X86ISelLoweringCall.cpp. (Grep for errorUnsupported.)
That's probably fine. |
…inter for arguments
efriedma-quic
left a comment
LGTM, but I'd like a second set of eyes on this.
Maybe @weiguozhi, the author of 64-bit preserve_none? |
Function X86FrameLowering::spillFPBP can do this for clobbered base pointer and frame pointer. |
So the base pointer doesn't need to be in the callee-save lists after all? Something like this:
diff --git a/llvm/lib/Target/X86/X86CallingConv.td b/llvm/lib/Target/X86/X86CallingConv.td
index 9e5aaeb44334..32eedcb9ca79 100644
--- a/llvm/lib/Target/X86/X86CallingConv.td
+++ b/llvm/lib/Target/X86/X86CallingConv.td
@@ -1197,8 +1197,8 @@ def CSR_64_AllRegs_AVX512 : CalleeSavedRegs<(sub (add CSR_64_MostRegs, RAX,
(sequence "ZMM%u", 0, 31),
(sequence "K%u", 0, 7)),
(sequence "XMM%u", 0, 15))>;
-def CSR_64_NoneRegs : CalleeSavedRegs<(add RBP, RBX)>;
-def CSR_32_NoneRegs : CalleeSavedRegs<(add EBP, ESI)>;
+def CSR_64_NoneRegs : CalleeSavedRegs<(add RBP)>;
+def CSR_32_NoneRegs : CalleeSavedRegs<(add EBP)>;
// Standard C + YMM6-15
def CSR_Win64_Intel_OCL_BI_AVX : CalleeSavedRegs<(add RBX, RBP, RDI, RSI, R12, |
Yes, I think so. |
|
If you can write a testcase that shows we properly spill the base pointer when necessary, fine. (I didn't realize there was already code to deal with this sort of thing.) |
|
Looks like it is indeed being spilled, even if it's not in the callee-saved regs:
define i8 @caller_with_base_pointer(i32 %n) alignstack(32) {
; X64-LABEL: caller_with_base_pointer:
; X64: # %bb.0:
; X64-NEXT: pushq %rbp
; X64-NEXT: .cfi_def_cfa_offset 16
; X64-NEXT: .cfi_offset %rbp, -16
; X64-NEXT: movq %rsp, %rbp
; X64-NEXT: .cfi_def_cfa_register %rbp
; X64-NEXT: pushq %r15
; X64-NEXT: pushq %r14
; X64-NEXT: pushq %r13
; X64-NEXT: pushq %r12
; X64-NEXT: pushq %rbx
; X64-NEXT: andq $-32, %rsp
; X64-NEXT: subq $64, %rsp
; X64-NEXT: movq %rsp, %rbx
; X64-NEXT: .cfi_offset %rbx, -56
; X64-NEXT: .cfi_offset %r12, -48
; X64-NEXT: .cfi_offset %r13, -40
; X64-NEXT: .cfi_offset %r14, -32
; X64-NEXT: .cfi_offset %r15, -24
; X64-NEXT: movq %rsp, %r12
; X64-NEXT: movq %r12, 32(%rbx) # 8-byte Spill
; X64-NEXT: movl %edi, %eax
; X64-NEXT: addq $15, %rax
; X64-NEXT: andq $-16, %rax
; X64-NEXT: subq %rax, %r12
; X64-NEXT: movq %r12, %rsp
; X64-NEXT: negq %rax
; X64-NEXT: movq %rax, 24(%rbx) # 8-byte Spill
; X64-NEXT: pushq %rbx
; X64-NEXT: pushq %rax
; X64-NEXT: callq callee@PLT
; X64-NEXT: addq $8, %rsp
; X64-NEXT: popq %rbx
; X64-NEXT: movq 32(%rbx), %rax # 8-byte Reload
; X64-NEXT: movq 24(%rbx), %rcx # 8-byte Reload
; X64-NEXT: movzbl (%rax,%rcx), %eax
; X64-NEXT: leaq -40(%rbp), %rsp
; X64-NEXT: popq %rbx
; X64-NEXT: popq %r12
; X64-NEXT: popq %r13
; X64-NEXT: popq %r14
; X64-NEXT: popq %r15
; X64-NEXT: popq %rbp
; X64-NEXT: .cfi_def_cfa %rsp, 8
; X64-NEXT: retq
;
; X86-LABEL: caller_with_base_pointer:
; X86: # %bb.0:
; X86-NEXT: pushl %ebp
; X86-NEXT: .cfi_def_cfa_offset 8
; X86-NEXT: .cfi_offset %ebp, -8
; X86-NEXT: movl %esp, %ebp
; X86-NEXT: .cfi_def_cfa_register %ebp
; X86-NEXT: pushl %ebx
; X86-NEXT: pushl %edi
; X86-NEXT: pushl %esi
; X86-NEXT: andl $-32, %esp
; X86-NEXT: subl $32, %esp
; X86-NEXT: movl %esp, %esi
; X86-NEXT: .cfi_offset %esi, -20
; X86-NEXT: .cfi_offset %edi, -16
; X86-NEXT: .cfi_offset %ebx, -12
; X86-NEXT: movl 8(%ebp), %eax
; X86-NEXT: movl %esp, %edi
; X86-NEXT: movl %edi, 8(%esi) # 4-byte Spill
; X86-NEXT: addl $3, %eax
; X86-NEXT: andl $-4, %eax
; X86-NEXT: subl %eax, %edi
; X86-NEXT: movl %edi, %esp
; X86-NEXT: negl %eax
; X86-NEXT: movl %eax, 4(%esi) # 4-byte Spill
; X86-NEXT: pushl %esi
; X86-NEXT: calll callee@PLT
; X86-NEXT: popl %esi
; X86-NEXT: movl 8(%esi), %eax # 4-byte Reload
; X86-NEXT: movl 4(%esi), %ecx # 4-byte Reload
; X86-NEXT: movzbl (%eax,%ecx), %eax
; X86-NEXT: leal -12(%ebp), %esp
; X86-NEXT: popl %esi
; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp
; X86-NEXT: .cfi_def_cfa %esp, 4
; X86-NEXT: retl
%a = alloca i8, i32 %n
call preserve_nonecc void @callee(ptr %a)
%r = load i8, ptr %a
ret i8 %r
}
I'll add this test. |
CPython's experimental JIT compiler supports i686-pc-windows-msvc, and uses Clang to generate machine code templates at interpreter build time. preserve_none and musttail are used for efficient dispatch between these templates, but preserve_none isn't available for this target (leading to worse performance).