-
Notifications
You must be signed in to change notification settings - Fork 14.6k
Update preserve_most to treat XMM registers like C #73866
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-x86 @llvm/pr-subscribers-llvm-ir Author: Seth Brenith (sethbrenith) ChangesAs scottmcm described, the This is a breaking change, but the documentation indicates that Full diff: https://github.com/llvm/llvm-project/pull/73866.diff 4 Files Affected:
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index e448c5ed5c5d947..29257933cc62b5b 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -362,8 +362,9 @@ added in the future:
- On X86-64 the callee preserves all general purpose registers, except for
R11 and return registers, if any. R11 can be used as a scratch register.
- Floating-point registers (XMMs/YMMs) are not preserved and need to be
- saved by the caller.
+ The treatment of floating-point registers (XMMs/YMMs) matches the OS's C
+ calling convention: on most platforms, they are not preserved and need to
+ be saved by the caller, but on Windows, xmm6-xmm15 are preserved.
- On AArch64 the callee preserve all general purpose registers, except X0-X8
and X16-X18.
diff --git a/llvm/lib/Target/X86/X86CallingConv.td b/llvm/lib/Target/X86/X86CallingConv.td
index 27e4fe0cb7a02fa..16014d6a2f60241 100644
--- a/llvm/lib/Target/X86/X86CallingConv.td
+++ b/llvm/lib/Target/X86/X86CallingConv.td
@@ -1151,6 +1151,9 @@ def CSR_64_CXX_TLS_Darwin_ViaCopy : CalleeSavedRegs<(sub CSR_64_TLS_Darwin, RBP)
def CSR_64_RT_MostRegs : CalleeSavedRegs<(add CSR_64, RAX, RCX, RDX, RSI, RDI,
R8, R9, R10)>;
+def CSR_Win64_RT_MostRegs : CalleeSavedRegs<(add CSR_64_RT_MostRegs,
+ (sequence "XMM%u", 6, 15))>;
+
// All registers - except r11 and return registers.
def CSR_64_RT_AllRegs : CalleeSavedRegs<(add CSR_64_RT_MostRegs,
(sequence "XMM%u", 0, 15))>;
diff --git a/llvm/lib/Target/X86/X86RegisterInfo.cpp b/llvm/lib/Target/X86/X86RegisterInfo.cpp
index 379a9d448a963ca..8e162d40c2b0efa 100644
--- a/llvm/lib/Target/X86/X86RegisterInfo.cpp
+++ b/llvm/lib/Target/X86/X86RegisterInfo.cpp
@@ -310,7 +310,8 @@ X86RegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
return CSR_64_AllRegs_AVX_SaveList;
return CSR_64_AllRegs_SaveList;
case CallingConv::PreserveMost:
- return CSR_64_RT_MostRegs_SaveList;
+ return IsWin64 ? CSR_Win64_RT_MostRegs_SaveList :
+ CSR_64_RT_MostRegs_SaveList;
case CallingConv::PreserveAll:
if (HasAVX)
return CSR_64_RT_AllRegs_AVX_SaveList;
@@ -431,7 +432,7 @@ X86RegisterInfo::getCallPreservedMask(const MachineFunction &MF,
return CSR_64_AllRegs_AVX_RegMask;
return CSR_64_AllRegs_RegMask;
case CallingConv::PreserveMost:
- return CSR_64_RT_MostRegs_RegMask;
+ return IsWin64 ? CSR_Win64_RT_MostRegs_RegMask : CSR_64_RT_MostRegs_RegMask;
case CallingConv::PreserveAll:
if (HasAVX)
return CSR_64_RT_AllRegs_AVX_RegMask;
diff --git a/llvm/test/CodeGen/X86/preserve_mostcc64_win.ll b/llvm/test/CodeGen/X86/preserve_mostcc64_win.ll
new file mode 100644
index 000000000000000..7042a2e77da1b6a
--- /dev/null
+++ b/llvm/test/CodeGen/X86/preserve_mostcc64_win.ll
@@ -0,0 +1,91 @@
+; RUN: sed -e "s/RETTYPE/void/;s/RETVAL//" %s | llc -mtriple=x86_64-win32 -mcpu=corei7 | FileCheck --check-prefixes=ALL,VOID %s
+; RUN: sed -e "s/RETTYPE/i32/;s/RETVAL/undef/" %s | llc -mtriple=x86_64-win32 -mcpu=corei7 | FileCheck --check-prefixes=ALL,INT %s
+; RUN: sed -e "s/RETTYPE/\{i64\,i64\}/;s/RETVAL/undef/" %s | llc -mtriple=x86_64-win32 -mcpu=corei7 | FileCheck --check-prefixes=ALL,INT128 %s
+
+; Every GPR should be saved, except r11 and return registers.
+; XMM registers 6-15 should also be saved.
+define preserve_mostcc RETTYPE @preserve_mostcc1(i64, i64, double, double) nounwind {
+entry:
+;ALL-LABEL: preserve_mostcc1
+;ALL: pushq %r10
+;ALL-NEXT: pushq %r9
+;ALL-NEXT: pushq %r8
+;ALL-NEXT: pushq %rdi
+;ALL-NEXT: pushq %rsi
+;VOID-NEXT: pushq %rdx
+;INT-NEXT: pushq %rdx
+;INT128-NOT: pushq %rdx
+;ALL-NEXT: pushq %rcx
+;VOID-NEXT: pushq %rax
+;INT-NOT: pushq %rax
+;INT128-NOT: pushq %rax
+;ALL-NEXT: pushq %rbp
+;ALL-NEXT: pushq %r15
+;ALL-NEXT: pushq %r14
+;ALL-NEXT: pushq %r13
+;ALL-NEXT: pushq %r12
+;ALL-NEXT: pushq %rbx
+;ALL: movaps %xmm15
+;ALL-NEXT: movaps %xmm14
+;ALL-NEXT: movaps %xmm13
+;ALL-NEXT: movaps %xmm12
+;ALL-NEXT: movaps %xmm11
+;ALL-NEXT: movaps %xmm10
+;ALL-NEXT: movaps %xmm9
+;ALL-NEXT: movaps %xmm8
+;ALL-NEXT: movaps %xmm7
+;ALL-NEXT: movaps %xmm6
+;ALL-NOT: movaps %xmm5
+;ALL-NOT: movaps %xmm4
+;ALL-NOT: movaps %xmm3
+;ALL-NOT: movaps %xmm2
+;ALL-NOT: movaps %xmm1
+;ALL-NOT: movaps %xmm0
+;ALL-NOT: movaps {{.*}} %xmm0
+;ALL-NOT: movaps {{.*}} %xmm1
+;ALL-NOT: movaps {{.*}} %xmm2
+;ALL-NOT: movaps {{.*}} %xmm3
+;ALL-NOT: movaps {{.*}} %xmm4
+;ALL-NOT: movaps {{.*}} %xmm5
+;ALL: movaps {{.*}} %xmm6
+;ALL-NEXT: movaps {{.*}} %xmm7
+;ALL-NEXT: movaps {{.*}} %xmm8
+;ALL-NEXT: movaps {{.*}} %xmm9
+;ALL-NEXT: movaps {{.*}} %xmm10
+;ALL-NEXT: movaps {{.*}} %xmm11
+;ALL-NEXT: movaps {{.*}} %xmm12
+;ALL-NEXT: movaps {{.*}} %xmm13
+;ALL-NEXT: movaps {{.*}} %xmm14
+;ALL-NEXT: movaps {{.*}} %xmm15
+;ALL: popq %rbx
+;ALL-NEXT: popq %r12
+;ALL-NEXT: popq %r13
+;ALL-NEXT: popq %r14
+;ALL-NEXT: popq %r15
+;ALL-NEXT: popq %rbp
+;VOID-NEXT: popq %rax
+;INT-NOT: popq %rax
+;INT128-NOT: popq %rax
+;ALL-NEXT: popq %rcx
+;VOID-NEXT: popq %rdx
+;INT-NEXT: popq %rdx
+;INT128-NOT: popq %rdx
+;ALL-NEXT: popq %rsi
+;ALL-NEXT: popq %rdi
+;ALL-NEXT: popq %r8
+;ALL-NEXT: popq %r9
+;ALL-NEXT: popq %r10
+ call void asm sideeffect "", "~{rax},~{rbx},~{rcx},~{rdx},~{rsi},~{rdi},~{r8},~{r9},~{r10},~{r11},~{r12},~{r13},~{r14},~{r15},~{rbp},~{xmm0},~{xmm1},~{xmm2},~{xmm3},~{xmm4},~{xmm5},~{xmm6},~{xmm7},~{xmm8},~{xmm9},~{xmm10},~{xmm11},~{xmm12},~{xmm13},~{xmm14},~{xmm15}"()
+ ret RETTYPE RETVAL
+}
+
+; Make sure XMMs are not saved before the call
+declare preserve_mostcc RETTYPE @foo(i64, i64, double, double)
+define void @preserve_mostcc2() nounwind {
+entry:
+;ALL-LABEL: preserve_mostcc2
+;ALL-NOT: movaps
+;ALL-NOT: {{.*xmm[0-1,4-9].*}}
+ call preserve_mostcc RETTYPE @foo(i64 1, i64 2, double 3.0, double 4.0)
+ ret void
+}
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
2383f93
to
4a3a38c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Thanks for reviewing, @rnk! Would you please advise on the next steps, since I'm unfamiliar with the code review process for LLVM? A few specific questions:
|
I'm comfortable approving and merging this pull request, so I went ahead and did that. For LLVM committers, I would generally prefer to let them be the ones to push the merge button, so I didn't do that earlier. |
As scottmcm described, the
preserve_most
calling convention, as currently implemented, is a bad fit for Windows on x64. The intent ofpreserve_most
is "to make the code in the caller as unintrusive as possible", butpreserve_most
causes the caller to spill and restore ten SIMD registers. It would be preferable to makepreserve_most
treat the XMM registers however the C calling convention does on the target operating system.This is a breaking change, but the documentation indicates that
preserve_most
is still experimental, so I believe that ABI compatibility is not yet a requirement.