Commit f49d280

[WIP][AMDGPU] Change CC_AMDGPU_Func to only use SGPR0 to SGPR27 for inreg argument passing
In `emitCSRSpillStores`, a caller-saved SGPR is required to save `exec`, which limits the choice to SGPR0 through SGPR29. Currently, we assume that one is always available; however, this isn't always the case, because SGPR0 to SGPR29 are also used for `inreg` argument passing. This PR fixes the issue by no longer using all caller-saved SGPRs for `inreg` argument passing: arguments now stop at SGPR27, so at least two SGPRs (SGPR28 and SGPR29) are always available when CSR spilling is needed. Fixes #113782.
1 parent 0b07aae commit f49d280

2 files changed: +3 -3 lines changed


llvm/docs/AMDGPUUsage.rst

Lines changed: 2 additions & 2 deletions
@@ -16545,7 +16545,7 @@ On entry to a function:
 :ref:`amdgpu-amdhsa-kernel-prolog-m0`.
 4. The EXEC register is set to the lanes active on entry to the function.
 5. MODE register: *TBD*
-6. VGPR0-31 and SGPR4-29 are used to pass function input arguments as described
+6. VGPR0-31 and SGPR4-27 are used to pass function input arguments as described
 below.
 7. SGPR30-31 return address (RA). The code address that the function must
 return to when it completes. The value is undefined if the function is *no
@@ -16796,7 +16796,7 @@ The input and result arguments are assigned in order in the following manner:
 How are overly aligned structures allocated on the stack?

 * SGPR arguments are assigned to consecutive SGPRs starting at SGPR0 up to
-SGPR29.
+SGPR27.

 If there are more arguments than will fit in these registers, the remaining
 arguments are allocated on the stack in order on naturally aligned

llvm/lib/Target/AMDGPU/AMDGPUCallingConv.td

Lines changed: 1 addition & 1 deletion
@@ -127,7 +127,7 @@ def CC_AMDGPU_Func : CallingConv<[
 CCIfType<[i8, i16], CCIfExtend<CCPromoteToType<i32>>>,

 CCIfInReg<CCIfType<[f32, i32, f16, i16, v2i16, v2f16, bf16, v2bf16] , CCAssignToReg<
-!foreach(i, !range(0, 30), !cast<Register>("SGPR"#i)) // SGPR0-29
+!foreach(i, !range(0, 28), !cast<Register>("SGPR"#i)) // SGPR0-27
 >>>,

 CCIfType<[i32, f32, i16, f16, v2i16, v2f16, i1, bf16, v2bf16], CCAssignToReg<
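The effect of changing the `!range` upper bound can be sketched outside TableGen. The Python model below is only an illustration, not LLVM code: `inreg_sgprs`, `never_used_for_args`, and `CALLER_SAVED` are hypothetical names, and the SGPR0 to SGPR29 candidate set for saving `exec` is taken from the commit message. It relies on `!range(0, N)` being half-open, as the `// SGPR0-29` comment on `!range(0, 30)` indicates.

```python
# Illustrative model of:
#   !foreach(i, !range(0, N), !cast<Register>("SGPR"#i))
# !range(0, N) is half-open, like Python's range(0, N).

def inreg_sgprs(upper_bound: int) -> list[str]:
    """Registers the calling convention may assign to inreg arguments."""
    return [f"SGPR{i}" for i in range(0, upper_bound)]

# Assumption from the commit message: SGPR0..SGPR29 are the
# candidates for saving exec during CSR spilling.
CALLER_SAVED = [f"SGPR{i}" for i in range(0, 30)]

def never_used_for_args(upper_bound: int) -> list[str]:
    """Candidates that inreg arguments can never occupy."""
    used = set(inreg_sgprs(upper_bound))
    return [reg for reg in CALLER_SAVED if reg not in used]

if __name__ == "__main__":
    print(never_used_for_args(30))  # old bound: nothing is guaranteed free
    print(never_used_for_args(28))  # new bound: SGPR28 and SGPR29 stay free
```

With the old bound of 30, a function with enough `inreg` arguments can occupy every candidate register; with the new bound of 28, two of them are always left over.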
