Skip to content

[AMDGPU] No available SGPR for CSR spill stores #113782

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
shiltian opened this issue Oct 27, 2024 · 1 comment · Fixed by #127353
Closed

[AMDGPU] No available SGPR for CSR spill stores #113782

shiltian opened this issue Oct 27, 2024 · 1 comment · Fixed by #127353
Assignees

Comments

@shiltian
Copy link
Contributor

shiltian commented Oct 27, 2024

In emitCSRSpillStores, a caller-saved SGPR is required to save exec, which limits us to using SGPR0 through SGPR29. Currently, we assume that one is always available; however, this isn’t always the case, as SGPR0 to SGPR29 are also used for inreg argument passing.

https://github.com/llvm/llvm-project/blob/3b7d788ff22b8aa22634b927faf01da12bece496/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp#L966

If there are enough inreg arguments, no SGPR is available, leading to a compiler crash. The following example demonstrates this issue.

; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 %s -o -

declare hidden void @external_void_func_a15i32_inreg([16 x i32] inreg)

define void @test_call_external_void_func_a15i32_inreg([16 x i32] inreg %arg0) {
  call void @external_void_func_a15i32_inreg([16 x i32] inreg %arg0)
  ret void
}
@shiltian shiltian self-assigned this Oct 27, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 27, 2024

@llvm/issue-subscribers-backend-amdgpu

Author: Shilei Tian (shiltian)

If there are enough number of SGPR arguments, they will be spilled into VGPRs. However, we don’t reserve one SGPR to handle `exec` for the spilling.

@shiltian shiltian changed the title [AMDGPU] No available SGPR for SGPR argument spilling [AMDGPU] No available SGPR for CSR spill stores Nov 4, 2024
shiltian added a commit that referenced this issue Nov 8, 2024
We’re facing an issue (#113782) that is currently blocking #112403. However,
since #112403 involves extensive test changes, I’d prefer to land it as soon as
possible. This PR reorganizes the tests by moving test cases expected to fail
into a separate file. Additionally, it changes the `[15 x i32]` arguments to
`[13 x i32]` to bypass the issue.
shiltian added a commit that referenced this issue Nov 8, 2024
We’re facing an issue (#113782) that is currently blocking #112403. However,
since #112403 involves extensive test changes, I’d prefer to land it as soon as
possible. This PR reorganizes the tests by moving test cases expected to fail
into a separate file. Additionally, it changes the `[15 x i32]` arguments to
`[13 x i32]` to bypass the issue.
shiltian added a commit that referenced this issue Nov 8, 2024
We’re facing an issue (#113782) that is currently blocking #112403. However,
since #112403 involves extensive test changes, I’d prefer to land it as soon as
possible. This PR reorganizes the tests by moving test cases expected to fail
into a separate file. Additionally, it changes the `[15 x i32]` arguments to
`[13 x i32]` to bypass the issue.
shiltian added a commit that referenced this issue Nov 8, 2024
We’re facing an issue (#113782) that is currently blocking #112403. However,
since #112403 involves extensive test changes, I’d prefer to land it as soon as
possible. This PR reorganizes the tests by moving test cases expected to fail
into a separate file. Additionally, it changes the `[15 x i32]` arguments to
`[13 x i32]` to bypass the issue.
shiltian added a commit that referenced this issue Nov 11, 2024
… `inreg` argument passing

In `emitCSRSpillStores`, a caller-saved SGPR is required to save `exec`, which
limits us to using SGPR0 through SGPR29
Currently, we assume that one is always available; however, this isn’t always
the case, as SGPR0 to SGPR29 are also used for inreg argument passing.

This PR is trying to fix this issue by not using all caller-saved SGPRs for
`inreg` argument passing. This will make sure that we will always have at least
two SGPRs available when it needs CSR spilling.

Fixes #113782.
Groverkss pushed a commit to iree-org/llvm-project that referenced this issue Nov 15, 2024
We’re facing an issue (llvm#113782) that is currently blocking llvm#112403. However,
since llvm#112403 involves extensive test changes, I’d prefer to land it as soon as
possible. This PR reorganizes the tests by moving test cases expected to fail
into a separate file. Additionally, it changes the `[15 x i32]` arguments to
`[13 x i32]` to bypass the issue.
shiltian added a commit that referenced this issue Feb 15, 2025
This PR updates the SGPR layout to a striped caller/callee-saved design, similar
to the VGPR layout. The stripe width is set to 8.

Fixes #113782.
shiltian added a commit that referenced this issue Feb 17, 2025
This PR updates the SGPR layout to a striped caller/callee-saved design, similar
to the VGPR layout. The stripe width is set to 8.

Fixes #113782.
shiltian added a commit that referenced this issue Feb 18, 2025
This PR updates the SGPR layout to a striped caller/callee-saved design, similar
to the VGPR layout. The stripe width is set to 8.

Fixes #113782.
shiltian added a commit that referenced this issue Mar 8, 2025
This PR updates the SGPR layout to a striped caller/callee-saved design, similar
to the VGPR layout. The stripe width is set to 8.

Fixes #113782.
shiltian added a commit that referenced this issue Mar 8, 2025
This PR updates the SGPR layout to a striped caller/callee-saved design,
similar
to the VGPR layout.

To ensure that s30-s31 (return address), s32 (stack pointer), s33 (frame
pointer), and s34 (base pointer) remain callee-saved, the striped layout
starts
from s40, with a stripe width of 8. The last stripe is 10 wide instead
of 8 to
avoid ending with a 2-wide stripe.

Fixes #113782.
searlmc1 pushed a commit to ROCm/llvm-project that referenced this issue Mar 20, 2025
This PR updates the SGPR layout to a striped caller/callee-saved design,
similar to the VGPR layout.

To ensure that s30-s31 (return address), s32 (stack pointer), s33 (frame
pointer), and s34 (base pointer) remain callee-saved, the striped layout
starts from s40, with a stripe width of 8. The last stripe is 10 wide instead
of 8 to avoid ending with a 2-wide stripe.

Fixes llvm#113782.

(cherry picked from commit a779af3)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment