Skip to content

[AMDGPU][Verifier] Mark calls to entry functions as invalid in the IR verifier #134910

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 11, 2025

Conversation

shiltian
Copy link
Contributor

@shiltian shiltian commented Apr 8, 2025

For AMDGPU, calls to entry functions are invalid. Previously, due to certain
limitations, this restriction was not enforced by the IR verifier. These
limitations have now been resolved, enabling us to enforce this check.

Copy link
Contributor Author

shiltian commented Apr 8, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@shiltian shiltian requested a review from arsenm April 8, 2025 19:18
@llvmbot
Copy link
Member

llvmbot commented Apr 8, 2025

@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-llvm-ir

Author: Shilei Tian (shiltian)

Changes

For AMDGPU, calls to entry functions are invalid. Previously, due to certain
limitations, this restriction was not enforced by the IR verifier. These
limitations have now been resolved, enabling us to enforce this check.

Adding target-dependent checks directly into the IR verifier is not ideal.
However, a cleaner solution, such as a dedicated target-dependent IR verifier,
is underway (e.g., #123609). Once that
or similar code is merged, we can move this check accordingly.


Full diff: https://github.com/llvm/llvm-project/pull/134910.diff

4 Files Affected:

  • (modified) llvm/lib/IR/Verifier.cpp (+20-1)
  • (modified) llvm/test/CodeGen/AMDGPU/attributor-flatscratchinit.ll (-15)
  • (removed) llvm/test/CodeGen/AMDGPU/call-to-kernel-undefined.ll (-20)
  • (modified) llvm/test/CodeGen/AMDGPU/call-to-kernel.ll (+95-8)
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 7423e746dfa9a..a8d2f817efe06 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -3765,9 +3765,28 @@ void Verifier::visitCallBase(CallBase &Call) {
           "Return type cannot be x86_amx for indirect call!");
   }
 
-  if (Function *F = Call.getCalledFunction())
+  if (Function *F = Call.getCalledFunction()) {
     if (Intrinsic::ID ID = (Intrinsic::ID)F->getIntrinsicID())
       visitIntrinsicCall(ID, Call);
+    // TODO: Move this to target dependent IR verifier once we have it.
+    auto IsAMDGPUEntryFunctionCC = [](CallingConv::ID CC) -> bool {
+      switch (CC) {
+      case CallingConv::AMDGPU_KERNEL:
+      case CallingConv::AMDGPU_VS:
+      case CallingConv::AMDGPU_GS:
+      case CallingConv::AMDGPU_PS:
+      case CallingConv::AMDGPU_CS:
+      case CallingConv::AMDGPU_ES:
+      case CallingConv::AMDGPU_HS:
+      case CallingConv::AMDGPU_LS:
+        return true;
+      default:
+        return false;
+      }
+    };
+    Check(!IsAMDGPUEntryFunctionCC(F->getCallingConv()),
+          "Call to amdgpu entry function is not allowed");
+  }
 
   // Verify that a callsite has at most one "deopt", at most one "funclet", at
   // most one "gc-transition", at most one "cfguardtarget", at most one
diff --git a/llvm/test/CodeGen/AMDGPU/attributor-flatscratchinit.ll b/llvm/test/CodeGen/AMDGPU/attributor-flatscratchinit.ll
index fc61aa5f981dd..a3d6c3b729523 100644
--- a/llvm/test/CodeGen/AMDGPU/attributor-flatscratchinit.ll
+++ b/llvm/test/CodeGen/AMDGPU/attributor-flatscratchinit.ll
@@ -849,21 +849,6 @@ define amdgpu_kernel void @calls_intrin_ascast_cc_kernel(ptr addrspace(3) %ptr)
   ret void
 }
 
-define amdgpu_kernel void @call_calls_intrin_ascast_cc_kernel(ptr addrspace(3) %ptr) {
-; GFX9-LABEL: define amdgpu_kernel void @call_calls_intrin_ascast_cc_kernel(
-; GFX9-SAME: ptr addrspace(3) [[PTR:%.*]]) #[[ATTR1]] {
-; GFX9-NEXT:    call void @calls_intrin_ascast_cc_kernel(ptr addrspace(3) [[PTR]])
-; GFX9-NEXT:    ret void
-;
-; GFX10-LABEL: define amdgpu_kernel void @call_calls_intrin_ascast_cc_kernel(
-; GFX10-SAME: ptr addrspace(3) [[PTR:%.*]]) #[[ATTR1]] {
-; GFX10-NEXT:    call void @calls_intrin_ascast_cc_kernel(ptr addrspace(3) [[PTR]])
-; GFX10-NEXT:    ret void
-;
-  call void @calls_intrin_ascast_cc_kernel(ptr addrspace(3) %ptr)
-  ret void
-}
-
 define amdgpu_kernel void @with_inline_asm() {
 ; GFX9-LABEL: define amdgpu_kernel void @with_inline_asm(
 ; GFX9-SAME: ) #[[ATTR3]] {
diff --git a/llvm/test/CodeGen/AMDGPU/call-to-kernel-undefined.ll b/llvm/test/CodeGen/AMDGPU/call-to-kernel-undefined.ll
deleted file mode 100644
index da7385475088b..0000000000000
--- a/llvm/test/CodeGen/AMDGPU/call-to-kernel-undefined.ll
+++ /dev/null
@@ -1,20 +0,0 @@
-; RUN: not --crash llc -mtriple=amdgcn -mcpu=tahiti -verify-machineinstrs -o /dev/null %s 2>&1 | FileCheck %s
-
-; FIXME: It should be invalid IR to have a call to a kernel, but this
-; is currently relied on, but should be eliminated before codegen.
-define amdgpu_kernel void @callee_kernel(ptr addrspace(1) %out) #0 {
-entry:
-  store volatile i32 0, ptr addrspace(1) %out
-  ret void
-}
-
-; Make sure there's no crash when the callsite calling convention
-; doesn't match.
-; CHECK: LLVM ERROR: invalid call to entry function
-define amdgpu_kernel void @caller_kernel(ptr addrspace(1) %out) #0 {
-entry:
-  call void @callee_kernel(ptr addrspace(1) %out)
-  ret void
-}
-
-attributes #0 = { nounwind noinline }
diff --git a/llvm/test/CodeGen/AMDGPU/call-to-kernel.ll b/llvm/test/CodeGen/AMDGPU/call-to-kernel.ll
index 1f4f6471fcdba..7463d23a81af7 100644
--- a/llvm/test/CodeGen/AMDGPU/call-to-kernel.ll
+++ b/llvm/test/CodeGen/AMDGPU/call-to-kernel.ll
@@ -1,18 +1,105 @@
-; RUN: not --crash llc -mtriple=amdgcn -mcpu=tahiti -verify-machineinstrs -o /dev/null %s 2>&1 | FileCheck %s
+; RUN: not llvm-as %s -o /dev/null 2>&1 | FileCheck %s
 
-; FIXME: It should be invalid IR to have a call to a kernel, but this
-; is currently relied on, but should be eliminated before codegen.
-define amdgpu_kernel void @callee_kernel(ptr addrspace(1) %out) #0 {
+; CHECK: Call to amdgpu entry function is not allowed
+define amdgpu_kernel void @callee_kernel(ptr addrspace(1) %out) {
 entry:
   store volatile i32 0, ptr addrspace(1) %out
   ret void
 }
 
-; CHECK: LLVM ERROR: Unsupported calling convention for call
-define amdgpu_kernel void @caller_kernel(ptr addrspace(1) %out) #0 {
+define amdgpu_kernel void @caller_kernel(ptr addrspace(1) %out) {
 entry:
-  call amdgpu_kernel void @callee_kernel(ptr addrspace(1) %out)
+  call void @callee_kernel(ptr addrspace(1) %out)
   ret void
 }
 
-attributes #0 = { nounwind noinline }
+; CHECK: Call to amdgpu entry function is not allowed
+define amdgpu_vs void @callee_vs(ptr addrspace(1) inreg %out) {
+entry:
+  store volatile i32 0, ptr addrspace(1) %out
+  ret void
+}
+
+define amdgpu_vs void @caller_vs(ptr addrspace(1) inreg %out) {
+entry:
+  call void @callee_vs(ptr addrspace(1) %out)
+  ret void
+}
+
+; CHECK: Call to amdgpu entry function is not allowed
+define amdgpu_gs void @callee_gs(ptr addrspace(1) %out) {
+entry:
+  store volatile i32 0, ptr addrspace(1) %out
+  ret void
+}
+
+define amdgpu_gs void @caller_gs(ptr addrspace(1) %out) {
+entry:
+  call void @callee_gs(ptr addrspace(1) %out)
+  ret void
+}
+
+; CHECK: Call to amdgpu entry function is not allowed
+define amdgpu_ps void @callee_ps(ptr addrspace(1) %out) {
+entry:
+  store volatile i32 0, ptr addrspace(1) %out
+  ret void
+}
+
+define amdgpu_ps void @caller_ps(ptr addrspace(1) %out) {
+entry:
+  call void @callee_ps(ptr addrspace(1) %out)
+  ret void
+}
+
+; CHECK: Call to amdgpu entry function is not allowed
+define amdgpu_cs void @callee_cs(ptr addrspace(1) %out) {
+entry:
+  store volatile i32 0, ptr addrspace(1) %out
+  ret void
+}
+
+define amdgpu_cs void @caller_cs(ptr addrspace(1) %out) {
+entry:
+  call void @callee_cs(ptr addrspace(1) %out)
+  ret void
+}
+
+; CHECK: Call to amdgpu entry function is not allowed
+define amdgpu_es void @callee_es(ptr addrspace(1) %out) {
+entry:
+  store volatile i32 0, ptr addrspace(1) %out
+  ret void
+}
+
+define amdgpu_es void @caller_es(ptr addrspace(1) %out) {
+entry:
+  call void @callee_es(ptr addrspace(1) %out)
+  ret void
+}
+
+; CHECK: Call to amdgpu entry function is not allowed
+define amdgpu_hs void @callee_hs(ptr addrspace(1) %out) {
+entry:
+  store volatile i32 0, ptr addrspace(1) %out
+  ret void
+}
+
+define amdgpu_hs void @caller_hs(ptr addrspace(1) %out) {
+entry:
+  call void @callee_hs(ptr addrspace(1) %out)
+  ret void
+}
+
+; CHECK: Call to amdgpu entry function is not allowed
+define amdgpu_ls void @callee_ls(ptr addrspace(1) %out) {
+entry:
+  store volatile i32 0, ptr addrspace(1) %out
+  ret void
+}
+
+define amdgpu_ls void @caller_ls(ptr addrspace(1) %out) {
+entry:
+  call void @callee_ls(ptr addrspace(1) %out)
+  ret void
+}

@shiltian shiltian requested review from jayfoad and jofrn April 8, 2025 19:18
@shiltian shiltian force-pushed the users/shiltian/dont-allow-call-to-amdgpu-kernel branch from dc71454 to 3464b06 Compare April 9, 2025 04:34
@shiltian shiltian force-pushed the users/shiltian/dont-allow-call-to-amdgpu-kernel branch from 3464b06 to fab4fdd Compare April 9, 2025 14:02
case CallingConv::AMDGPU_LS:
case CallingConv::AMDGPU_PS:
case CallingConv::AMDGPU_VS:
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also add CallingConv::SPIR_KERNEL here since its not supported as well : https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp#L1137

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are "valid" test case calling to SPIR-V kernel directly. I thought that is allowed by the SPIR-V spec?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should not be allowed. Also SPIR_KERNEL actually came from spir classic and SPIR-V just started picking it up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"– No function can be targeted by both an OpEntryPoint instruction and an OpFunctionCall instruction."

Copy link
Contributor

@MrSidims MrSidims Apr 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I lean to disagree. SPIR_KERNEL can be included conditionally. If triple is spirv, then sure, validation rule makes sense. But for spir triple it does look like a breaking change, as there is no such restriction in OpenCL or SPIR specification. Tagging @svenvh @michalpaszkowski @VyacheslavLevytskyy

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But for spir triple it does look like a breaking change, as there is no such restriction in OpenCL or SPIR specification

+1, this is valid in these conditions

case CallingConv::AMDGPU_LS:
case CallingConv::AMDGPU_PS:
case CallingConv::AMDGPU_VS:
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should not be allowed. Also SPIR_KERNEL actually came from spir classic and SPIR-V just started picking it up.

case CallingConv::AMDGPU_LS:
case CallingConv::AMDGPU_PS:
case CallingConv::AMDGPU_VS:
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"– No function can be targeted by both an OpEntryPoint instruction and an OpFunctionCall instruction."

@shiltian shiltian force-pushed the users/shiltian/dont-allow-call-to-amdgpu-kernel branch 2 times, most recently from be71146 to 8519821 Compare April 10, 2025 17:16
… verifier

For AMDGPU, calls to entry functions are invalid. Previously, due to certain
limitations, this restriction was not enforced by the IR verifier. These
limitations have now been resolved, enabling us to enforce this check.

Adding target-dependent checks directly into the IR verifier is not ideal.
However, a cleaner solution, such as a dedicated target-dependent IR verifier,
is underway (e.g., #123609). Once that
or similar code is merged, we can move this check accordingly.
@shiltian shiltian force-pushed the users/shiltian/dont-allow-call-to-amdgpu-kernel branch from 8519821 to 86407ad Compare April 11, 2025 17:14
@shiltian shiltian merged commit a45b133 into main Apr 11, 2025
11 checks passed
@shiltian shiltian deleted the users/shiltian/dont-allow-call-to-amdgpu-kernel branch April 11, 2025 19:32
bcardosolopes added a commit to bcardosolopes/llvm-project that referenced this pull request Apr 11, 2025
* origin/main: (287 commits)
  [Sema] On Windows, silence erroneous warning when building with MSVC
  [lldb][lldbp-dap] On Windoows, silence warnings when building with MSVC
  [lldb] Fix erroneous return value
  [compiler-rt] On Windows, silence warning when building with Clang ToT
  [clang][unittests] On Windows, silence warning when building with MSVC
  [lldb] On Windows, silence warning when building with Clang ToT
  [CIR] Make LLVM & OGCG variables match the same pattern (llvm#135427)
  [mlir][SMT] upstream `SMT` dialect (llvm#131480)
  [clang] fix serialization for SubstNonTypeTemplateParmPackExpr (llvm#135428)
  [flang][openacc] Allow if_present multiple times on host_data and update (llvm#135422)
  [flang][openacc] Allow finalize clause on exit data more than once (llvm#135415)
  [flang] IEEE_SCALB and SCALE - kind=2, kind=3 (llvm#135374)
  [-Wunsafe-buffer-usage] Add findUnsafePointers (llvm#135421)
  [compiler-rt][sanitizer] add Haiku support (llvm#134772)
  [cpp23] Remove usage of std::aligned_union<> in llvm (llvm#135146)
  [mlir][tosa] Add error_if checks for Mul Op (llvm#135075)
  [VPlan] Merge cases using getResultType in inferScalarType (NFC).
  [flang][runtime] Fix recently broken big-endian formatted integer input (llvm#135417)
  [AMDGPU][Verifier] Mark calls to entry functions as invalid in the IR verifier (llvm#134910)
  [llvm][Hexagon] Promote operand v2i1 to v2i32 (llvm#135409)
  ...
Copy link
Contributor

@MrSidims MrSidims left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch breaks tests for non AMDGPU code, see https://github.com/KhronosGroup/SPIRV-LLVM-Translator/actions/runs/14434681816/job/40473808013

From OpenCL POW it's legal for a kernel to call another kernel.

@MrSidims MrSidims added the SPIR-V SPIR-V language support label Apr 14, 2025
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
@arsenm
Copy link
Contributor

arsenm commented Apr 17, 2025

This patch breaks tests for non AMDGPU code, see https://github.com/KhronosGroup/SPIRV-LLVM-Translator/actions/runs/14434681816/job/40473808013

From OpenCL POW it's legal for a kernel to call another kernel.

Which is entirely a clang problem and now correctly implemented in 642481a

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU llvm:ir SPIR-V SPIR-V language support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants