Description
The following LLVM IR:
; reduced.ll
target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9"
target triple = "amdgcn-amd-amdhsa"
define ptr addrspace(1) @__ockl_dm_alloc(i1 %0) {
__ockl_wfany_i32.exit:
  %1 = tail call i32 @llvm.amdgcn.ballot.i32(i1 %0)
  br label %2
2: ; preds = %__ockl_wfany_i32.exit
  %3 = tail call i32 @llvm.cttz.i32(i32 %1, i1 false)
  ret ptr addrspace(1) null
}
; Function Attrs: convergent nocallback nofree nounwind willreturn memory(none)
declare i32 @llvm.amdgcn.ballot.i32(i1) #0
; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare i32 @llvm.cttz.i32(i32, i1 immarg) #1
attributes #0 = { convergent nocallback nofree nounwind willreturn memory(none) }
attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
Compiled with:
llc -O0 reduced.ll
yields ICE:
LLVM ERROR: Cannot select: 0x8fc9e20: i32 = SETCC 0x8fc9a30, Constant:i32<0>, setne:ch
0x8fc9a30: i32 = and # D:1 0x8fc9800, Constant:i32<1>
0x8fc9800: i32,ch = CopyFromReg # D:1 0x8f54110, Register:i32 %8
0x8fc9790: i32 = Register %8
0x8fc99c0: i32 = Constant<1>
0x8fc9d40: i32 = Constant<0>
In function: __ockl_dm_alloc
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /opt/compiler-explorer/clang-trunk/bin/llc -o /app/output.s -x86-asm-syntax=intel -O0 <source>
1. Running pass 'CallGraph Pass Manager' on module '<source>'.
2. Running pass 'AMDGPU DAG->DAG Pattern Instruction Selection' on function '@__ockl_dm_alloc'
. . .
Note that the ICE occurs at -O0 (https://llc.godbolt.org/z/xhG4qrYWv) as well as at -O1 (https://llc.godbolt.org/z/WeMvT17bb).
However, it does not occur at -O2 (https://llc.godbolt.org/z/WeMvT17bb).
We can see that the basic block (BB) __ockl_wfany_i32.exit is terminated only by an unconditional branch to its successor, BB 2. If I pretend I'm SimplifyCFG and fold these BBs myself, then the ICE doesn't occur at -O0 or -O1 either: https://llc.godbolt.org/z/qTxa444sK
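For reference, a minimal sketch of what that hand-fold looks like, assuming it amounts to merging block 2 into __ockl_wfany_i32.exit. This was written by hand from reduced.ll rather than copied from the Compiler Explorer link, so the linked IR may differ in detail. The file name folded.ll is made up for illustration, the long target datalayout line should be carried over unchanged from reduced.ll (it is omitted here only for brevity), and the attribute groups are dropped from the declarations.

```llvm
; folded.ll (hypothetical name): reduced.ll with the two basic blocks merged by hand.
; Assumption: copy the "target datalayout" line over from reduced.ll unchanged.
target triple = "amdgcn-amd-amdhsa"

define ptr addrspace(1) @__ockl_dm_alloc(i1 %0) {
__ockl_wfany_i32.exit:
  %1 = tail call i32 @llvm.amdgcn.ballot.i32(i1 %0)
  ; The "br label %2" and the block label "2:" are gone, so the cttz result
  ; is renumbered from %3 to %2 to keep unnamed values sequential.
  %2 = tail call i32 @llvm.cttz.i32(i32 %1, i1 false)
  ret ptr addrspace(1) null
}

declare i32 @llvm.amdgcn.ballot.i32(i1)
declare i32 @llvm.cttz.i32(i32, i1 immarg)
```

The only difference from reduced.ll is that the ballot and its cttz user now sit in the same basic block.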
This is reduced LLVM IR (using llvm-reduce). For context, the original LLVM IR was produced from the following (adding -v -save-temps):
$ cat > my_test.cpp
int main() {
  int* ptr;
  #pragma omp target
  {
    ptr = new int();
  }
}
$ CC -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx90a my_test.cpp
LLVM ERROR: Cannot select: t26: i32 = SETCC t25, Constant:i32<0>, setne:ch
t25: i32 = zero_extend # D:1 t2
t2: i1,ch = CopyFromReg # D:1 t0, Register:i1 %454
t1: i1 = Register %454
t8: i32 = Constant<0>
In function: __ockl_dm_alloc
. . .
The original llc invocation from -v was llc -O0 -mtriple=amdgcn-amd-amdhsa -disable-promote-alloca-to-lds -mcpu=gfx90a -amdgpu-dump-hsa-metadata, although -O0 alone suffices to trigger the ICE (with the IR itself already containing target triple = "amdgcn-amd-amdhsa"); cf. the Compiler Explorer links.