Description
We currently depend on libffi in the CPU version of the offloading library. This dependency is unnecessary, and it also prevents us from having a unified way to pass arguments to the offloading library. When we emit "kernels" for the CPU, they are simply functions in a shared library that take each kernel argument as a separate parameter, see https://godbolt.org/z/5j9r1x9xq.
define weak_odr protected void @__omp_offloading_10302_bd4d1_main_l3(ptr %dyn_ptr, i64 %x) {
entry:
%dyn_ptr.addr = alloca ptr, align 8
%x.addr = alloca i64, align 8
store ptr %dyn_ptr, ptr %dyn_ptr.addr, align 8
store i64 %x, ptr %x.addr, align 8
%0 = load i32, ptr %x.addr, align 4
ret void
}
What we should do instead is emit this kernel with a single, known argument: a pointer to a struct with native layout that holds all the arguments. The arguments could then be passed as just a pointer and a size, similar to how both CUDA and HSA let you pass kernel arguments.
define weak_odr protected void @__omp_offloading_10302_bd4d1_main_l3(ptr noundef %args) {
entry:
%args.addr = alloca ptr, align 8
store ptr %args, ptr %args.addr, align 8
%0 = load ptr, ptr %args.addr, align 8
%dyn_ptr = getelementptr inbounds %struct.Args, ptr %0, i32 0, i32 0
%x = getelementptr inbounds %struct.Args, ptr %0, i32 0, i32 1
ret void
}
The code that generates this prologue is at https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/CGStmtOpenMP.cpp#L617, which is shared between all captured OpenMP statements.
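To illustrate the proposed convention from the host side, here is a minimal sketch in C. It assumes a packed layout of `{ ptr, i64 }` for `%struct.Args`. The names `struct Args`, `kernel_fn`, `example_kernel`, and `launch_kernel` are hypothetical and are not part of the offloading library. Because every kernel has the same signature, the launcher can call it through a plain function pointer, without libffi.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical argument layout, assumed to mirror %struct.Args: { ptr, i64 }. */
struct Args {
  void *dyn_ptr;
  int64_t x;
};

/* Every CPU kernel shares this one signature, so a plain indirect call works. */
typedef void (*kernel_fn)(void *args);

/* Records what the kernel read; exists only so the sketch can be checked. */
static int64_t observed_x;

/* Stand-in for a compiler-generated kernel: it unpacks fields from the struct. */
static void example_kernel(void *args) {
  struct Args *a = (struct Args *)args;
  observed_x = a->x;
}

/* Hypothetical launcher taking a pointer and a size, as CUDA and HSA do.
 * The size is unused for a direct CPU call; a real runtime might use it to
 * copy or validate the argument buffer. */
static int launch_kernel(kernel_fn fn, void *args, size_t size) {
  if (fn == NULL || (args == NULL && size != 0))
    return -1;
  fn(args);
  return 0;
}

/* Packs the arguments into one struct and launches the kernel. */
int run_example(void) {
  struct Args args = {NULL, 42};
  return launch_kernel(example_kernel, &args, sizeof(args));
}
```

In this sketch the kernel's parameter list never varies with the number or types of captured variables, which is what removes the need to build a call frame dynamically.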