clementval commented Nov 16, 2024

The runtime Assign function is not meant to initialize an array from a scalar; DoFromSourceAssign handles that case. Update the scalar-to-descriptor data transfer to use a new entry point, CUFDataTransferCstDesc, that calls this function underneath.
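
For readers unfamiliar with the scalar case, here is a minimal host-only sketch of the idea: one scalar value is copied into every element of the destination through a caller-supplied memmove-style function pointer. `BroadcastScalar`, `HostMemmove`, and `FilledWith` are hypothetical names used only for illustration; they are not flang runtime APIs, and the real DoFromSourceAssign works on descriptors rather than raw buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Same shape as the MemmoveFct alias added to assign-impl.h in this PR.
using MemmoveFct = void *(*)(void *, const void *, std::size_t);

// Hypothetical host copy routine used as the default in this sketch.
static void *HostMemmove(void *dst, const void *src, std::size_t count) {
  return std::memmove(dst, src, count);
}

// Hypothetical helper (not a flang API): copy one scalar of elemBytes bytes
// into each of the count elements of a contiguous destination buffer.
void BroadcastScalar(void *dst, std::size_t count, std::size_t elemBytes,
    const void *scalar, MemmoveFct memmoveFct = &HostMemmove) {
  char *bytes = static_cast<char *>(dst);
  for (std::size_t i = 0; i < count; ++i) {
    // One copy per element; a device-aware memmoveFct could be passed instead.
    memmoveFct(bytes + i * elemBytes, scalar, elemBytes);
  }
}

// Convenience wrapper for the sketch: build an int array filled with value.
std::vector<int> FilledWith(int value, std::size_t count) {
  std::vector<int> out(count, 0);
  BroadcastScalar(out.data(), out.size(), sizeof(int), &value);
  return out;
}
```

Passing the copy routine as a parameter is what lets the same loop serve host and device transfers, which is the role the memmoveFct argument plays in the diff.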

clementval merged commit 43cb424 into llvm:main on Nov 16, 2024
8 of 9 checks passed
clementval deleted the cuf_assign_from_source branch on November 16, 2024 01:41

llvm-ci commented Nov 16, 2024

LLVM Buildbot has detected a new failure on builder flang-runtime-cuda-gcc running on as-builder-7 while building flang at step 5 "build-FortranRuntime".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/152/builds/654

Here is the relevant piece of the build log for the reference
Step 5 (build-FortranRuntime) failure: build (failure)
...
0.085 [1/12/53] Building CUDA object CMakeFiles/FortranRuntime.dir/namelist.cpp.o
0.088 [1/11/54] Building CUDA object CMakeFiles/FortranRuntime.dir/unit.cpp.o
0.090 [1/10/55] Building CUDA object CMakeFiles/FortranRuntime.dir/io-stmt.cpp.o
0.100 [1/9/56] Building CUDA object CMakeFiles/FortranRuntime.dir/io-api.cpp.o
0.126 [1/8/57] Building CUDA object CMakeFiles/FortranRuntime.dir/dot-product.cpp.o
0.150 [1/7/58] Building CUDA object CMakeFiles/FortranRuntime.dir/extrema.cpp.o
0.217 [1/6/59] Building CUDA object CMakeFiles/FortranRuntime.dir/matmul-transpose.cpp.o
0.258 [1/5/60] Building CUDA object CMakeFiles/FortranRuntime.dir/matmul.cpp.o
0.338 [1/4/61] Building CUDA object CMakeFiles/FortranRuntime.dir/findloc.cpp.o
6.322 [1/3/62] Building CUDA object CMakeFiles/FortranRuntime.dir/allocatable.cpp.o
FAILED: CMakeFiles/FortranRuntime.dir/allocatable.cpp.o 
ccache /usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/usr/bin/g++ -DFLANG_LITTLE_ENDIAN=1 -DGTEST_HAS_RTTI=0 -DRT_USE_LIBCUDACXX=1 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/llvm-project/flang/runtime/../include -I/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/build -I/home/buildbot/worker/third-party/nv/cccl/libcudacxx/include -G -g -O3 -DNDEBUG --generate-code=arch=compute_80,code=[compute_80,sm_80]   -U_GLIBCXX_ASSERTIONS -U_LIBCPP_ENABLE_ASSERTIONS -std=c++17  -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti --expt-relaxed-constexpr -Xcudafe --diag_suppress=20208 -Xcudafe --display_error_number -MD -MT CMakeFiles/FortranRuntime.dir/allocatable.cpp.o -MF CMakeFiles/FortranRuntime.dir/allocatable.cpp.o.d -x cu -dc /home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/llvm-project/flang/runtime/allocatable.cpp -o CMakeFiles/FortranRuntime.dir/allocatable.cpp.o
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/llvm-project/flang/runtime/../include/flang/Runtime/assign.h(47): error: function "Fortran::runtime::MemmoveWrapper" has already been defined
  static __attribute__((host)) __attribute__((device)) void *MemmoveWrapper(
                                                             ^

1 error detected in the compilation of "/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/llvm-project/flang/runtime/allocatable.cpp".
6.833 [1/2/63] Building CUDA object CMakeFiles/FortranRuntime.dir/assign.cpp.o
FAILED: CMakeFiles/FortranRuntime.dir/assign.cpp.o 
ccache /usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/usr/bin/g++ -DFLANG_LITTLE_ENDIAN=1 -DGTEST_HAS_RTTI=0 -DRT_USE_LIBCUDACXX=1 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/llvm-project/flang/runtime/../include -I/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/build -I/home/buildbot/worker/third-party/nv/cccl/libcudacxx/include -G -g -O3 -DNDEBUG --generate-code=arch=compute_80,code=[compute_80,sm_80]   -U_GLIBCXX_ASSERTIONS -U_LIBCPP_ENABLE_ASSERTIONS -std=c++17  -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti --expt-relaxed-constexpr -Xcudafe --diag_suppress=20208 -Xcudafe --display_error_number -MD -MT CMakeFiles/FortranRuntime.dir/assign.cpp.o -MF CMakeFiles/FortranRuntime.dir/assign.cpp.o.d -x cu -dc /home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/llvm-project/flang/runtime/assign.cpp -o CMakeFiles/FortranRuntime.dir/assign.cpp.o
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/llvm-project/flang/runtime/assign-impl.h(25): error: function "Fortran::runtime::MemmoveWrapper" has already been defined
  static __attribute__((host)) __attribute__((device)) void *MemmoveWrapper(
                                                             ^

1 error detected in the compilation of "/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/llvm-project/flang/runtime/assign.cpp".
9.776 [1/1/64] Building CUDA object CMakeFiles/FortranRuntime.dir/pointer.cpp.o
ninja: build stopped: subcommand failed.
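
Both errors in this log come from MemmoveWrapper being defined once in flang/Runtime/assign.h and again in assign-impl.h, with both headers reaching the same translation unit. The sketch below simulates the two headers in a single file and shows one generic way to keep a single definition, using a guard macro. The macro name `SKETCH_MEMMOVE_WRAPPER_DEFINED` and the helper `CopyThroughWrapper` are hypothetical, and this is not necessarily how the failure was resolved upstream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// --- Stand-in for the first header ---
#ifndef SKETCH_MEMMOVE_WRAPPER_DEFINED
#define SKETCH_MEMMOVE_WRAPPER_DEFINED
static void *MemmoveWrapper(void *dest, const void *src, std::size_t count) {
  return std::memmove(dest, src, count);
}
#endif

// --- Stand-in for the second header ---
// Without the guard, this second definition in the same translation unit
// would be rejected as a redefinition, as in the log above.
#ifndef SKETCH_MEMMOVE_WRAPPER_DEFINED
#define SKETCH_MEMMOVE_WRAPPER_DEFINED
static void *MemmoveWrapper(void *dest, const void *src, std::size_t count) {
  return std::memmove(dest, src, count);
}
#endif

// Hypothetical caller: copies an int through the single surviving definition.
int CopyThroughWrapper(int value) {
  int out = 0;
  MemmoveWrapper(&out, &value, sizeof(int));
  return out;
}
```

Keeping the definition in exactly one header and including that header from the other would achieve the same result without a separate guard macro.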

llvm-ci commented Nov 16, 2024

LLVM Buildbot has detected a new failure on builder flang-runtime-cuda-clang running on as-builder-7 while building flang at step 10 "build-flang-runtime-FortranRuntime".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/7/builds/7277

Here is the relevant piece of the build log for the reference
Step 10 (build-flang-runtime-FortranRuntime) failure: cmake (failure)
...
2.697 [1/53/12] Building CXX object CMakeFiles/FortranRuntime.dir/random.cpp.o
2.895 [1/52/13] Building CXX object CMakeFiles/FortranRuntime.dir/time-intrinsic.cpp.o
2.939 [1/51/14] Building CXX object CMakeFiles/FortranRuntime.dir/stop.cpp.o
3.099 [1/50/15] Building CXX object CMakeFiles/FortranRuntime.dir/unit-map.cpp.o
3.229 [1/49/16] Building CXX object CMakeFiles/FortranRuntime.dir/allocator-registry.cpp.o
3.963 [1/48/17] Building CXX object CMakeFiles/FortranRuntime.dir/buffer.cpp.o
6.503 [1/47/18] Building CXX object CMakeFiles/FortranRuntime.dir/reduction.cpp.o
7.771 [1/46/19] Building CXX object CMakeFiles/FortranRuntime.dir/ISO_Fortran_binding.cpp.o
8.461 [1/45/20] Building CXX object CMakeFiles/FortranRuntime.dir/support.cpp.o
8.554 [1/44/21] Building CXX object CMakeFiles/FortranRuntime.dir/allocatable.cpp.o
FAILED: CMakeFiles/FortranRuntime.dir/allocatable.cpp.o 
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/install-clang/bin/clang++ -DFLANG_LITTLE_ENDIAN=1 -DGTEST_HAS_RTTI=0 -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -DOMP_OFFLOAD_BUILD -I/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/../include -I/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/build/flang-runtime -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -fno-lto -O3 -DNDEBUG   -U_GLIBCXX_ASSERTIONS -U_LIBCPP_ENABLE_ASSERTIONS -std=c++17  -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -fopenmp -fvisibility=hidden -fopenmp-cuda-mode --offload-arch=sm_50,sm_60,sm_70,sm_80 -foffload-lto -MD -MT CMakeFiles/FortranRuntime.dir/allocatable.cpp.o -MF CMakeFiles/FortranRuntime.dir/allocatable.cpp.o.d -o CMakeFiles/FortranRuntime.dir/allocatable.cpp.o -c /home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/allocatable.cpp
In file included from /home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/allocatable.cpp:16:
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/../include/flang/Runtime/assign.h:47:27: error: redefinition of 'MemmoveWrapper'
   47 | static RT_API_ATTRS void *MemmoveWrapper(
      |                           ^
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/assign-impl.h:25:27: note: previous definition is here
   25 | static RT_API_ATTRS void *MemmoveWrapper(
      |                           ^
1 error generated.
In file included from /home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/allocatable.cpp:16:
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/../include/flang/Runtime/assign.h:47:27: error: redefinition of 'MemmoveWrapper'
   47 | static RT_API_ATTRS void *MemmoveWrapper(
      |                           ^
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/assign-impl.h:25:27: note: previous definition is here
   25 | static RT_API_ATTRS void *MemmoveWrapper(
      |                           ^
1 error generated.
In file included from /home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/allocatable.cpp:16:
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/../include/flang/Runtime/assign.h:47:27: error: redefinition of 'MemmoveWrapper'
   47 | static RT_API_ATTRS void *MemmoveWrapper(
      |                           ^
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/assign-impl.h:25:27: note: previous definition is here
   25 | static RT_API_ATTRS void *MemmoveWrapper(
      |                           ^
1 error generated.
In file included from /home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/allocatable.cpp:16:
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/../include/flang/Runtime/assign.h:47:27: error: redefinition of 'MemmoveWrapper'
   47 | static RT_API_ATTRS void *MemmoveWrapper(
      |                           ^
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/assign-impl.h:25:27: note: previous definition is here
   25 | static RT_API_ATTRS void *MemmoveWrapper(
      |                           ^
1 error generated.
8.751 [1/43/22] Building CXX object CMakeFiles/FortranRuntime.dir/type-code.cpp.o
8.975 [1/42/23] Building CXX object CMakeFiles/FortranRuntime.dir/reduce.cpp.o
9.144 [1/41/24] Building CXX object CMakeFiles/FortranRuntime.dir/derived-api.cpp.o
9.450 [1/40/25] Building CXX object CMakeFiles/FortranRuntime.dir/environment.cpp.o
In file included from /home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/environment.cpp:9:

llvmbot added the flang:runtime, flang, and flang:fir-hlfir labels on Nov 16, 2024

llvmbot commented Nov 16, 2024

@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-flang-runtime

Author: Valentin Clement (バレンタイン クレメン) (clementval)

Changes

The runtime Assign function is not meant to initialize an array from a scalar; DoFromSourceAssign handles that case. Update the scalar-to-descriptor data transfer to use a new entry point, CUFDataTransferCstDesc, that calls this function underneath.


Full diff: https://github.com/llvm/llvm-project/pull/116457.diff

6 Files Affected:

  • (modified) flang/include/flang/Runtime/CUDA/memory.h (+4)
  • (modified) flang/lib/Optimizer/Transforms/CUFOpConversion.cpp (+6-2)
  • (modified) flang/runtime/CUDA/memory.cpp (+19)
  • (modified) flang/runtime/assign-impl.h (+15-2)
  • (modified) flang/runtime/assign.cpp (+6-6)
  • (modified) flang/test/Fir/CUDA/cuda-data-transfer.fir (+6-6)
diff --git a/flang/include/flang/Runtime/CUDA/memory.h b/flang/include/flang/Runtime/CUDA/memory.h
index 713bdf536aaf90..2bb083b0dd75cb 100644
--- a/flang/include/flang/Runtime/CUDA/memory.h
+++ b/flang/include/flang/Runtime/CUDA/memory.h
@@ -44,6 +44,10 @@ void RTDECL(CUFDataTransferPtrDesc)(void *dst, Descriptor *src,
 void RTDECL(CUFDataTransferDescDesc)(Descriptor *dst, Descriptor *src,
     unsigned mode, const char *sourceFile = nullptr, int sourceLine = 0);
 
+/// Data transfer from a scalar descriptor to a descriptor.
+void RTDECL(CUFDataTransferCstDesc)(Descriptor *dst, Descriptor *src,
+    unsigned mode, const char *sourceFile = nullptr, int sourceLine = 0);
+
 /// Data transfer from a descriptor to a descriptor.
 void RTDECL(CUFDataTransferDescDescNoRealloc)(Descriptor *dst, Descriptor *src,
     unsigned mode, const char *sourceFile = nullptr, int sourceLine = 0);
diff --git a/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp b/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp
index ec7f67dff763b4..9de20f0f0d45e1 100644
--- a/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp
+++ b/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp
@@ -563,8 +563,9 @@ struct CUFDataTransferOpConversion
         // until we have more infrastructure.
         mlir::Value src = emboxSrc(rewriter, op, symtab);
         mlir::Value dst = emboxDst(rewriter, op, symtab);
-        mlir::func::FuncOp func = fir::runtime::getRuntimeFunc<mkRTKey(
-            CUFDataTransferDescDescNoRealloc)>(loc, builder);
+        mlir::func::FuncOp func =
+            fir::runtime::getRuntimeFunc<mkRTKey(CUFDataTransferCstDesc)>(
+                loc, builder);
         auto fTy = func.getFunctionType();
         mlir::Value sourceFile = fir::factory::locationToFilename(builder, loc);
         mlir::Value sourceLine =
@@ -648,6 +649,9 @@ struct CUFDataTransferOpConversion
       mlir::Value src = op.getSrc();
       if (!mlir::isa<fir::BaseBoxType>(srcTy)) {
         src = emboxSrc(rewriter, op, symtab);
+        if (fir::isa_trivial(srcTy))
+          func = fir::runtime::getRuntimeFunc<mkRTKey(CUFDataTransferCstDesc)>(
+              loc, builder);
       }
       auto materializeBoxIfNeeded = [&](mlir::Value val) -> mlir::Value {
         if (mlir::isa<fir::EmboxOp>(val.getDefiningOp())) {
diff --git a/flang/runtime/CUDA/memory.cpp b/flang/runtime/CUDA/memory.cpp
index 7b40b837e7666e..68963c4d7738ac 100644
--- a/flang/runtime/CUDA/memory.cpp
+++ b/flang/runtime/CUDA/memory.cpp
@@ -7,6 +7,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "flang/Runtime/CUDA/memory.h"
+#include "../assign-impl.h"
 #include "../terminator.h"
 #include "flang/Runtime/CUDA/common.h"
 #include "flang/Runtime/CUDA/descriptor.h"
@@ -120,6 +121,24 @@ void RTDECL(CUFDataTransferDescDesc)(Descriptor *dstDesc, Descriptor *srcDesc,
       *dstDesc, *srcDesc, terminator, MaybeReallocate, memmoveFct);
 }
 
+void RTDECL(CUFDataTransferCstDesc)(Descriptor *dstDesc, Descriptor *srcDesc,
+    unsigned mode, const char *sourceFile, int sourceLine) {
+  MemmoveFct memmoveFct;
+  Terminator terminator{sourceFile, sourceLine};
+  if (mode == kHostToDevice) {
+    memmoveFct = &MemmoveHostToDevice;
+  } else if (mode == kDeviceToHost) {
+    memmoveFct = &MemmoveDeviceToHost;
+  } else if (mode == kDeviceToDevice) {
+    memmoveFct = &MemmoveDeviceToDevice;
+  } else {
+    terminator.Crash("host to host copy not supported");
+  }
+
+  Fortran::runtime::DoFromSourceAssign(
+      *dstDesc, *srcDesc, terminator, memmoveFct);
+}
+
 void RTDECL(CUFDataTransferDescDescNoRealloc)(Descriptor *dstDesc,
     Descriptor *srcDesc, unsigned mode, const char *sourceFile,
     int sourceLine) {
diff --git a/flang/runtime/assign-impl.h b/flang/runtime/assign-impl.h
index f07a501d1d1263..5db0bc81510bff 100644
--- a/flang/runtime/assign-impl.h
+++ b/flang/runtime/assign-impl.h
@@ -9,16 +9,29 @@
 #ifndef FORTRAN_RUNTIME_ASSIGN_IMPL_H_
 #define FORTRAN_RUNTIME_ASSIGN_IMPL_H_
 
+#include "flang/Runtime/freestanding-tools.h"
+
 namespace Fortran::runtime {
 class Descriptor;
 class Terminator;
 
+using MemmoveFct = void *(*)(void *, const void *, std::size_t);
+
 // Assign one object to another via allocate statement from source specifier.
 // Note that if allocate object and source expression have the same rank, the
 // value of the allocate object becomes the value provided; otherwise the value
 // of each element of allocate object becomes the value provided (9.7.1.2(7)).
-RT_API_ATTRS void DoFromSourceAssign(
-    Descriptor &, const Descriptor &, Terminator &);
+#ifdef RT_DEVICE_COMPILATION
+static RT_API_ATTRS void *MemmoveWrapper(
+    void *dest, const void *src, std::size_t count) {
+  return Fortran::runtime::memmove(dest, src, count);
+}
+RT_API_ATTRS void DoFromSourceAssign(Descriptor &, const Descriptor &,
+    Terminator &, MemmoveFct memmoveFct = &MemmoveWrapper);
+#else
+RT_API_ATTRS void DoFromSourceAssign(Descriptor &, const Descriptor &,
+    Terminator &, MemmoveFct memmoveFct = &Fortran::runtime::memmove);
+#endif
 
 } // namespace Fortran::runtime
 #endif // FORTRAN_RUNTIME_ASSIGN_IMPL_H_
diff --git a/flang/runtime/assign.cpp b/flang/runtime/assign.cpp
index 83c0b9c70ed0d1..8f0efaa376c198 100644
--- a/flang/runtime/assign.cpp
+++ b/flang/runtime/assign.cpp
@@ -509,8 +509,8 @@ RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from,
 
 RT_OFFLOAD_API_GROUP_BEGIN
 
-RT_API_ATTRS void DoFromSourceAssign(
-    Descriptor &alloc, const Descriptor &source, Terminator &terminator) {
+RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc,
+    const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) {
   if (alloc.rank() > 0 && source.rank() == 0) {
     // The value of each element of allocate object becomes the value of source.
     DescriptorAddendum *allocAddendum{alloc.Addendum()};
@@ -523,17 +523,17 @@ RT_API_ATTRS void DoFromSourceAssign(
            alloc.IncrementSubscripts(allocAt)) {
         Descriptor allocElement{*Descriptor::Create(*allocDerived,
             reinterpret_cast<void *>(alloc.Element<char>(allocAt)), 0)};
-        Assign(allocElement, source, terminator, NoAssignFlags);
+        Assign(allocElement, source, terminator, NoAssignFlags, memmoveFct);
       }
     } else { // intrinsic type
       for (std::size_t n{alloc.Elements()}; n-- > 0;
            alloc.IncrementSubscripts(allocAt)) {
-        Fortran::runtime::memmove(alloc.Element<char>(allocAt),
-            source.raw().base_addr, alloc.ElementBytes());
+        memmoveFct(alloc.Element<char>(allocAt), source.raw().base_addr,
+            alloc.ElementBytes());
       }
     }
   } else {
-    Assign(alloc, source, terminator, NoAssignFlags);
+    Assign(alloc, source, terminator, NoAssignFlags, memmoveFct);
   }
 }
 
diff --git a/flang/test/Fir/CUDA/cuda-data-transfer.fir b/flang/test/Fir/CUDA/cuda-data-transfer.fir
index 3209197e118d19..1ee44f3c6d97c9 100644
--- a/flang/test/Fir/CUDA/cuda-data-transfer.fir
+++ b/flang/test/Fir/CUDA/cuda-data-transfer.fir
@@ -38,7 +38,7 @@ func.func @_QPsub2() {
 // CHECK: fir.store %[[EMBOX]] to %[[TEMP_BOX]] : !fir.ref<!fir.box<i32>>
 // CHECK: %[[ADEV_BOX:.*]] = fir.convert %[[ADEV]]#0 : (!fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.ref<!fir.box<none>>
 // CHECK: %[[TEMP_CONV:.*]] = fir.convert %[[TEMP_BOX]] : (!fir.ref<!fir.box<i32>>) -> !fir.ref<!fir.box<none>>
-// CHECK: fir.call @_FortranACUFDataTransferDescDesc(%[[ADEV_BOX]], %[[TEMP_CONV]], %c0{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref<!fir.box<none>>, !fir.ref<!fir.box<none>>, i32, !fir.ref<i8>, i32) -> none
+// CHECK: fir.call @_FortranACUFDataTransferCstDesc(%[[ADEV_BOX]], %[[TEMP_CONV]], %c0{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref<!fir.box<none>>, !fir.ref<!fir.box<none>>, i32, !fir.ref<i8>, i32) -> none
 
 func.func @_QPsub3() {
   %0 = cuf.alloc !fir.box<!fir.heap<!fir.array<?xi32>>> {bindc_name = "adev", data_attr = #cuf.cuda<device>, uniq_name = "_QFsub3Eadev"} -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>
@@ -58,7 +58,7 @@ func.func @_QPsub3() {
 // CHECK: fir.store %[[EMBOX]] to %[[TEMP_BOX]] : !fir.ref<!fir.box<i32>>
 // CHECK: %[[ADEV_BOX:.*]] = fir.convert %[[ADEV]]#0 : (!fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.ref<!fir.box<none>>
 // CHECK: %[[V_CONV:.*]] = fir.convert %[[TEMP_BOX]] : (!fir.ref<!fir.box<i32>>) -> !fir.ref<!fir.box<none>>
-// CHECK: fir.call @_FortranACUFDataTransferDescDesc(%[[ADEV_BOX]], %[[V_CONV]], %c0{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref<!fir.box<none>>, !fir.ref<!fir.box<none>>, i32, !fir.ref<i8>, i32) -> none
+// CHECK: fir.call @_FortranACUFDataTransferCstDesc(%[[ADEV_BOX]], %[[V_CONV]], %c0{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref<!fir.box<none>>, !fir.ref<!fir.box<none>>, i32, !fir.ref<i8>, i32) -> none
   
 func.func @_QPsub4() {
   %0 = cuf.alloc !fir.box<!fir.heap<!fir.array<?xi32>>> {bindc_name = "adev", data_attr = #cuf.cuda<device>, uniq_name = "_QFsub4Eadev"} -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>
@@ -297,7 +297,7 @@ func.func @_QPscalar_to_array() {
 }
 
 // CHECK-LABEL: func.func @_QPscalar_to_array()
-// CHECK: _FortranACUFDataTransferDescDescNoRealloc
+// CHECK: _FortranACUFDataTransferCstDesc
 
 func.func @_QPtest_type() {
   %0 = cuf.alloc !fir.type<_QMbarTcmplx{id:i32,c:complex<f32>}> {bindc_name = "a", data_attr = #cuf.cuda<device>, uniq_name = "_QFtest_typeEa"} -> !fir.ref<!fir.type<_QMbarTcmplx{id:i32,c:complex<f32>}>>
@@ -344,7 +344,7 @@ func.func @_QPshape_shift() {
 }
 
 // CHECK-LABEL: func.func @_QPshape_shift()
-// CHECK: fir.call @_FortranACUFDataTransferDescDescNoRealloc
+// CHECK: fir.call @_FortranACUFDataTransferCstDesc
 
 func.func @_QPshape_shift2() {
   %c11 = arith.constant 11 : index
@@ -383,7 +383,7 @@ func.func @_QPdevice_addr_conv() {
 // CHECK: %[[DEV_ADDR:.*]] = fir.call @_FortranACUFGetDeviceAddress(%{{.*}}, %{{.*}}, %{{.*}}) : (!fir.llvm_ptr<i8>, !fir.ref<i8>, i32) -> !fir.llvm_ptr<i8>
 // CHECK: %[[DEV_ADDR_CONV:.*]] = fir.convert %[[DEV_ADDR]] : (!fir.llvm_ptr<i8>) -> !fir.ref<!fir.array<4xf32>>
 // CHECK: fir.embox %[[DEV_ADDR_CONV]](%{{.*}}) : (!fir.ref<!fir.array<4xf32>>, !fir.shape<1>) -> !fir.box<!fir.array<4xf32>>
-// CHECK: fir.call @_FortranACUFDataTransferDescDescNoRealloc
+// CHECK: fir.call @_FortranACUFDataTransferCstDesc
 
 func.func @_QQchar_transfer() attributes {fir.bindc_name = "char_transfer"} {
   %c1 = arith.constant 1 : index
@@ -464,6 +464,6 @@ func.func @_QPlogical_cst() {
 // CHECK: %[[EMBOX:.*]] = fir.embox %[[CONST]] : (!fir.ref<!fir.logical<4>>) -> !fir.box<!fir.logical<4>>
 // CHECK: fir.store %[[EMBOX]] to %[[DESC]] : !fir.ref<!fir.box<!fir.logical<4>>>
 // CHECK: %[[BOX_NONE:.*]] = fir.convert %[[DESC]] : (!fir.ref<!fir.box<!fir.logical<4>>>) -> !fir.ref<!fir.box<none>>
-// CHECK: fir.call @_FortranACUFDataTransferDescDesc(%{{.*}}, %[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref<!fir.box<none>>, !fir.ref<!fir.box<none>>, i32, !fir.ref<i8>, i32) -> none
+// CHECK: fir.call @_FortranACUFDataTransferCstDesc(%{{.*}}, %[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref<!fir.box<none>>, !fir.ref<!fir.box<none>>, i32, !fir.ref<i8>, i32) -> none
 
 } // end of module
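
The new CUFDataTransferCstDesc entry point in the diff picks a copy routine from the transfer mode before calling DoFromSourceAssign. The following host-only sketch shows that dispatch pattern in isolation. The `SketchMode` values, the three `Copy...` stand-ins, and `SelectMemmove` are hypothetical; the runtime's own mode constants and device copy routines are defined elsewhere, and the runtime reports the unsupported case through terminator.Crash rather than an exception.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>

using MemmoveFct = void *(*)(void *, const void *, std::size_t);

// Hypothetical mode values for this sketch only.
enum SketchMode : unsigned {
  HostToDevice,
  DeviceToHost,
  DeviceToDevice,
  HostToHost
};

// Host-only stand-ins for the three device-aware copy routines.
static void *CopyHostToDevice(void *d, const void *s, std::size_t n) {
  return std::memmove(d, s, n);
}
static void *CopyDeviceToHost(void *d, const void *s, std::size_t n) {
  return std::memmove(d, s, n);
}
static void *CopyDeviceToDevice(void *d, const void *s, std::size_t n) {
  return std::memmove(d, s, n);
}

// Map a transfer mode to the matching copy routine.
MemmoveFct SelectMemmove(unsigned mode) {
  switch (mode) {
  case HostToDevice:
    return &CopyHostToDevice;
  case DeviceToHost:
    return &CopyDeviceToHost;
  case DeviceToDevice:
    return &CopyDeviceToDevice;
  default:
    // The sketch throws here; the runtime crashes via its terminator instead.
    throw std::invalid_argument("host to host copy not supported");
  }
}
```

Returning the pointer from a helper, rather than assigning it in an if/else chain, means there is no path on which the pointer is left unset.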
