Skip to content

[OpenACC][CIR] Implement beginning of 'copy' lowering for compute con… #140304

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 19, 2025

Conversation

erichkeane
Copy link
Collaborator

…structs

This is a partial implementation of the 'copy' lowering. It is missing 3 things, which are coming in future patches:

1- does not handle subscript/subarrays for emission as variables 2- does not handle member expressions for emissions as variables 3- does not handle modifier-list

1 and 2 are because of the complexity and should be split off into a separate patch. 3 is because it isn't clear how the IR is going to handle this, and I'd like to make sure it gets done 'all at once' when the IR is updated to handle these, so I'm pushing that off to the future.

This DOES however handle the complexity of having a acc.copyin and acc.copyout, plus the additional complexity of the 'async' clause.

…structs

This is a partial implementation of the 'copy' lowering. It is missing 3
things, which are coming in future patches:

1- does not handle subscript/subarrays for emission as variables
2- does not handle member expressions for emissions as variables
3- does not handle modifier-list

1 and 2 are because of the complexity and should be split off into a
separate patch.  3 is because it isn't clear how the IR is going to
handle this, and I'd like to make sure it gets done 'all at once' when
the IR is updated to handle these, so I'm pushing that off to the
future.

This DOES however handle the complexity of having a acc.copyin and acc.copyout,
plus the additional complexity of the 'async' clause.
@llvmbot llvmbot added clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project labels May 16, 2025
@llvmbot
Copy link
Member

llvmbot commented May 16, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clangir

Author: Erich Keane (erichkeane)

Changes

…structs

This is a partial implementation of the 'copy' lowering. It is missing 3 things, which are coming in future patches:

1- does not handle subscript/subarrays for emission as variables 2- does not handle member expressions for emissions as variables 3- does not handle modifier-list

1 and 2 are because of the complexity and should be split off into a separate patch. 3 is because it isn't clear how the IR is going to handle this, and I'd like to make sure it gets done 'all at once' when the IR is updated to handle these, so I'm pushing that off to the future.

This DOES however handle the complexity of having a acc.copyin and acc.copyout, plus the additional complexity of the 'async' clause.


Patch is 35.73 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/140304.diff

4 Files Affected:

  • (modified) clang/lib/CIR/CodeGen/CIRGenOpenACCClause.h (+211-15)
  • (modified) clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp (+3-5)
  • (modified) clang/lib/CIR/CodeGen/CIRGenStmtOpenACCLoop.cpp (+1-1)
  • (added) clang/test/CIR/CodeGenOpenACC/compute-copy.c (+213)
diff --git a/clang/lib/CIR/CodeGen/CIRGenOpenACCClause.h b/clang/lib/CIR/CodeGen/CIRGenOpenACCClause.h
index 9adbe6a497214..a6f0ee28c813e 100644
--- a/clang/lib/CIR/CodeGen/CIRGenOpenACCClause.h
+++ b/clang/lib/CIR/CodeGen/CIRGenOpenACCClause.h
@@ -14,6 +14,7 @@
 
 #include "mlir/Dialect/Arith/IR/Arith.h"
 #include "mlir/Dialect/OpenACC/OpenACC.h"
+#include "llvm/ADT/TypeSwitch.h"
 namespace clang {
 // Simple type-trait to see if the first template arg is one of the list, so we
 // can tell whether to `if-constexpr` a bunch of stuff.
@@ -36,6 +37,76 @@ template <typename ToTest> constexpr bool isCombinedType = false;
 template <typename T>
 constexpr bool isCombinedType<CombinedConstructClauseInfo<T>> = true;
 
+namespace {
+struct DataOperandInfo {
+  mlir::Location beginLoc;
+  mlir::Value varValue;
+  llvm::StringRef name;
+  mlir::ValueRange bounds;
+
+  DataOperandInfo(mlir::Location beginLoc, mlir::Value varValue,
+                  llvm::StringRef name, mlir::ValueRange bounds)
+      : beginLoc(beginLoc), varValue(varValue), name(name), bounds(bounds) {}
+};
+
+inline mlir::Value createIntExpr(CIRGen::CIRGenFunction &cgf,
+                                 CIRGen::CIRGenBuilderTy &builder,
+                                 const Expr *intExpr) {
+  mlir::Value expr = cgf.emitScalarExpr(intExpr);
+  mlir::Location exprLoc = cgf.cgm.getLoc(intExpr->getBeginLoc());
+
+  mlir::IntegerType targetType = mlir::IntegerType::get(
+      &cgf.getMLIRContext(), cgf.getContext().getIntWidth(intExpr->getType()),
+      intExpr->getType()->isSignedIntegerOrEnumerationType()
+          ? mlir::IntegerType::SignednessSemantics::Signed
+          : mlir::IntegerType::SignednessSemantics::Unsigned);
+
+  auto conversionOp = builder.create<mlir::UnrealizedConversionCastOp>(
+      exprLoc, targetType, expr);
+  return conversionOp.getResult(0);
+}
+
+// A helper function that gets the information from an operand to a data
+// clause, so that it can be used to emit the data operations.
+inline DataOperandInfo getDataOperandInfo(CIRGen::CIRGenFunction &cgf,
+                                          CIRGen::CIRGenBuilderTy &builder,
+                                          OpenACCDirectiveKind DK,
+                                          const Expr *E) {
+  // TODO: OpenACC: Cache was different enough as to need a separate
+  // `ActOnCacheVar`, so we are going to need to do some investigations here
+  // when it comes to implement this for cache.
+  assert(DK != OpenACCDirectiveKind::Cache &&
+         "Cache has different enough functionality we need to investigate "
+         "whether this function works for it");
+  const Expr *curVarExpr = E->IgnoreParenImpCasts();
+
+  mlir::Location exprLoc = cgf.cgm.getLoc(curVarExpr->getBeginLoc());
+  llvm::SmallVector<mlir::Value> bounds;
+
+  // TODO: OpenACC: Assemble the list of bounds.
+  if (isa<ArraySectionExpr, ArraySubscriptExpr>(curVarExpr)) {
+    cgf.cgm.errorNYI(curVarExpr->getSourceRange(),
+                     "OpenACC data clause array subscript/section");
+    return {exprLoc, {}, {}, bounds};
+  }
+
+  // TODO: OpenACC: if this is a member expr, emit the VarPtrPtr correctly.
+  if (const auto *ME = dyn_cast<MemberExpr>(curVarExpr)) {
+    cgf.cgm.errorNYI(curVarExpr->getSourceRange(),
+                     "OpenACC Data clause member expr");
+    return {exprLoc, {}, {}, bounds};
+  }
+
+  // Sema has made sure that only 4 types of things can get here, array
+  // subscript, array section, member expr, or DRE to a var decl (or the former
+  // 3 wrapping a var-decl), so we should be able to assume this is right.
+  const auto *DRE = cast<DeclRefExpr>(curVarExpr);
+  const auto *VD = cast<VarDecl>(DRE->getFoundDecl()->getCanonicalDecl());
+  return {exprLoc, cgf.emitDeclRefLValue(DRE).getPointer(), VD->getName(),
+          bounds};
+}
+} //  namespace
+
 template <typename OpTy>
 class OpenACCClauseCIREmitter final
     : public OpenACCClauseVisitor<OpenACCClauseCIREmitter<OpTy>> {
@@ -54,6 +125,11 @@ class OpenACCClauseCIREmitter final
   SourceLocation dirLoc;
 
   llvm::SmallVector<mlir::acc::DeviceType> lastDeviceTypeValues;
+  // Keep track of the async-clause so that we can shortcut updating the data
+  // operands async clauses.
+  bool hasAsyncClause = false;
+  // Keep track of the data operands so that we can update their async clauses.
+  llvm::SmallVector<mlir::Operation *> dataOperands;
 
   void setLastDeviceTypeClause(const OpenACCDeviceTypeClause &clause) {
     lastDeviceTypeValues.clear();
@@ -70,18 +146,7 @@ class OpenACCClauseCIREmitter final
   }
 
   mlir::Value createIntExpr(const Expr *intExpr) {
-    mlir::Value expr = cgf.emitScalarExpr(intExpr);
-    mlir::Location exprLoc = cgf.cgm.getLoc(intExpr->getBeginLoc());
-
-    mlir::IntegerType targetType = mlir::IntegerType::get(
-        &cgf.getMLIRContext(), cgf.getContext().getIntWidth(intExpr->getType()),
-        intExpr->getType()->isSignedIntegerOrEnumerationType()
-            ? mlir::IntegerType::SignednessSemantics::Signed
-            : mlir::IntegerType::SignednessSemantics::Unsigned);
-
-    auto conversionOp = builder.create<mlir::UnrealizedConversionCastOp>(
-        exprLoc, targetType, expr);
-    return conversionOp.getResult(0);
+    return clang::createIntExpr(cgf, builder, intExpr);
   }
 
   // 'condition' as an OpenACC grammar production is used for 'if' and (some
@@ -157,6 +222,103 @@ class OpenACCClauseCIREmitter final
     computeEmitter.Visit(&c);
   }
 
+  template <typename BeforeOpTy, typename AfterOpTy>
+  void addDataOperand(const Expr *varOperand, mlir::acc::DataClause dataClause,
+                      bool structured, bool implicit) {
+    DataOperandInfo opInfo =
+        getDataOperandInfo(cgf, builder, dirKind, varOperand);
+
+    // TODO: OpenACC: we should comprehend the 'modifier-list' here for the data
+    // operand. At the moment, we don't have a uniform way to assign these
+    // properly, and the dialect cannot represent anything other than 'readonly'
+    // and 'zero' on copyin/copyout/create, so for now, we skip it.
+
+    auto beforeOp =
+        builder.create<BeforeOpTy>(opInfo.beginLoc, opInfo.varValue, structured,
+                                   implicit, opInfo.name, opInfo.bounds);
+    operation.getDataClauseOperandsMutable().append(beforeOp.getResult());
+
+    AfterOpTy afterOp;
+    {
+      mlir::OpBuilder::InsertionGuard guardCase(builder);
+      builder.setInsertionPointAfter(operation);
+      afterOp = builder.create<AfterOpTy>(opInfo.beginLoc, beforeOp.getResult(),
+                                          opInfo.varValue, structured, implicit,
+                                          opInfo.name, opInfo.bounds);
+    }
+
+    // Set the 'rest' of the info for both operations.
+    beforeOp.setDataClause(dataClause);
+    afterOp.setDataClause(dataClause);
+
+    // Make sure we record these, so 'async' values can be updated later.
+    dataOperands.push_back(beforeOp.getOperation());
+    dataOperands.push_back(afterOp.getOperation());
+  }
+
+  // Helper function that covers for the fact that we don't have this function
+  // on all operation types.
+  mlir::ArrayAttr getAsyncOnlyAttr() {
+    if constexpr (isOneOfTypes<OpTy, mlir::acc::ParallelOp, mlir::acc::SerialOp,
+                               mlir::acc::KernelsOp, mlir::acc::DataOp>)
+      return operation.getAsyncOnlyAttr();
+
+    // Note: 'wait' has async as well, but it cannot have data clauses, so we
+    // don't have to handle them here.
+
+    llvm_unreachable("getting asyncOnly when clause not valid on operation?");
+  }
+
+  // Helper function that covers for the fact that we don't have this function
+  // on all operation types.
+  mlir::ArrayAttr getAsyncOperandsDeviceTypeAttr() {
+    if constexpr (isOneOfTypes<OpTy, mlir::acc::ParallelOp, mlir::acc::SerialOp,
+                               mlir::acc::KernelsOp, mlir::acc::DataOp>)
+      return operation.getAsyncOperandsDeviceTypeAttr();
+
+    // Note: 'wait' has async as well, but it cannot have data clauses, so we
+    // don't have to handle them here.
+
+    llvm_unreachable(
+        "getting asyncOperandsDeviceType when clause not valid on operation?");
+  }
+
+  // Helper function that covers for the fact that we don't have this function
+  // on all operation types.
+  mlir::OperandRange getAsyncOperands() {
+    if constexpr (isOneOfTypes<OpTy, mlir::acc::ParallelOp, mlir::acc::SerialOp,
+                               mlir::acc::KernelsOp, mlir::acc::DataOp>)
+      return operation.getAsyncOperands();
+
+    // Note: 'wait' has async as well, but it cannot have data clauses, so we
+    // don't have to handle them here.
+
+    llvm_unreachable(
+        "getting asyncOperandsDeviceType when clause not valid on operation?");
+  }
+
+  // The 'data' clauses all require that we add the 'async' values from the
+  // operation to them. We've collected the data operands along the way, so use
+  // that list to get the current 'async' values.
+  void updateDataOperandAsyncValues() {
+    if (!hasAsyncClause || dataOperands.empty())
+      return;
+
+    // TODO: OpenACC: Handle this correctly for combined constructs.
+
+    for (mlir::Operation *dataOp : dataOperands) {
+      llvm::TypeSwitch<mlir::Operation *, void>(dataOp)
+          .Case<ACC_DATA_ENTRY_OPS, ACC_DATA_EXIT_OPS>([&](auto op) {
+            op.setAsyncOnlyAttr(getAsyncOnlyAttr());
+            op.setAsyncOperandsDeviceTypeAttr(getAsyncOperandsDeviceTypeAttr());
+            op.getAsyncOperandsMutable().assign(getAsyncOperands());
+          })
+          .Default([&](mlir::Operation *) {
+            llvm_unreachable("Not a data operation?");
+          });
+    }
+  }
+
 public:
   OpenACCClauseCIREmitter(OpTy &operation, CIRGen::CIRGenFunction &cgf,
                           CIRGen::CIRGenBuilderTy &builder,
@@ -168,6 +330,14 @@ class OpenACCClauseCIREmitter final
     clauseNotImplemented(clause);
   }
 
+  // The entry point for the CIR emitter. All users should use this rather than
+  // 'visitClauseList', as this also handles the things that have to happen
+  // 'after' the clauses are all visited.
+  void emitClauses(ArrayRef<const OpenACCClause *> clauses) {
+    this->VisitClauseList(clauses);
+    updateDataOperandAsyncValues();
+  }
+
   void VisitDefaultClause(const OpenACCDefaultClause &clause) {
     // This type-trait checks if 'op'(the first arg) is one of the mlir::acc
     // operations listed in the rest of the arguments.
@@ -250,14 +420,26 @@ class OpenACCClauseCIREmitter final
   }
 
   void VisitAsyncClause(const OpenACCAsyncClause &clause) {
+    hasAsyncClause = true;
     if constexpr (isOneOfTypes<OpTy, mlir::acc::ParallelOp, mlir::acc::SerialOp,
                                mlir::acc::KernelsOp, mlir::acc::DataOp>) {
       if (!clause.hasIntExpr())
         operation.addAsyncOnly(builder.getContext(), lastDeviceTypeValues);
-      else
-        operation.addAsyncOperand(builder.getContext(),
-                                  createIntExpr(clause.getIntExpr()),
+      else {
+
+        mlir::Value intExpr;
+        {
+          // Async int exprs can be referenced by the data operands, which means
+          // that the int-exprs have to appear before them.  IF there is a data
+          // operand already, set the insertion point to 'before' it.
+          mlir::OpBuilder::InsertionGuard guardCase(builder);
+          if (!dataOperands.empty())
+            builder.setInsertionPoint(dataOperands.front());
+          intExpr = createIntExpr(clause.getIntExpr());
+        }
+        operation.addAsyncOperand(builder.getContext(), intExpr,
                                   lastDeviceTypeValues);
+      }
     } else if constexpr (isOneOfTypes<OpTy, mlir::acc::WaitOp>) {
       // Wait doesn't have a device_type, so its handling here is slightly
       // different.
@@ -527,6 +709,20 @@ class OpenACCClauseCIREmitter final
       llvm_unreachable("Unknown construct kind in VisitGangClause");
     }
   }
+
+  void VisitCopyClause(const OpenACCCopyClause &clause) {
+    if constexpr (isOneOfTypes<OpTy, mlir::acc::ParallelOp, mlir::acc::SerialOp,
+                               mlir::acc::KernelsOp>) {
+      for (auto Var : clause.getVarList())
+        addDataOperand<mlir::acc::CopyinOp, mlir::acc::CopyoutOp>(
+            Var, mlir::acc::DataClause::acc_copy, /*structured=*/true,
+            /*implicit=*/false);
+    } else {
+      // TODO: When we've implemented this for everything, switch this to an
+      // unreachable. data, declare, combined constructs remain.
+      return clauseNotImplemented(clause);
+    }
+  }
 };
 
 template <typename OpTy>
diff --git a/clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp b/clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp
index fc76f57ce7c29..3c18f5d9e205c 100644
--- a/clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp
@@ -39,8 +39,7 @@ mlir::LogicalResult CIRGenFunction::emitOpenACCOpAssociatedStmt(
     // Sets insertion point before the 'op', since every new expression needs to
     // be before the operation.
     builder.setInsertionPoint(op);
-    makeClauseEmitter(op, *this, builder, dirKind, dirLoc)
-        .VisitClauseList(clauses);
+    makeClauseEmitter(op, *this, builder, dirKind, dirLoc).emitClauses(clauses);
   }
 
   {
@@ -115,7 +114,7 @@ mlir::LogicalResult CIRGenFunction::emitOpenACCOpCombinedConstruct(
       // We don't bother setting the insertion point, since the clause emitter
       // is going to have to do this correctly.
       makeClauseEmitter(inf, *this, builder, dirKind, dirLoc)
-          .VisitClauseList(clauses);
+          .emitClauses(clauses);
     }
 
     builder.create<TermOp>(end);
@@ -137,8 +136,7 @@ Op CIRGenFunction::emitOpenACCOp(
     // Sets insertion point before the 'op', since every new expression needs to
     // be before the operation.
     builder.setInsertionPoint(op);
-    makeClauseEmitter(op, *this, builder, dirKind, dirLoc)
-        .VisitClauseList(clauses);
+    makeClauseEmitter(op, *this, builder, dirKind, dirLoc).emitClauses(clauses);
   }
   return op;
 }
diff --git a/clang/lib/CIR/CodeGen/CIRGenStmtOpenACCLoop.cpp b/clang/lib/CIR/CodeGen/CIRGenStmtOpenACCLoop.cpp
index 2b78bc1a6d4a5..8a868fdc96350 100644
--- a/clang/lib/CIR/CodeGen/CIRGenStmtOpenACCLoop.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenStmtOpenACCLoop.cpp
@@ -96,7 +96,7 @@ CIRGenFunction::emitOpenACCLoopConstruct(const OpenACCLoopConstruct &s) {
     builder.setInsertionPoint(op);
     makeClauseEmitter(op, *this, builder, s.getDirectiveKind(),
                       s.getDirectiveLoc())
-        .VisitClauseList(s.clauses());
+        .emitClauses(s.clauses());
   }
 
   mlir::LogicalResult stmtRes = mlir::success();
diff --git a/clang/test/CIR/CodeGenOpenACC/compute-copy.c b/clang/test/CIR/CodeGenOpenACC/compute-copy.c
new file mode 100644
index 0000000000000..a542409f07152
--- /dev/null
+++ b/clang/test/CIR/CodeGenOpenACC/compute-copy.c
@@ -0,0 +1,213 @@
+// RUN: %clang_cc1 -fopenacc -Wno-openacc-self-if-potential-conflict -emit-cir -fclangir %s -o - | FileCheck %s
+
+int global;
+void acc_compute(int parmVar) {
+  // CHECK: cir.func @acc_compute(%[[ARG:.*]]: !s32i{{.*}}) {
+  // CHECK-NEXT: %[[PARM:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["parmVar", init]
+  int localVar1;
+  short localVar2;
+  float localVar3;
+  // CHECK-NEXT: %[[LOCAL1:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["localVar1"]
+  // CHECK-NEXT: %[[LOCAL2:.*]] = cir.alloca !s16i, !cir.ptr<!s16i>, ["localVar2"]
+  // CHECK-NEXT: %[[LOCAL3:.*]] = cir.alloca !cir.float, !cir.ptr<!cir.float>, ["localVar3"] 
+  // CHECK-NEXT: %[[LOCALPTR:.*]] = cir.alloca !cir.ptr<!s16i>, !cir.ptr<!cir.ptr<!s16i>>, ["localPointer"]
+  // CHECK-NEXT: %[[LOCALARRAY:.*]] = cir.alloca !cir.array<!cir.float x 5>, !cir.ptr<!cir.array<!cir.float x 5>>, ["localArray"]
+  // CHECK-NEXT: cir.store %[[ARG]], %[[PARM]] : !s32i, !cir.ptr<!s32i>
+
+#pragma acc parallel copy(localVar1, parmVar) copy(localVar2) copy(localVar3, parmVar)
+  ;
+  // CHECK-NEXT: %[[COPYIN1:.*]] = acc.copyin varPtr(%[[LOCAL1]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copy>, name = "localVar1"} loc
+  // CHECK-NEXT: %[[COPYIN2:.*]] = acc.copyin varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copy>, name = "parmVar"} loc
+  // CHECK-NEXT: %[[COPYIN3:.*]] = acc.copyin varPtr(%[[LOCAL2]] : !cir.ptr<!s16i>) -> !cir.ptr<!s16i> {dataClause = #acc<data_clause acc_copy>, name = "localVar2"} loc
+  // CHECK-NEXT: %[[COPYIN4:.*]] = acc.copyin varPtr(%[[LOCAL3]] : !cir.ptr<!cir.float>) -> !cir.ptr<!cir.float> {dataClause = #acc<data_clause acc_copy>, name = "localVar3"} loc
+  // CHECK-NEXT: %[[COPYIN5:.*]] = acc.copyin varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copy>, name = "parmVar"} loc
+  // CHECK-NEXT: acc.parallel dataOperands(%[[COPYIN1]], %[[COPYIN2]], %[[COPYIN3]], %[[COPYIN4]], %[[COPYIN5]] : !cir.ptr<!s32i>, !cir.ptr<!s32i>, !cir.ptr<!s16i>, !cir.ptr<!cir.float>, !cir.ptr<!s32i>) {
+  // CHECK-NEXT: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.copyout accPtr(%[[COPYIN5]] : !cir.ptr<!s32i>) to varPtr(%[[PARM]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copy>, name = "parmVar"} loc
+  // CHECK-NEXT: acc.copyout accPtr(%[[COPYIN4]] : !cir.ptr<!cir.float>) to varPtr(%[[LOCAL3]] : !cir.ptr<!cir.float>) {dataClause = #acc<data_clause acc_copy>, name = "localVar3"} loc
+  // CHECK-NEXT: acc.copyout accPtr(%[[COPYIN3]] : !cir.ptr<!s16i>) to varPtr(%[[LOCAL2]] : !cir.ptr<!s16i>) {dataClause = #acc<data_clause acc_copy>, name = "localVar2"} loc
+  // CHECK-NEXT: acc.copyout accPtr(%[[COPYIN2]] : !cir.ptr<!s32i>) to varPtr(%[[PARM]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copy>, name = "parmVar"} loc
+  // CHECK-NEXT: acc.copyout accPtr(%[[COPYIN1]] : !cir.ptr<!s32i>) to varPtr(%[[LOCAL1]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copy>, name = "localVar1"} loc
+
+#pragma acc serial copy(localVar1, parmVar) copy(localVar2) copy(localVar3, parmVar)
+  ;
+  // CHECK-NEXT: %[[COPYIN1:.*]] = acc.copyin varPtr(%[[LOCAL1]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copy>, name = "localVar1"} loc
+  // CHECK-NEXT: %[[COPYIN2:.*]] = acc.copyin varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copy>, name = "parmVar"} loc
+  // CHECK-NEXT: %[[COPYIN3:.*]] = acc.copyin varPtr(%[[LOCAL2]] : !cir.ptr<!s16i>) -> !cir.ptr<!s16i> {dataClause = #acc<data_clause acc_copy>, name = "localVar2"} loc
+  // CHECK-NEXT: %[[COPYIN4:.*]] = acc.copyin varPtr(%[[LOCAL3]] : !cir.ptr<!cir.float>) -> !cir.ptr<!cir.float> {dataClause = #acc<data_clause acc_copy>, name = "localVar3"} loc
+  // CHECK-NEXT: %[[COPYIN5:.*]] = acc.copyin varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copy>, name = "parmVar"} loc
+  // CHECK-NEXT: acc.serial dataOperands(%[[COPYIN1]], %[[COPYIN2]], %[[COPYIN3]], %[[COPYIN4]], %[[COPYIN5]] : !cir.ptr<!s32i>, !cir.ptr<!s32i>, !cir.ptr<!s16i>, !cir.ptr<!cir.float>, !cir.ptr<!s32i>) {
+  // CHECK-NEXT: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.copyout accPtr(%[[COPYIN5]] : !cir.ptr<!s32i>) to varPtr(%[[PARM]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copy>, name = "parmVar"} loc
+  // CHECK-NEXT: acc.copyout accPtr(%[[COPYIN4]] : !cir.ptr<!cir.float>) to varPtr(%[[LOCAL3]] : !cir.ptr<!cir.float>) {dataClause = #acc<data_clause acc_copy>, name = "localVar3"} loc
+  // CHECK-NEXT: acc.copyout accPtr(%[[COPYIN3]] : !cir.ptr<!s16i>) to varPtr(%[[LOCAL2]] : !cir.ptr<!s16i>) {dataClause = #acc<data_clause acc_copy>, name = "localVar2"} loc
+  // CHECK-NEXT: acc.copyout accPtr(%[[COPYIN2]] : !cir.ptr<!s32i>) to varPtr(%[[PARM]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copy>, name = "parmVar"} loc
+  // CHECK-NEXT: acc.copyout accPtr(%[[COPYIN1]] : !cir.ptr<!s32i>) to varPtr(%[[LOCAL1]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copy>, name = "localVar1"} loc
+
+#pragma acc kernels copy(localVar1, parmVar) copy(localVar2) copy(localVar3, parmVar)
+  ;
+  // CHECK-NEXT: %[[COPYIN1:.*]] = acc.copyin varPtr(%[[LOCAL1]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copy>, name = "localVar1"} loc
+  // CHECK-NEXT: %[[COPYIN2:....
[truncated]

mlir::OpBuilder::InsertionGuard guardCase(builder);
builder.setInsertionPointAfter(operation);
afterOp = builder.create<AfterOpTy>(opInfo.beginLoc, beforeOp.getResult(),
opInfo.varValue, structured, implicit,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DOES the 'implicit' and 'structured' ALWAYS match the 'before'? Or do I need a separate variable for that @razvanlupusoru and @clementval ?

// operand already, set the insertion point to 'before' it.
mlir::OpBuilder::InsertionGuard guardCase(builder);
if (!dataOperands.empty())
builder.setInsertionPoint(dataOperands.front());
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered putting these at the beginning of the 'block', but that ends up moving them very far away, which made for less readable IR. Instead, I just make sure they are before the first data operation.

Copy link
Contributor

@andykaylor andykaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly variable naming style comments from me

llvm::StringRef name;
mlir::ValueRange bounds;

DataOperandInfo(mlir::Location beginLoc, mlir::Value varValue,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is an explicit constructor necessary? Do you intend to add something other than 1:1 member initialization later?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is some oddity I never looked into about the mlir::ValueRange which resulted in a warning about a dangling reference. Instead, I've decided to remove the bounds from this, and I'll deal with it in a future patch, which means I can remove this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is some oddity I never looked into about the mlir::ValueRange which resulted in a warning about a dangling reference. Instead, I've decided to remove the '

// clause, so that it can be used to emit the data operations.
inline DataOperandInfo getDataOperandInfo(CIRGen::CIRGenFunction &cgf,
CIRGen::CIRGenBuilderTy &builder,
OpenACCDirectiveKind DK,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
OpenACCDirectiveKind DK,
OpenACCDirectiveKind dk,

inline DataOperandInfo getDataOperandInfo(CIRGen::CIRGenFunction &cgf,
CIRGen::CIRGenBuilderTy &builder,
OpenACCDirectiveKind DK,
const Expr *E) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const Expr *E) {
const Expr *e) {

// TODO: OpenACC: Cache was different enough as to need a separate
// `ActOnCacheVar`, so we are going to need to do some investigations here
// when it comes to implement this for cache.
assert(DK != OpenACCDirectiveKind::Cache &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not errorNYI here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know actually... I think its the right idea.

}

// TODO: OpenACC: if this is a member expr, emit the VarPtrPtr correctly.
if (const auto *ME = dyn_cast<MemberExpr>(curVarExpr)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (const auto *ME = dyn_cast<MemberExpr>(curVarExpr)) {
if (isa<MemberExpr>(curVarExpr)) {

// Sema has made sure that only 4 types of things can get here, array
// subscript, array section, member expr, or DRE to a var decl (or the former
// 3 wrapping a var-decl), so we should be able to assume this is right.
const auto *DRE = cast<DeclRefExpr>(curVarExpr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const auto *DRE = cast<DeclRefExpr>(curVarExpr);
const auto *dre = cast<DeclRefExpr>(curVarExpr);

// subscript, array section, member expr, or DRE to a var decl (or the former
// 3 wrapping a var-decl), so we should be able to assume this is right.
const auto *DRE = cast<DeclRefExpr>(curVarExpr);
const auto *VD = cast<VarDecl>(DRE->getFoundDecl()->getCanonicalDecl());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const auto *VD = cast<VarDecl>(DRE->getFoundDecl()->getCanonicalDecl());
const auto *vd = cast<VarDecl>(DRE->getFoundDecl()->getCanonicalDecl());

void VisitCopyClause(const OpenACCCopyClause &clause) {
if constexpr (isOneOfTypes<OpTy, mlir::acc::ParallelOp, mlir::acc::SerialOp,
mlir::acc::KernelsOp>) {
for (auto Var : clause.getVarList())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (auto Var : clause.getVarList())
for (auto var : clause.getVarList())

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A question in the comments but otherwise PR looks good: adds boilerplate and initial cases, deferring more work for incremental PRs, way to go.

: beginLoc(beginLoc), varValue(varValue), name(name), bounds(bounds) {}
};

inline mlir::Value createIntExpr(CIRGen::CIRGenFunction &cgf,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand this is necessary because of templates, but the amount of code in the header file is a bit scary, any chance we can move some of this to implementation files? If not, perhaps use a more specific name for things living inside this namespace? createIntExpr for instance is a bit misleading (sounds simple but ends up doing an unrealized cast).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sigh. I was thinking about that over the weekend. After this patch I'm going to try to re-factor how clause-emitting works entirely. For now, I'll change this to emitOpenACCIntExpr to be more clear.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, thanks!

Copy link
Collaborator Author

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I've done all as requested. Would love a 2nd review/approval from Bruno.

@bcardosolopes : If it is alright, I'd like to just do the OpenACC clause refactor in a followup patch. I'll work on it 'next', immediately after this one. I'd originally done the visitor, then realized it needed to be accessed from 3 places.

I should have just defined a single 'emit' function and moved it all to a .h file, but it wasn't clear how 'templaty' I had to be. In the end, the 'emit' visitor needs to be constructed with a finite number of types, so I can use explicit specializations to handle it.

llvm::StringRef name;
mlir::ValueRange bounds;

DataOperandInfo(mlir::Location beginLoc, mlir::Value varValue,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is some oddity I never looked into about the mlir::ValueRange which resulted in a warning about a dangling reference. Instead, I've decided to remove the bounds from this, and I'll deal with it in a future patch, which means I can remove this.

: beginLoc(beginLoc), varValue(varValue), name(name), bounds(bounds) {}
};

inline mlir::Value createIntExpr(CIRGen::CIRGenFunction &cgf,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sigh. I was thinking about that over the weekend. After this patch I'm going to try to re-factor how clause-emitting works entirely. For now, I'll change this to emitOpenACCIntExpr to be more clear.

// TODO: OpenACC: Cache was different enough as to need a separate
// `ActOnCacheVar`, so we are going to need to do some investigations here
// when it comes to implement this for cache.
assert(DK != OpenACCDirectiveKind::Cache &&
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know actually... I think its the right idea.

Copy link

github-actions bot commented May 19, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bcardosolopes : If it is alright, I'd like to just do the OpenACC clause refactor in a followup patch. I'll work on it 'next', immediately after this one. I'd originally done the visitor, then realized it needed to be accessed from 3 places.

Sure thing!

@erichkeane erichkeane merged commit db4c94f into llvm:main May 19, 2025
8 of 10 checks passed
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
llvm#140304)

…structs

This is a partial implementation of the 'copy' lowering. It is missing 3
things, which are coming in future patches:

1- does not handle subscript/subarrays for emission as variables 2- does
not handle member expressions for emissions as variables 3- does not
handle modifier-list

1 and 2 are because of the complexity and should be split off into a
separate patch. 3 is because it isn't clear how the IR is going to
handle this, and I'd like to make sure it gets done 'all at once' when
the IR is updated to handle these, so I'm pushing that off to the
future.

This DOES however handle the complexity of having a acc.copyin and
acc.copyout, plus the additional complexity of the 'async' clause.
ajaden-codes pushed a commit to Jaddyen/llvm-project that referenced this pull request Jun 6, 2025
llvm#140304)

…structs

This is a partial implementation of the 'copy' lowering. It is missing 3
things, which are coming in future patches:

1- does not handle subscript/subarrays for emission as variables 2- does
not handle member expressions for emissions as variables 3- does not
handle modifier-list

1 and 2 are because of the complexity and should be split off into a
separate patch. 3 is because it isn't clear how the IR is going to
handle this, and I'd like to make sure it gets done 'all at once' when
the IR is updated to handle these, so I'm pushing that off to the
future.

This DOES however handle the complexity of having a acc.copyin and
acc.copyout, plus the additional complexity of the 'async' clause.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants