Skip to content

Conversation

Prakhar-Dixit
Copy link
Contributor

Fixes #122227
The loop’s induction variable (%i) is used to compute two different indices via affine.apply.
And the Vectorization Assumption is Violated i.e, Each vectorized loop should contribute at most one non-invariant index.

Minimal example crashing :

#map = affine_map<(d0)[s0] -> (d0 mod s0)>
#map1 = affine_map<(d0)[s0] -> (d0 floordiv s0)>

func.func @single_loop_unrolling_2D_access_pattern(%arg0: index) -> memref<2x2xf32> {
  %c2 = arith.constant 2 : index
  %cst = arith.constant 1.0 : f32
  %alloc = memref.alloc() : memref<2x2xf32>

    affine.for %i = 0 to 4 {
      %row = affine.apply #map1(%i)[%c2]  
      %col = affine.apply #map(%i)[%c2]  
      affine.store %cst, %alloc[%row, %col] : memref<2x2xf32>
    }

    return %alloc : memref<2x2xf32>
  }

The single loop %i contributes two indices (%row and %col) to the 2D memref access.

The permutation map expects one index per vectorized loop dimension, but here one loop (%i) maps to two indices (dim=0 and dim=1).

The code detects this when trying to assign the second index (dim=1) to the same vector dimension (perm[0]).

@llvmbot
Copy link
Member

llvmbot commented Apr 20, 2025

@llvm/pr-subscribers-mlir-vector

@llvm/pr-subscribers-mlir

Author: Prakhar Dixit (Prakhar-Dixit)

Changes

Fixes #122227
The loop’s induction variable (%i) is used to compute two different indices via affine.apply.
And the Vectorization Assumption is Violated i.e, Each vectorized loop should contribute at most one non-invariant index.

Minimal example crashing :

#map = affine_map&lt;(d0)[s0] -&gt; (d0 mod s0)&gt;
#map1 = affine_map&lt;(d0)[s0] -&gt; (d0 floordiv s0)&gt;

func.func @<!-- -->single_loop_unrolling_2D_access_pattern(%arg0: index) -&gt; memref&lt;2x2xf32&gt; {
  %c2 = arith.constant 2 : index
  %cst = arith.constant 1.0 : f32
  %alloc = memref.alloc() : memref&lt;2x2xf32&gt;

    affine.for %i = 0 to 4 {
      %row = affine.apply #map1(%i)[%c2]  
      %col = affine.apply #map(%i)[%c2]  
      affine.store %cst, %alloc[%row, %col] : memref&lt;2x2xf32&gt;
    }

    return %alloc : memref&lt;2x2xf32&gt;
  }

The single loop %i contributes two indices (%row and %col) to the 2D memref access.

The permutation map expects one index per vectorized loop dimension, but here one loop (%i) maps to two indices (dim=0 and dim=1).

The code detects this when trying to assign the second index (dim=1) to the same vector dimension (perm[0]).


Full diff: https://github.com/llvm/llvm-project/pull/136474.diff

2 Files Affected:

  • (modified) mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp (+8-2)
  • (modified) mlir/test/Dialect/Affine/SuperVectorize/vectorize_unsupported.mlir (+35)
diff --git a/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp b/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp
index 024e8ccb901de..f15ed4b0d5f84 100644
--- a/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp
+++ b/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp
@@ -141,8 +141,14 @@ static AffineMap makePermutationMap(
     unsigned countInvariantIndices = 0;
     for (unsigned dim = 0; dim < numIndices; ++dim) {
       if (!invariants.count(indices[dim])) {
-        assert(perm[kvp.second] == getAffineConstantExpr(0, context) &&
-               "permutationMap already has an entry along dim");
+        if (perm[kvp.second] != getAffineConstantExpr(0, context)) {
+          auto loopOp = cast<affine::AffineForOp>(kvp.first);
+          loopOp->emitError(
+              "loop induction variable is used in multiple indices, which is "
+              "unsupported for vectorization. Consider using nested loops "
+              "instead of a single loop with affine.apply.");
+          return AffineMap();
+        }
         perm[kvp.second] = getAffineDimExpr(dim, context);
       } else {
         ++countInvariantIndices;
diff --git a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_unsupported.mlir b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_unsupported.mlir
index 6c1a7c48c4cb1..c8c009f02212b 100644
--- a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_unsupported.mlir
+++ b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_unsupported.mlir
@@ -9,3 +9,38 @@ func.func @unparallel_loop_reduction_unsupported(%in: memref<256x512xf32>, %out:
  }
  return
 }
+
+// -----
+
+#map = affine_map<(d0)[s0] -> (d0 mod s0)>
+#map1 = affine_map<(d0)[s0] -> (d0 floordiv s0)>
+
+func.func @single_loop_unrolling_2D_access_pattern(%arg0: index) -> memref<2x2xf32> {
+  %c2 = arith.constant 2 : index
+  %cst = arith.constant 1.0 : f32
+  %alloc = memref.alloc() : memref<2x2xf32>
+    
+    affine.for %i = 0 to 4 {
+      %row = affine.apply #map1(%i)[%c2]  
+      %col = affine.apply #map(%i)[%c2]  
+      affine.store %cst, %alloc[%row, %col] : memref<2x2xf32>
+    }
+    
+    return %alloc : memref<2x2xf32>
+  }
+
+// CHECK: #[[$ATTR_0:.+]] = affine_map<(d0)[s0] -> (d0 floordiv s0)>
+// CHECK: #[[$ATTR_1:.+]] = affine_map<(d0)[s0] -> (d0 mod s0)>
+
+// CHECK-LABEL:   func.func @single_loop_unrolling_2D_access_pattern(
+// CHECK-SAME:                            %[[VAL_0:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: index) -> memref<2x2xf32> {
+// CHECK:           %[[VAL_1:.*]] = arith.constant 2 : index
+// CHECK:           %[[VAL_2:.*]] = arith.constant 1.000000e+00 : f32
+// CHECK:           %[[VAL_3:.*]] = memref.alloc() : memref<2x2xf32>
+// CHECK:           affine.for %[[VAL_4:.*]] = 0 to 4 {
+// CHECK:             %[[VAL_5:.*]] = affine.apply #[[$ATTR_0]](%[[VAL_4]]){{\[}}%[[VAL_1]]]
+// CHECK:             %[[VAL_6:.*]] = affine.apply #[[$ATTR_1]](%[[VAL_4]]){{\[}}%[[VAL_1]]]
+// CHECK:             affine.store %[[VAL_2]], %[[VAL_3]]{{\[}}%[[VAL_5]], %[[VAL_6]]] : memref<2x2xf32>
+// CHECK:           }
+// CHECK:           return %[[VAL_3]] : memref<2x2xf32>
+// CHECK:         }
\ No newline at end of file

@llvmbot
Copy link
Member

llvmbot commented Apr 20, 2025

@llvm/pr-subscribers-mlir-affine

Author: Prakhar Dixit (Prakhar-Dixit)

Changes

Fixes #122227
The loop’s induction variable (%i) is used to compute two different indices via affine.apply.
And the Vectorization Assumption is Violated i.e, Each vectorized loop should contribute at most one non-invariant index.

Minimal example crashing :

#map = affine_map&lt;(d0)[s0] -&gt; (d0 mod s0)&gt;
#map1 = affine_map&lt;(d0)[s0] -&gt; (d0 floordiv s0)&gt;

func.func @<!-- -->single_loop_unrolling_2D_access_pattern(%arg0: index) -&gt; memref&lt;2x2xf32&gt; {
  %c2 = arith.constant 2 : index
  %cst = arith.constant 1.0 : f32
  %alloc = memref.alloc() : memref&lt;2x2xf32&gt;

    affine.for %i = 0 to 4 {
      %row = affine.apply #map1(%i)[%c2]  
      %col = affine.apply #map(%i)[%c2]  
      affine.store %cst, %alloc[%row, %col] : memref&lt;2x2xf32&gt;
    }

    return %alloc : memref&lt;2x2xf32&gt;
  }

The single loop %i contributes two indices (%row and %col) to the 2D memref access.

The permutation map expects one index per vectorized loop dimension, but here one loop (%i) maps to two indices (dim=0 and dim=1).

The code detects this when trying to assign the second index (dim=1) to the same vector dimension (perm[0]).


Full diff: https://github.com/llvm/llvm-project/pull/136474.diff

2 Files Affected:

  • (modified) mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp (+8-2)
  • (modified) mlir/test/Dialect/Affine/SuperVectorize/vectorize_unsupported.mlir (+35)
diff --git a/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp b/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp
index 024e8ccb901de..f15ed4b0d5f84 100644
--- a/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp
+++ b/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp
@@ -141,8 +141,14 @@ static AffineMap makePermutationMap(
     unsigned countInvariantIndices = 0;
     for (unsigned dim = 0; dim < numIndices; ++dim) {
       if (!invariants.count(indices[dim])) {
-        assert(perm[kvp.second] == getAffineConstantExpr(0, context) &&
-               "permutationMap already has an entry along dim");
+        if (perm[kvp.second] != getAffineConstantExpr(0, context)) {
+          auto loopOp = cast<affine::AffineForOp>(kvp.first);
+          loopOp->emitError(
+              "loop induction variable is used in multiple indices, which is "
+              "unsupported for vectorization. Consider using nested loops "
+              "instead of a single loop with affine.apply.");
+          return AffineMap();
+        }
         perm[kvp.second] = getAffineDimExpr(dim, context);
       } else {
         ++countInvariantIndices;
diff --git a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_unsupported.mlir b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_unsupported.mlir
index 6c1a7c48c4cb1..c8c009f02212b 100644
--- a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_unsupported.mlir
+++ b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_unsupported.mlir
@@ -9,3 +9,38 @@ func.func @unparallel_loop_reduction_unsupported(%in: memref<256x512xf32>, %out:
  }
  return
 }
+
+// -----
+
+#map = affine_map<(d0)[s0] -> (d0 mod s0)>
+#map1 = affine_map<(d0)[s0] -> (d0 floordiv s0)>
+
+func.func @single_loop_unrolling_2D_access_pattern(%arg0: index) -> memref<2x2xf32> {
+  %c2 = arith.constant 2 : index
+  %cst = arith.constant 1.0 : f32
+  %alloc = memref.alloc() : memref<2x2xf32>
+    
+    affine.for %i = 0 to 4 {
+      %row = affine.apply #map1(%i)[%c2]  
+      %col = affine.apply #map(%i)[%c2]  
+      affine.store %cst, %alloc[%row, %col] : memref<2x2xf32>
+    }
+    
+    return %alloc : memref<2x2xf32>
+  }
+
+// CHECK: #[[$ATTR_0:.+]] = affine_map<(d0)[s0] -> (d0 floordiv s0)>
+// CHECK: #[[$ATTR_1:.+]] = affine_map<(d0)[s0] -> (d0 mod s0)>
+
+// CHECK-LABEL:   func.func @single_loop_unrolling_2D_access_pattern(
+// CHECK-SAME:                            %[[VAL_0:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: index) -> memref<2x2xf32> {
+// CHECK:           %[[VAL_1:.*]] = arith.constant 2 : index
+// CHECK:           %[[VAL_2:.*]] = arith.constant 1.000000e+00 : f32
+// CHECK:           %[[VAL_3:.*]] = memref.alloc() : memref<2x2xf32>
+// CHECK:           affine.for %[[VAL_4:.*]] = 0 to 4 {
+// CHECK:             %[[VAL_5:.*]] = affine.apply #[[$ATTR_0]](%[[VAL_4]]){{\[}}%[[VAL_1]]]
+// CHECK:             %[[VAL_6:.*]] = affine.apply #[[$ATTR_1]](%[[VAL_4]]){{\[}}%[[VAL_1]]]
+// CHECK:             affine.store %[[VAL_2]], %[[VAL_3]]{{\[}}%[[VAL_5]], %[[VAL_6]]] : memref<2x2xf32>
+// CHECK:           }
+// CHECK:           return %[[VAL_3]] : memref<2x2xf32>
+// CHECK:         }
\ No newline at end of file

@Prakhar-Dixit
Copy link
Contributor Author

@ftynse

@Prakhar-Dixit Prakhar-Dixit requested a review from joker-eph April 22, 2025 08:19
@Prakhar-Dixit Prakhar-Dixit requested a review from joker-eph April 22, 2025 15:32
@Prakhar-Dixit
Copy link
Contributor Author

Prakhar-Dixit commented Apr 24, 2025

Just a gentle reminder!
@joker-eph

@Prakhar-Dixit Prakhar-Dixit requested a review from ftynse April 30, 2025 06:15
@ftynse ftynse merged commit c26db58 into llvm:main Apr 30, 2025
11 checks passed
@Prakhar-Dixit Prakhar-Dixit deleted the pd/affine-122227 branch May 1, 2025 02:21
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…136474)

Fixes  llvm#122227
The loop’s induction variable (%i) is used to compute two different
indices via affine.apply.
And the Vectorization Assumption is Violated i.e, Each vectorized loop
should contribute at most one non-invariant index.

**Minimal example crashing :**

```
#map = affine_map<(d0)[s0] -> (d0 mod s0)>
#map1 = affine_map<(d0)[s0] -> (d0 floordiv s0)>

func.func @single_loop_unrolling_2D_access_pattern(%arg0: index) -> memref<2x2xf32> {
  %c2 = arith.constant 2 : index
  %cst = arith.constant 1.0 : f32
  %alloc = memref.alloc() : memref<2x2xf32>

    affine.for %i = 0 to 4 {
      %row = affine.apply #map1(%i)[%c2]  
      %col = affine.apply #map(%i)[%c2]  
      affine.store %cst, %alloc[%row, %col] : memref<2x2xf32>
    }

    return %alloc : memref<2x2xf32>
  }
```
  
The single loop %i contributes two indices (%row and %col) to the 2D
memref access.

The permutation map expects one index per vectorized loop dimension, but
here one loop (%i) maps to two indices (dim=0 and dim=1).

The code detects this when trying to assign the second index (dim=1) to
the same vector dimension (perm[0]).
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…136474)

Fixes  llvm#122227
The loop’s induction variable (%i) is used to compute two different
indices via affine.apply.
And the Vectorization Assumption is Violated i.e, Each vectorized loop
should contribute at most one non-invariant index.

**Minimal example crashing :**

```
#map = affine_map<(d0)[s0] -> (d0 mod s0)>
#map1 = affine_map<(d0)[s0] -> (d0 floordiv s0)>

func.func @single_loop_unrolling_2D_access_pattern(%arg0: index) -> memref<2x2xf32> {
  %c2 = arith.constant 2 : index
  %cst = arith.constant 1.0 : f32
  %alloc = memref.alloc() : memref<2x2xf32>

    affine.for %i = 0 to 4 {
      %row = affine.apply #map1(%i)[%c2]  
      %col = affine.apply #map(%i)[%c2]  
      affine.store %cst, %alloc[%row, %col] : memref<2x2xf32>
    }

    return %alloc : memref<2x2xf32>
  }
```
  
The single loop %i contributes two indices (%row and %col) to the 2D
memref access.

The permutation map expects one index per vectorized loop dimension, but
here one loop (%i) maps to two indices (dim=0 and dim=1).

The code detects this when trying to assign the second index (dim=1) to
the same vector dimension (perm[0]).
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…136474)

Fixes  llvm#122227
The loop’s induction variable (%i) is used to compute two different
indices via affine.apply.
And the Vectorization Assumption is Violated i.e, Each vectorized loop
should contribute at most one non-invariant index.

**Minimal example crashing :**

```
#map = affine_map<(d0)[s0] -> (d0 mod s0)>
#map1 = affine_map<(d0)[s0] -> (d0 floordiv s0)>

func.func @single_loop_unrolling_2D_access_pattern(%arg0: index) -> memref<2x2xf32> {
  %c2 = arith.constant 2 : index
  %cst = arith.constant 1.0 : f32
  %alloc = memref.alloc() : memref<2x2xf32>

    affine.for %i = 0 to 4 {
      %row = affine.apply #map1(%i)[%c2]  
      %col = affine.apply #map(%i)[%c2]  
      affine.store %cst, %alloc[%row, %col] : memref<2x2xf32>
    }

    return %alloc : memref<2x2xf32>
  }
```
  
The single loop %i contributes two indices (%row and %col) to the 2D
memref access.

The permutation map expects one index per vectorized loop dimension, but
here one loop (%i) maps to two indices (dim=0 and dim=1).

The code detects this when trying to assign the second index (dim=1) to
the same vector dimension (perm[0]).
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
…136474)

Fixes  llvm#122227
The loop’s induction variable (%i) is used to compute two different
indices via affine.apply.
And the Vectorization Assumption is Violated i.e, Each vectorized loop
should contribute at most one non-invariant index.

**Minimal example crashing :**

```
#map = affine_map<(d0)[s0] -> (d0 mod s0)>
#map1 = affine_map<(d0)[s0] -> (d0 floordiv s0)>

func.func @single_loop_unrolling_2D_access_pattern(%arg0: index) -> memref<2x2xf32> {
  %c2 = arith.constant 2 : index
  %cst = arith.constant 1.0 : f32
  %alloc = memref.alloc() : memref<2x2xf32>

    affine.for %i = 0 to 4 {
      %row = affine.apply #map1(%i)[%c2]  
      %col = affine.apply #map(%i)[%c2]  
      affine.store %cst, %alloc[%row, %col] : memref<2x2xf32>
    }

    return %alloc : memref<2x2xf32>
  }
```
  
The single loop %i contributes two indices (%row and %col) to the 2D
memref access.

The permutation map expects one index per vectorized loop dimension, but
here one loop (%i) maps to two indices (dim=0 and dim=1).

The code detects this when trying to assign the second index (dim=1) to
the same vector dimension (perm[0]).
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request May 9, 2025
…136474)

Fixes  llvm#122227
The loop’s induction variable (%i) is used to compute two different
indices via affine.apply.
And the Vectorization Assumption is Violated i.e, Each vectorized loop
should contribute at most one non-invariant index.

**Minimal example crashing :**

```
#map = affine_map<(d0)[s0] -> (d0 mod s0)>
#map1 = affine_map<(d0)[s0] -> (d0 floordiv s0)>

func.func @single_loop_unrolling_2D_access_pattern(%arg0: index) -> memref<2x2xf32> {
  %c2 = arith.constant 2 : index
  %cst = arith.constant 1.0 : f32
  %alloc = memref.alloc() : memref<2x2xf32>

    affine.for %i = 0 to 4 {
      %row = affine.apply #map1(%i)[%c2]  
      %col = affine.apply #map(%i)[%c2]  
      affine.store %cst, %alloc[%row, %col] : memref<2x2xf32>
    }

    return %alloc : memref<2x2xf32>
  }
```
  
The single loop %i contributes two indices (%row and %col) to the 2D
memref access.

The permutation map expects one index per vectorized loop dimension, but
here one loop (%i) maps to two indices (dim=0 and dim=1).

The code detects this when trying to assign the second index (dim=1) to
the same vector dimension (perm[0]).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants