[mlir][linalg] Add quantized conv2d operator with FCHW,NCHW order #107740


Merged · 3 commits into llvm:main · Oct 19, 2024

Conversation


@ubfx ubfx commented Sep 8, 2024

This patch adds a quantized version of the linalg.conv2d_nchw_fchw Op. This is the "channel-first" ordering typically used by PyTorch and others.
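The op's accumulation, `(input - input_zero_point) * (kernel - kernel_zero_point)` summed over the `C`, `KH` and `KW` reduction dimensions, can be sketched as a plain-Python reference (illustrative only; the helper name, nested-list tensors, and shape handling are assumptions, not part of this patch or of any MLIR API):

```python
def conv2d_nchw_fchw_q(I, K, izp, kzp, strides=(1, 1), dilations=(1, 1)):
    """Quantized 2-D convolution, input layout NCHW, kernel layout FCHW.

    I:   input  as nested lists, shape [N][C][IH][IW]
    K:   kernel as nested lists, shape [F][C][KH][KW]
    izp, kzp: zero-point offsets, subtracted before the multiply to
    match the (x - izp) * (w - kzp) accumulation in the op body.
    """
    sh, sw = strides
    dh, dw = dilations
    n_, c_ = len(I), len(I[0])
    ih, iw = len(I[0][0]), len(I[0][0][0])
    f_, kh, kw = len(K), len(K[0][0]), len(K[0][0][0])
    # Output spatial extent for a given stride/dilation (no padding).
    oh = (ih - (kh - 1) * dh - 1) // sh + 1
    ow = (iw - (kw - 1) * dw - 1) // sw + 1
    O = [[[[0] * ow for _ in range(oh)] for _ in range(f_)] for _ in range(n_)]
    for n in range(n_):            # parallel dims: n, f, oh, ow
        for f in range(f_):
            for y in range(oh):
                for x in range(ow):
                    acc = 0
                    for c in range(c_):      # reduction dims: c, kh, kw
                        for ky in range(kh):
                            for kx in range(kw):
                                xi = I[n][c][y * sh + ky * dh][x * sw + kx * dw]
                                wk = K[f][c][ky][kx]
                                acc += (xi - izp) * (wk - kzp)
                    O[n][f][y][x] = acc
    return O
```

Note how the only difference from the existing `conv_2d_nhwc_fhwc_q` is the index order on `I` and `K`; the zero-point arithmetic is identical.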

@llvmbot

llvmbot commented Sep 8, 2024

@llvm/pr-subscribers-mlir-linalg

@llvm/pr-subscribers-mlir

Author: Felix Schneider (ubfx)

Changes



Full diff: https://github.com/llvm/llvm-project/pull/107740.diff

2 Files Affected:

  • (modified) mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml (+137)
  • (modified) mlir/python/mlir/dialects/linalg/opdsl/ops/core_named_ops.py (+28)
diff --git a/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml b/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml
index 46b3ec0f60ebfa..4648a9133953af 100644
--- a/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml
+++ b/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml
@@ -3114,6 +3114,143 @@ structured_op: !LinalgStructuredOpConfig
                     - !ScalarExpression
                       scalar_arg: KZp
 --- !LinalgOpConfig
+metadata: !LinalgOpMetadata
+  name: conv_2d_nchw_fchw_q
+  cpp_class_name: Conv2DNchwFchwQOp
+  doc: |-
+    Performs 2-D convolution with zero point offsets.
+
+    Layout:
+      * Input: NCHW.
+      * Kernel: FCHW.
+
+    Numeric casting is performed on the operands to the inner multiply, promoting
+    them to the same data type as the accumulator/output. This includes the zero
+    point offsets common to quantized operations.
+  implements:
+  - LinalgConvolutionOpInterface
+structured_op: !LinalgStructuredOpConfig
+  args:
+  - !LinalgOperandDefConfig
+    name: I
+    kind: input_tensor
+    type_var: T1
+    shape_map: affine_map<()[s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10] -> (s0,
+      s1, s2 * s3 + s4 * s5, s6 * s7 + s8 * s9)>
+  - !LinalgOperandDefConfig
+    name: K
+    kind: input_tensor
+    type_var: T2
+    shape_map: affine_map<()[s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10] -> (s10,
+      s1, s4, s8)>
+  - !LinalgOperandDefConfig
+    name: IZp
+    kind: scalar
+    type_var: I32
+  - !LinalgOperandDefConfig
+    name: KZp
+    kind: scalar
+    type_var: I32
+  - !LinalgOperandDefConfig
+    name: O
+    kind: output_tensor
+    type_var: U
+    shape_map: affine_map<()[s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10] -> (s0,
+      s10, s2, s6)>
+  - !LinalgOperandDefConfig
+    name: strides
+    kind: index_attr
+    index_attr_map: affine_map<()[s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10] ->
+      (s3, s7)>
+    default_indices:
+    - 1
+    - 1
+  - !LinalgOperandDefConfig
+    name: dilations
+    kind: index_attr
+    index_attr_map: affine_map<()[s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10] ->
+      (s5, s9)>
+    default_indices:
+    - 1
+    - 1
+  indexing_maps: !LinalgIndexingMapsConfig
+    static_indexing_maps:
+    - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8,
+      s9, s10] -> (d0, d4, d2 * s3 + d5 * s5, d3 * s7 + d6 * s9)>
+    - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8,
+      s9, s10] -> (d1, d4, d5, d6)>
+    - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8,
+      s9, s10] -> ()>
+    - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8,
+      s9, s10] -> ()>
+    - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8,
+      s9, s10] -> (d0, d1, d2, d3)>
+  iterator_types:
+  - parallel
+  - parallel
+  - parallel
+  - parallel
+  - reduction
+  - reduction
+  - reduction
+  assignments:
+  - !ScalarAssign
+    arg: O
+    value: !ScalarExpression
+      scalar_fn:
+        kind: binary
+        fn_name: add
+        operands:
+        - !ScalarExpression
+          scalar_arg: O
+        - !ScalarExpression
+          scalar_fn:
+            kind: binary
+            fn_name: mul
+            operands:
+            - !ScalarExpression
+              scalar_fn:
+                kind: binary
+                fn_name: sub
+                operands:
+                - !ScalarExpression
+                  scalar_fn:
+                    kind: type
+                    fn_name: cast_signed
+                    type_var: U
+                    operands:
+                    - !ScalarExpression
+                      scalar_arg: I
+                - !ScalarExpression
+                  scalar_fn:
+                    kind: type
+                    fn_name: cast_signed
+                    type_var: U
+                    operands:
+                    - !ScalarExpression
+                      scalar_arg: IZp
+            - !ScalarExpression
+              scalar_fn:
+                kind: binary
+                fn_name: sub
+                operands:
+                - !ScalarExpression
+                  scalar_fn:
+                    kind: type
+                    fn_name: cast_signed
+                    type_var: U
+                    operands:
+                    - !ScalarExpression
+                      scalar_arg: K
+                - !ScalarExpression
+                  scalar_fn:
+                    kind: type
+                    fn_name: cast_signed
+                    type_var: U
+                    operands:
+                    - !ScalarExpression
+                      scalar_arg: KZp
+--- !LinalgOpConfig
 metadata: !LinalgOpMetadata
   name: conv_2d_nchw_fchw
   cpp_class_name: Conv2DNchwFchwOp
diff --git a/mlir/python/mlir/dialects/linalg/opdsl/ops/core_named_ops.py b/mlir/python/mlir/dialects/linalg/opdsl/ops/core_named_ops.py
index 67bde8f736ef46..67bae10ad16ca2 100644
--- a/mlir/python/mlir/dialects/linalg/opdsl/ops/core_named_ops.py
+++ b/mlir/python/mlir/dialects/linalg/opdsl/ops/core_named_ops.py
@@ -875,6 +875,34 @@ def conv_2d_nhwc_fhwc_q(
         - TypeFn.cast_signed(U, IZp)
     ) * (TypeFn.cast_signed(U, K[D.f, D.kh, D.kw, D.c]) - TypeFn.cast_signed(U, KZp))
 
+@linalg_structured_op
+def conv_2d_nchw_fchw_q(
+    I=TensorDef(T1, S.N, S.C, S.OH * S.SH + S.KH * S.DH, S.OW * S.SW + S.KW * S.DW),
+    K=TensorDef(T2, S.F, S.C, S.KH, S.KW),
+    IZp=ScalarDef(I32),
+    KZp=ScalarDef(I32),
+    O=TensorDef(U, S.N, S.F, S.OH, S.OW, output=True),
+    strides=IndexAttrDef(S.SH, S.SW, default=[1, 1]),
+    dilations=IndexAttrDef(S.DH, S.DW, default=[1, 1]),
+):
+    """Performs 2-D convolution with zero point offsets.
+
+    Layout:
+      * Input: NCHW.
+      * Kernel: FCHW.
+
+    Numeric casting is performed on the operands to the inner multiply, promoting
+    them to the same data type as the accumulator/output. This includes the zero
+    point offsets common to quantized operations.
+    """
+    implements(ConvolutionOpInterface)
+    domain(D.n, D.f, D.oh, D.ow, D.c, D.kh, D.kw)
+    O[D.n, D.f, D.oh, D.ow] += (
+        TypeFn.cast_signed(
+            U, I[D.n, D.c, D.oh * S.SH + D.kh * S.DH, D.ow * S.SW + D.kw * S.DW]
+        )
+        - TypeFn.cast_signed(U, IZp)
+    ) * (TypeFn.cast_signed(U, K[D.f, D.c, D.kh, D.kw]) - TypeFn.cast_signed(U, KZp))
 
 @linalg_structured_op
 def conv_2d_nchw_fchw(


github-actions bot commented Sep 8, 2024

✅ With the latest revision this PR passed the Python code formatter.

@ubfx ubfx requested a review from rsuderman September 16, 2024 07:03

ubfx commented Sep 19, 2024

For context: in torch-mlir, we currently have to insert additional transpositions on the weights and inits for all quantized convolutions, because there is no quantized convolution op with a fitting layout:

https://github.com/llvm/torch-mlir/blob/5ce48dfacd971e5075786731bac2152ae855cab4/lib/Conversion/TorchToLinalg/Linear.cpp#L1165-L1167
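The workaround being replaced is a layout permutation around the NHWC-only quantized op; a minimal sketch of such an operand shuffle (hypothetical helper, nested-list tensors for illustration only) looks like:

```python
def nchw_to_nhwc(t):
    """Permute a nested-list tensor from [N][C][H][W] to [N][H][W][C].

    This is the kind of transposition torch-mlir previously had to
    materialize for the input, the weights, and the init tensor of every
    quantized convolution; with conv_2d_nchw_fchw_q it is unnecessary.
    """
    n, c = len(t), len(t[0])
    h, w = len(t[0][0]), len(t[0][0][0])
    return [
        [[[t[ni][ci][hi][wi] for ci in range(c)] for wi in range(w)]
         for hi in range(h)]
        for ni in range(n)
    ]
```

Three such permutations per convolution (plus one back on the result) are what the new channel-first op removes.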

@ubfx ubfx requested review from stellaraccident, ftynse and makslevental and removed request for stellaraccident, ftynse and makslevental September 19, 2024 13:40

@MaheshRavishankar MaheshRavishankar left a comment


Needs some round trip tests, but apart from that looks ok to me.


ubfx commented Oct 12, 2024

ping

@ubfx ubfx merged commit 02bf3b5 into llvm:main Oct 19, 2024
8 checks passed
@ubfx ubfx deleted the linalg-add-conv2d-nchw-fchw-q branch October 19, 2024 16:25
ubfx added a commit to ubfx/torch-mlir that referenced this pull request Oct 20, 2024
I've upstreamed the necessary quantized linalg Op with the "channel-first"
ordering used by torch (llvm/llvm-project#107740)
for 2d convolution.

This patch changes the lowering for the quantized 2d case of `aten.convolution`
accordingly, which saves three transpositions per convolution (input,
weights, result) and therefore removes the requirement to try to optimize
these away in downstream passes.
ubfx added a commit to ubfx/torch-mlir that referenced this pull request Oct 22, 2024
ubfx added a commit to llvm/torch-mlir that referenced this pull request Oct 22, 2024
…#3807)

@EgorDuplensky

@ubfx Just wondering, why is the memory layout expressed right in the name of the linalg operation?
Shouldn't it at least be an attribute? Or do we need to express it at all? Couldn't we propagate layouts in some extra passes?


ubfx commented Jul 21, 2025

> @ubfx Just wondering, why the memory layout is expressed right in the name of linalg operation? Shouldn't it be at least an attribute? Or do we need to express it at all? Can't we propagate layouts in scope of some extra passes?

Yes, I think the current solution (for all the conv ops in the dialect) isn't ideal, and there have been a couple of suggestions and PRs to improve it, e.g. by expressing the layout in attributes. Unfortunately, linalg seems to be where all of the individual interests collide, which has led to a certain degree of inertia, so I'm not sure whether fixing this is still being actively worked on.
