-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Commit ba81477
authored
Recommit "[RISCV][ISel] Combine scalable vector add/sub/mul with zero/sign extension." (#76785)
This patch was originally introduced in PR #72340, but was reverted due
to a bug on invalid extension combine.
Specifically, we resolve the case in the
#72340 (comment)
```
define <vscale x 1 x i32> @foo(<vscale x 1 x i1> %x, <vscale x 1 x i2> %y) {
%a = zext <vscale x 1 x i1> %x to <vscale x 1 x i32>
%b = zext <vscale x 1 x i1> %y to <vscale x 1 x i32>
%c = add <vscale x 1 x i32> %a, %b
ret <vscale x 1 x i32> %c
}
```
The previous patch didn't check if the semantic of `ISD::ZERO_EXTEND`
and `ISD::ZERO_EXTEND` is equivalent to the `vsext.vf2` or `vzext.vf2`
(not ensuring the SEW condition on widening Vector Arithmetic
Instructions).
Thanks for @topperc pointing out this bug.
## The original description
This PR mainly aims at resolving the below missed-optimization case,
while it could also be considered as an extension of the previous patch
https://reviews.llvm.org/D133739?id=
### Missed-Optimization Case
Compiler Explorer: https://godbolt.org/z/GzWzP7Pfh
### Source Code:
```
define <vscale x 2 x i16> @multiple_users(ptr %x, ptr %y, ptr %z) {
%a = load <vscale x 2 x i8>, ptr %x
%b = load <vscale x 2 x i8>, ptr %y
%b2 = load <vscale x 2 x i8>, ptr %z
%c = sext <vscale x 2 x i8> %a to <vscale x 2 x i16>
%d = sext <vscale x 2 x i8> %b to <vscale x 2 x i16>
%d2 = sext <vscale x 2 x i8> %b2 to <vscale x 2 x i16>
%e = mul <vscale x 2 x i16> %c, %d
%f = add <vscale x 2 x i16> %c, %d2
%g = sub <vscale x 2 x i16> %c, %d2
%h = or <vscale x 2 x i16> %e, %f
%i = or <vscale x 2 x i16> %h, %g
ret <vscale x 2 x i16> %i
}
```
### Before This Patch
```
# %bb.0:
vsetvli a3, zero, e16, mf2, ta, ma
vle8.v v8, (a0)
vle8.v v9, (a1)
vle8.v v10, (a2)
svf2 v11, v8
vsext.vf2 v8, v9
vsext.vf2 v9, v10
vmul.vv v8, v11, v8
vadd.vv v10, v11, v9
vsub.vv v9, v11, v9
vor.vv v8, v8, v10
vor.vv v8, v8, v9
ret
```
### After This Patch
```
# %bb.0:
vsetvli a3, zero, e8, mf4, ta, ma
vle8.v v8, (a0)
vle8.v v9, (a1)
vle8.v v10, (a2)
vwmul.vv v11, v8, v9
vwadd.vv v9, v8, v10
vwsub.vv v12, v8, v10
vsetvli zero, zero, e16, mf2, ta, ma
vor.vv v8, v11, v9
vor.vv v8, v8, v12
ret
```
We can see Add/Sub/Mul are combined with the Sign Extension.
### Relation to the Patch D133739
The patch D133739 introduced an optimization for folding `ADD_VL`/
`SUB_VL` / `MUL_V` with `VSEXT_VL` / `VZEXT_VL`. However, the patch did
not consider the case of non-fixed length vector case, thus this PR
could also be considered as an extension for the D133739.1 parent 2d92f7d commit ba81477Copy full SHA for ba81477
File tree
3 files changed
+760
-155
lines changedFilter options
- llvm
- lib/Target/RISCV
- test/CodeGen/RISCV/rvv
3 files changed
+760
-155
lines changed
0 commit comments