[RISC-V] expandload should compile to viota+vrgather #101914
@llvm/issue-subscribers-backend-risc-v Author: Niles Salter (Validark)
```zig
export fn expandload16(a: *const [16]u8, b: u16, c: @Vector(16, u8)) @Vector(16, u8) {
return struct {
extern fn @"llvm.masked.expandload.v16i8"(@TypeOf(a), @Vector(16, u1), @TypeOf(c)) callconv(.Unspecified) @Vector(16, u8);
}.@"llvm.masked.expandload.v16i8"(a, @as(@Vector(16, u1), @bitCast(b)), c);
}
```
define dso_local <16 x i8> @expandload16(ptr nocapture nonnull readonly align 1 %0, i16 zeroext %1, <16 x i8> %2) local_unnamed_addr {
Entry:
  %3 = bitcast i16 %1 to <16 x i1>
  %4 = tail call fastcc <16 x i8> @llvm.masked.expandload.v16i8(ptr nonnull readonly align 1 %0, <16 x i1> %3, <16 x i8> %2)
  ret <16 x i8> %4
}
declare void @llvm.dbg.value(metadata, metadata, metadata) #1
declare fastcc <16 x i8> @llvm.masked.expandload.v16i8(ptr nocapture, <16 x i1>, <16 x i8>) #2

When compiled for the SiFive X280, we check bit-by-bit and jump based on that:

...
...
...
.LBB0_2:
andi a2, a1, 4
bnez a2, .LBB0_20
.LBB0_3:
andi a2, a1, 8
bnez a2, .LBB0_21
.LBB0_4:
andi a2, a1, 16
bnez a2, .LBB0_22
.LBB0_5:
andi a2, a1, 32
bnez a2, .LBB0_23
.LBB0_6:
andi a2, a1, 64
bnez a2, .LBB0_24
.LBB0_7:
andi a2, a1, 128
bnez a2, .LBB0_25
.LBB0_8:
andi a2, a1, 256
bnez a2, .LBB0_26
.LBB0_9:
andi a2, a1, 512
bnez a2, .LBB0_27
.LBB0_10:
andi a2, a1, 1024
bnez a2, .LBB0_28
...
...
...

It should be able to work according to the documentation here: https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#1651-synthesizing-vdecompress
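
To make the request concrete, here is a minimal scalar sketch in C of the `viota.m` + masked `vrgather.vv` idea from the linked spec section. It is only a model of the data movement, not the backend's lowering. The names `VL`, `model_viota`, `model_vrgather_masked`, and `model_vdecompress` are made up for illustration, and the fixed 16 lanes of `u8` are an assumption taken from the `<16 x i8>` example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { VL = 16 }; /* assumed lane count, matching <16 x i8> in the example */

/* Model of viota.m: out[i] = number of set mask bits strictly below lane i. */
void model_viota(uint16_t mask, uint8_t out[VL]) {
    uint8_t count = 0;
    for (size_t i = 0; i < VL; i++) {
        out[i] = count;
        count += (uint8_t)((mask >> i) & 1u);
    }
}

/* Model of a masked vrgather.vv: active lanes read src[idx[i]],
 * inactive lanes keep whatever dst already holds.
 * dst and src must be distinct arrays, like distinct vector registers. */
void model_vrgather_masked(uint8_t dst[VL], const uint8_t src[VL],
                           const uint8_t idx[VL], uint16_t mask) {
    for (size_t i = 0; i < VL; i++) {
        if ((mask >> i) & 1u) {
            assert(idx[i] < VL); /* viota never exceeds the lane count */
            dst[i] = src[idx[i]];
        }
    }
}

/* Decompress: spread the first popcount(mask) elements of `packed`
 * across the active lanes of dst, leaving inactive lanes untouched. */
void model_vdecompress(uint8_t dst[VL], const uint8_t packed[VL],
                       uint16_t mask) {
    uint8_t idx[VL];
    model_viota(mask, idx);
    model_vrgather_masked(dst, packed, idx, mask);
}
```

With mask `0x0015` (lanes 0, 2, and 4 active), `packed[0]`, `packed[1]`, and `packed[2]` end up in lanes 0, 2, and 4 of `dst`, and every other lane keeps its previous value. No per-bit branching on the mask is needed on the vector side.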
wangpc-pp added a commit to wangpc-pp/llvm-project that referenced this issue on Aug 5, 2024
We can use `iota+vrgather` to synthesize `vdecompress` and lower expanding load to `vcpop+load+vdecompress`. Fixes llvm#101914
wangpc-pp added a commit to wangpc-pp/llvm-project that referenced this issue on Aug 5, 2024
We can use `viota.m` + indexed load to synthesize expanding load:
```
%res = llvm.masked.expandload(%ptr, %mask, %passthru)
->
%index = viota %mask
if elt_size > 8:
  %index = vsll.vi %index, log2(elt_size), %mask
%res = vluxei<n> %passthru, %ptr, %index, %mask
```
And if `%mask` is all ones, we can lower expanding load to a normal unmasked load. Fixes llvm#101914
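
As a rough illustration of the `viota.m` + indexed load approach in the commit message above, the following C sketch models it for 16 lanes of 32-bit elements. The function name `expandload_indexed_u32` and the constant `LANES` are hypothetical. The sketch assumes the shift step exists to turn an element index into a byte offset, since RVV indexed loads take byte offsets; here that is a multiply by `sizeof(uint32_t)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { LANES = 16 }; /* assumed lane count for this sketch */

/* Scalar model of expandload via viota + indexed load for 32-bit elements.
 * ptr points at the packed source elements in memory. */
void expandload_indexed_u32(uint32_t res[LANES], const unsigned char *ptr,
                            uint16_t mask, const uint32_t passthru[LANES]) {
    uint32_t index[LANES];
    uint32_t count = 0;

    /* %index = viota %mask: count of active lanes below lane i. */
    for (size_t i = 0; i < LANES; i++) {
        index[i] = count;
        count += (mask >> i) & 1u;
    }
    assert(count <= LANES);

    /* Scale element indices to byte offsets (the vsll.vi step).
     * Scaling inactive lanes too is harmless: they are never used. */
    for (size_t i = 0; i < LANES; i++) {
        index[i] *= (uint32_t)sizeof(uint32_t);
    }

    /* Masked indexed load: active lanes read memory at their offset,
     * inactive lanes take the passthru value. */
    for (size_t i = 0; i < LANES; i++) {
        if ((mask >> i) & 1u) {
            memcpy(&res[i], ptr + index[i], sizeof(uint32_t));
        } else {
            res[i] = passthru[i];
        }
    }
}
```

Only `popcount(mask)` consecutive elements are ever read from `ptr`, which matches the semantics of `llvm.masked.expandload`: memory past the last active element is not touched.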
wangpc-pp added a commit to wangpc-pp/llvm-project that referenced this issue on Aug 6, 2024
We can use `viota.m` + indexed load to synthesize expanding load:
```
%res = llvm.masked.expandload(%ptr, %mask, %passthru)
->
%index = viota %mask
if elt_size > 8:
  %index = vsll.vi %index, log2(elt_size), %mask
%res = vluxei<n> %passthru, %ptr, %index, %mask
```
And if `%mask` is all ones, we can lower expanding load to a normal unmasked load. Fixes llvm#101914
wangpc-pp added a commit to wangpc-pp/llvm-project that referenced this issue on Oct 23, 2024
We can use `viota.m` + indexed load to synthesize expanding load:
```
%res = llvm.masked.expandload(%ptr, %mask, %passthru)
->
%index = viota %mask
if elt_size > 8:
  %index = vsll.vi %index, log2(elt_size), %mask
%res = vluxei<n> %passthru, %ptr, %index, %mask
```
And if `%mask` is all ones, we can lower expanding load to a normal unmasked load. Fixes llvm#101914
wangpc-pp added a commit to wangpc-pp/llvm-project that referenced this issue on Oct 31, 2024
We can use `viota`+`vrgather` to synthesize `vdecompress` and lower expanding load to `vcpop`+`load`+`vdecompress`. And if `%mask` is all ones, we can lower expanding load to a normal unmasked load. Fixes llvm#101914.
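
The `vcpop`+`load`+`vdecompress` sequence in the commit message above, including the all-ones shortcut, can be sketched in scalar C roughly as follows. This is a model under assumptions, not the code the backend emits: the name `expandload_decompress_u8` and the constant `NUM_LANES` are hypothetical, the lane count is fixed at 16 `u8` elements to match the issue's example, and the decompress step is folded into a single loop with a running counter instead of separate `viota`/`vrgather` stages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { NUM_LANES = 16 }; /* assumed lane count, matching <16 x i8> */

/* Scalar model of expandload as popcount + contiguous load + decompress. */
void expandload_decompress_u8(uint8_t res[NUM_LANES], const uint8_t *ptr,
                              uint16_t mask,
                              const uint8_t passthru[NUM_LANES]) {
    /* All-ones mask: nothing to expand, so a plain load is enough. */
    if (mask == 0xFFFFu) {
        memcpy(res, ptr, NUM_LANES);
        return;
    }

    /* vcpop.m: number of active lanes. */
    size_t active = 0;
    for (size_t i = 0; i < NUM_LANES; i++) {
        active += (mask >> i) & 1u;
    }

    /* Unit-stride load of exactly `active` elements, no more. */
    uint8_t packed[NUM_LANES] = {0};
    memcpy(packed, ptr, active);

    /* vdecompress: hand out packed elements to active lanes in order,
     * fill inactive lanes from passthru. */
    size_t next = 0;
    for (size_t i = 0; i < NUM_LANES; i++) {
        res[i] = ((mask >> i) & 1u) ? packed[next++] : passthru[i];
    }
    assert(next == active);
}
```

Bounding the load by the popcount is what keeps the model faithful to the intrinsic: with three active lanes, only three bytes at `ptr` are read, regardless of where those lanes sit in the vector.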
smallp-o-p pushed a commit to smallp-o-p/llvm-project that referenced this issue on Nov 3, 2024
We can use `viota`+`vrgather` to synthesize `vdecompress` and lower expanding load to `vcpop`+`load`+`vdecompress`. And if `%mask` is all ones, we can lower expanding load to a normal unmasked load. Fixes llvm#101914.
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this issue on Nov 4, 2024
We can use `viota`+`vrgather` to synthesize `vdecompress` and lower expanding load to `vcpop`+`load`+`vdecompress`. And if `%mask` is all ones, we can lower expanding load to a normal unmasked load. Fixes llvm#101914.