[RISCV] Lower insert_vector_elt on zvfhmin/zvfbfmin #110221
Conversation
@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

This is the dual of #110144, but doesn't handle the case when the scalar type is illegal, i.e. no zfhmin/zfbfmin. It looks like softening isn't yet implemented for insert_vector_elt operands and it will crash during type legalization, so I've left that configuration out of the tests.

Patch is 29.53 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110221.diff

2 Files Affected:
- llvm/lib/Target/RISCV/RISCVISelLowering.cpp
- llvm/test/CodeGen/RISCV/rvv/insertelt-fp.ll
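As a quick illustration of the new lowering (a minimal sketch in the style of the updated insertelt-fp.ll, assuming the zvfhmin/zvfbfmin RUN lines added in the patch below): without vfmv.s.f for f16/bf16, the scalar is moved to a GPR with fmv.x.h and inserted with an integer vmv.s.x into the bitcast integer vector.

; Minimal sketch mirroring the autogenerated CHECK lines in the patch below.
define <vscale x 4 x bfloat> @insertelt_nxv4bf16_0(<vscale x 4 x bfloat> %v, bfloat %elt) {
; CHECK-LABEL: insertelt_nxv4bf16_0:
; CHECK:       # %bb.0:
; CHECK-NEXT:    fmv.x.h a0, fa0
; CHECK-NEXT:    vsetvli a1, zero, e16, m1, tu, ma
; CHECK-NEXT:    vmv.s.x v8, a0
; CHECK-NEXT:    ret
  %r = insertelement <vscale x 4 x bfloat> %v, bfloat %elt, i32 0
  ret <vscale x 4 x bfloat> %r
}

The same fmv.x.h + vmv.s.x pattern appears throughout the updated test file below.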
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index bd796efd836c75..65b9012b4b3310 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -1076,9 +1076,9 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
setOperationAction({ISD::SINT_TO_FP, ISD::UINT_TO_FP, ISD::VP_SINT_TO_FP,
ISD::VP_UINT_TO_FP},
VT, Custom);
- setOperationAction({ISD::CONCAT_VECTORS, ISD::INSERT_SUBVECTOR,
- ISD::EXTRACT_SUBVECTOR, ISD::VECTOR_INTERLEAVE,
- ISD::VECTOR_DEINTERLEAVE},
+ setOperationAction({ISD::INSERT_VECTOR_ELT, ISD::CONCAT_VECTORS,
+ ISD::INSERT_SUBVECTOR, ISD::EXTRACT_SUBVECTOR,
+ ISD::VECTOR_INTERLEAVE, ISD::VECTOR_DEINTERLEAVE},
VT, Custom);
MVT EltVT = VT.getVectorElementType();
if (isTypeLegal(EltVT))
@@ -8756,8 +8756,10 @@ SDValue RISCVTargetLowering::lowerINSERT_VECTOR_ELT(SDValue Op,
SelectionDAG &DAG) const {
SDLoc DL(Op);
MVT VecVT = Op.getSimpleValueType();
+ MVT XLenVT = Subtarget.getXLenVT();
SDValue Vec = Op.getOperand(0);
SDValue Val = Op.getOperand(1);
+ MVT ValVT = Val.getSimpleValueType();
SDValue Idx = Op.getOperand(2);
if (VecVT.getVectorElementType() == MVT::i1) {
@@ -8769,6 +8771,17 @@ SDValue RISCVTargetLowering::lowerINSERT_VECTOR_ELT(SDValue Op,
return DAG.getNode(ISD::TRUNCATE, DL, VecVT, Vec);
}
+ if ((ValVT == MVT::f16 && !Subtarget.hasVInstructionsF16()) ||
+ ValVT == MVT::bf16) {
+ // If we don't have vfmv.s.f for f16/bf16, insert into fmv.x.h first
+ MVT IntVT = VecVT.changeTypeToInteger();
+ // SDValue IntVal = DAG.getBitcast(IntVT.getVectorElementType(), Val);
+ SDValue IntInsert = DAG.getNode(
+ ISD::INSERT_VECTOR_ELT, DL, IntVT, DAG.getBitcast(IntVT, Vec),
+ DAG.getNode(RISCVISD::FMV_X_ANYEXTH, DL, XLenVT, Val), Idx);
+ return DAG.getBitcast(VecVT, IntInsert);
+ }
+
MVT ContainerVT = VecVT;
// If the operand is a fixed-length vector, convert to a scalable one.
if (VecVT.isFixedLengthVector()) {
@@ -8812,8 +8825,6 @@ SDValue RISCVTargetLowering::lowerINSERT_VECTOR_ELT(SDValue Op,
AlignedIdx);
}
- MVT XLenVT = Subtarget.getXLenVT();
-
bool IsLegalInsert = Subtarget.is64Bit() || Val.getValueType() != MVT::i64;
// Even i64-element vectors on RV32 can be lowered without scalar
// legalization if the most-significant 32 bits of the value are not affected
diff --git a/llvm/test/CodeGen/RISCV/rvv/insertelt-fp.ll b/llvm/test/CodeGen/RISCV/rvv/insertelt-fp.ll
index 8cfa88e6f95697..607e0085c3f468 100644
--- a/llvm/test/CodeGen/RISCV/rvv/insertelt-fp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/insertelt-fp.ll
@@ -1,209 +1,585 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=riscv32 -mattr=+d,+zfh,+zvfh,+v -target-abi=ilp32d \
-; RUN: -verify-machineinstrs < %s | FileCheck %s
-; RUN: llc -mtriple=riscv64 -mattr=+d,+zfh,+zvfh,+v -target-abi=lp64d \
-; RUN: -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple=riscv32 -mattr=+d,+zfh,+zfbfmin,+zvfh,+zvfbfmin,+v -target-abi=ilp32d \
+; RUN: -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,ZVFH
+; RUN: llc -mtriple=riscv64 -mattr=+d,+zfh,+zfbfmin,+zvfh,+zvfbfmin,+v -target-abi=lp64d \
+; RUN: -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,ZVFH
+; RUN: llc -mtriple=riscv32 -mattr=+d,+zfh,+zfbfmin,+zvfhmin,+zvfbfmin,+v -target-abi=ilp32d \
+; RUN: -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,ZVFHMIN
+; RUN: llc -mtriple=riscv64 -mattr=+d,+zfh,+zfbfmin,+zvfhmin,+zvfbfmin,+v -target-abi=lp64d \
+; RUN: -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,ZVFHMIN
-define <vscale x 1 x half> @insertelt_nxv1f16_0(<vscale x 1 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv1f16_0:
+define <vscale x 1 x bfloat> @insertelt_nxv1bf16_0(<vscale x 1 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv1bf16_0:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v8, fa0
+; CHECK-NEXT: fmv.x.h a0, fa0
+; CHECK-NEXT: vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT: vmv.s.x v8, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 1 x half> %v, half %elt, i32 0
- ret <vscale x 1 x half> %r
+ %r = insertelement <vscale x 1 x bfloat> %v, bfloat %elt, i32 0
+ ret <vscale x 1 x bfloat> %r
}
-define <vscale x 1 x half> @insertelt_nxv1f16_imm(<vscale x 1 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv1f16_imm:
+define <vscale x 1 x bfloat> @insertelt_nxv1bf16_imm(<vscale x 1 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv1bf16_imm:
; CHECK: # %bb.0:
+; CHECK-NEXT: fmv.x.h a0, fa0
; CHECK-NEXT: vsetivli zero, 4, e16, mf4, tu, ma
-; CHECK-NEXT: vfmv.s.f v9, fa0
+; CHECK-NEXT: vmv.s.x v9, a0
; CHECK-NEXT: vslideup.vi v8, v9, 3
; CHECK-NEXT: ret
- %r = insertelement <vscale x 1 x half> %v, half %elt, i32 3
- ret <vscale x 1 x half> %r
+ %r = insertelement <vscale x 1 x bfloat> %v, bfloat %elt, i32 3
+ ret <vscale x 1 x bfloat> %r
}
-define <vscale x 1 x half> @insertelt_nxv1f16_idx(<vscale x 1 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv1f16_idx:
+define <vscale x 1 x bfloat> @insertelt_nxv1bf16_idx(<vscale x 1 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv1bf16_idx:
; CHECK: # %bb.0:
; CHECK-NEXT: addi a1, a0, 1
-; CHECK-NEXT: vsetvli a2, zero, e16, m1, ta, ma
-; CHECK-NEXT: vfmv.s.f v9, fa0
+; CHECK-NEXT: fmv.x.h a2, fa0
+; CHECK-NEXT: vsetvli a3, zero, e16, m1, ta, ma
+; CHECK-NEXT: vmv.s.x v9, a2
; CHECK-NEXT: vsetvli zero, a1, e16, mf4, tu, ma
; CHECK-NEXT: vslideup.vx v8, v9, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 1 x half> %v, half %elt, i32 %idx
- ret <vscale x 1 x half> %r
+ %r = insertelement <vscale x 1 x bfloat> %v, bfloat %elt, i32 %idx
+ ret <vscale x 1 x bfloat> %r
}
-define <vscale x 2 x half> @insertelt_nxv2f16_0(<vscale x 2 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv2f16_0:
+define <vscale x 2 x bfloat> @insertelt_nxv2bf16_0(<vscale x 2 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv2bf16_0:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v8, fa0
+; CHECK-NEXT: fmv.x.h a0, fa0
+; CHECK-NEXT: vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT: vmv.s.x v8, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 2 x half> %v, half %elt, i32 0
- ret <vscale x 2 x half> %r
+ %r = insertelement <vscale x 2 x bfloat> %v, bfloat %elt, i32 0
+ ret <vscale x 2 x bfloat> %r
}
-define <vscale x 2 x half> @insertelt_nxv2f16_imm(<vscale x 2 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv2f16_imm:
+define <vscale x 2 x bfloat> @insertelt_nxv2bf16_imm(<vscale x 2 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv2bf16_imm:
; CHECK: # %bb.0:
+; CHECK-NEXT: fmv.x.h a0, fa0
; CHECK-NEXT: vsetivli zero, 4, e16, mf2, tu, ma
-; CHECK-NEXT: vfmv.s.f v9, fa0
+; CHECK-NEXT: vmv.s.x v9, a0
; CHECK-NEXT: vslideup.vi v8, v9, 3
; CHECK-NEXT: ret
- %r = insertelement <vscale x 2 x half> %v, half %elt, i32 3
- ret <vscale x 2 x half> %r
+ %r = insertelement <vscale x 2 x bfloat> %v, bfloat %elt, i32 3
+ ret <vscale x 2 x bfloat> %r
}
-define <vscale x 2 x half> @insertelt_nxv2f16_idx(<vscale x 2 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv2f16_idx:
+define <vscale x 2 x bfloat> @insertelt_nxv2bf16_idx(<vscale x 2 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv2bf16_idx:
; CHECK: # %bb.0:
; CHECK-NEXT: addi a1, a0, 1
-; CHECK-NEXT: vsetvli a2, zero, e16, m1, ta, ma
-; CHECK-NEXT: vfmv.s.f v9, fa0
+; CHECK-NEXT: fmv.x.h a2, fa0
+; CHECK-NEXT: vsetvli a3, zero, e16, m1, ta, ma
+; CHECK-NEXT: vmv.s.x v9, a2
; CHECK-NEXT: vsetvli zero, a1, e16, mf2, tu, ma
; CHECK-NEXT: vslideup.vx v8, v9, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 2 x half> %v, half %elt, i32 %idx
- ret <vscale x 2 x half> %r
+ %r = insertelement <vscale x 2 x bfloat> %v, bfloat %elt, i32 %idx
+ ret <vscale x 2 x bfloat> %r
}
-define <vscale x 4 x half> @insertelt_nxv4f16_0(<vscale x 4 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv4f16_0:
+define <vscale x 4 x bfloat> @insertelt_nxv4bf16_0(<vscale x 4 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv4bf16_0:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v8, fa0
+; CHECK-NEXT: fmv.x.h a0, fa0
+; CHECK-NEXT: vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT: vmv.s.x v8, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 4 x half> %v, half %elt, i32 0
- ret <vscale x 4 x half> %r
+ %r = insertelement <vscale x 4 x bfloat> %v, bfloat %elt, i32 0
+ ret <vscale x 4 x bfloat> %r
}
-define <vscale x 4 x half> @insertelt_nxv4f16_imm(<vscale x 4 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv4f16_imm:
+define <vscale x 4 x bfloat> @insertelt_nxv4bf16_imm(<vscale x 4 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv4bf16_imm:
; CHECK: # %bb.0:
+; CHECK-NEXT: fmv.x.h a0, fa0
; CHECK-NEXT: vsetivli zero, 4, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v9, fa0
+; CHECK-NEXT: vmv.s.x v9, a0
; CHECK-NEXT: vslideup.vi v8, v9, 3
; CHECK-NEXT: ret
- %r = insertelement <vscale x 4 x half> %v, half %elt, i32 3
- ret <vscale x 4 x half> %r
+ %r = insertelement <vscale x 4 x bfloat> %v, bfloat %elt, i32 3
+ ret <vscale x 4 x bfloat> %r
}
-define <vscale x 4 x half> @insertelt_nxv4f16_idx(<vscale x 4 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv4f16_idx:
+define <vscale x 4 x bfloat> @insertelt_nxv4bf16_idx(<vscale x 4 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv4bf16_idx:
; CHECK: # %bb.0:
; CHECK-NEXT: addi a1, a0, 1
-; CHECK-NEXT: vsetvli a2, zero, e16, m1, ta, ma
-; CHECK-NEXT: vfmv.s.f v9, fa0
+; CHECK-NEXT: fmv.x.h a2, fa0
+; CHECK-NEXT: vsetvli a3, zero, e16, m1, ta, ma
+; CHECK-NEXT: vmv.s.x v9, a2
; CHECK-NEXT: vsetvli zero, a1, e16, m1, tu, ma
; CHECK-NEXT: vslideup.vx v8, v9, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 4 x half> %v, half %elt, i32 %idx
- ret <vscale x 4 x half> %r
+ %r = insertelement <vscale x 4 x bfloat> %v, bfloat %elt, i32 %idx
+ ret <vscale x 4 x bfloat> %r
}
-define <vscale x 8 x half> @insertelt_nxv8f16_0(<vscale x 8 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv8f16_0:
+define <vscale x 8 x bfloat> @insertelt_nxv8bf16_0(<vscale x 8 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv8bf16_0:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v8, fa0
+; CHECK-NEXT: fmv.x.h a0, fa0
+; CHECK-NEXT: vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT: vmv.s.x v8, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 8 x half> %v, half %elt, i32 0
- ret <vscale x 8 x half> %r
+ %r = insertelement <vscale x 8 x bfloat> %v, bfloat %elt, i32 0
+ ret <vscale x 8 x bfloat> %r
}
-define <vscale x 8 x half> @insertelt_nxv8f16_imm(<vscale x 8 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv8f16_imm:
+define <vscale x 8 x bfloat> @insertelt_nxv8bf16_imm(<vscale x 8 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv8bf16_imm:
; CHECK: # %bb.0:
+; CHECK-NEXT: fmv.x.h a0, fa0
; CHECK-NEXT: vsetivli zero, 4, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v10, fa0
+; CHECK-NEXT: vmv.s.x v10, a0
; CHECK-NEXT: vslideup.vi v8, v10, 3
; CHECK-NEXT: ret
- %r = insertelement <vscale x 8 x half> %v, half %elt, i32 3
- ret <vscale x 8 x half> %r
+ %r = insertelement <vscale x 8 x bfloat> %v, bfloat %elt, i32 3
+ ret <vscale x 8 x bfloat> %r
}
-define <vscale x 8 x half> @insertelt_nxv8f16_idx(<vscale x 8 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv8f16_idx:
+define <vscale x 8 x bfloat> @insertelt_nxv8bf16_idx(<vscale x 8 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv8bf16_idx:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a1, zero, e16, m1, ta, ma
-; CHECK-NEXT: vfmv.s.f v10, fa0
+; CHECK-NEXT: fmv.x.h a1, fa0
+; CHECK-NEXT: vsetvli a2, zero, e16, m1, ta, ma
+; CHECK-NEXT: vmv.s.x v10, a1
; CHECK-NEXT: addi a1, a0, 1
; CHECK-NEXT: vsetvli zero, a1, e16, m2, tu, ma
; CHECK-NEXT: vslideup.vx v8, v10, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 8 x half> %v, half %elt, i32 %idx
- ret <vscale x 8 x half> %r
+ %r = insertelement <vscale x 8 x bfloat> %v, bfloat %elt, i32 %idx
+ ret <vscale x 8 x bfloat> %r
}
-define <vscale x 16 x half> @insertelt_nxv16f16_0(<vscale x 16 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv16f16_0:
+define <vscale x 16 x bfloat> @insertelt_nxv16bf16_0(<vscale x 16 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv16bf16_0:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v8, fa0
+; CHECK-NEXT: fmv.x.h a0, fa0
+; CHECK-NEXT: vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT: vmv.s.x v8, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 16 x half> %v, half %elt, i32 0
- ret <vscale x 16 x half> %r
+ %r = insertelement <vscale x 16 x bfloat> %v, bfloat %elt, i32 0
+ ret <vscale x 16 x bfloat> %r
}
-define <vscale x 16 x half> @insertelt_nxv16f16_imm(<vscale x 16 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv16f16_imm:
+define <vscale x 16 x bfloat> @insertelt_nxv16bf16_imm(<vscale x 16 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv16bf16_imm:
; CHECK: # %bb.0:
+; CHECK-NEXT: fmv.x.h a0, fa0
; CHECK-NEXT: vsetivli zero, 4, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v12, fa0
+; CHECK-NEXT: vmv.s.x v12, a0
; CHECK-NEXT: vslideup.vi v8, v12, 3
; CHECK-NEXT: ret
- %r = insertelement <vscale x 16 x half> %v, half %elt, i32 3
- ret <vscale x 16 x half> %r
+ %r = insertelement <vscale x 16 x bfloat> %v, bfloat %elt, i32 3
+ ret <vscale x 16 x bfloat> %r
}
-define <vscale x 16 x half> @insertelt_nxv16f16_idx(<vscale x 16 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv16f16_idx:
+define <vscale x 16 x bfloat> @insertelt_nxv16bf16_idx(<vscale x 16 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv16bf16_idx:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a1, zero, e16, m1, ta, ma
-; CHECK-NEXT: vfmv.s.f v12, fa0
+; CHECK-NEXT: fmv.x.h a1, fa0
+; CHECK-NEXT: vsetvli a2, zero, e16, m1, ta, ma
+; CHECK-NEXT: vmv.s.x v12, a1
; CHECK-NEXT: addi a1, a0, 1
; CHECK-NEXT: vsetvli zero, a1, e16, m4, tu, ma
; CHECK-NEXT: vslideup.vx v8, v12, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 16 x half> %v, half %elt, i32 %idx
- ret <vscale x 16 x half> %r
+ %r = insertelement <vscale x 16 x bfloat> %v, bfloat %elt, i32 %idx
+ ret <vscale x 16 x bfloat> %r
}
-define <vscale x 32 x half> @insertelt_nxv32f16_0(<vscale x 32 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv32f16_0:
+define <vscale x 32 x bfloat> @insertelt_nxv32bf16_0(<vscale x 32 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv32bf16_0:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v8, fa0
+; CHECK-NEXT: fmv.x.h a0, fa0
+; CHECK-NEXT: vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT: vmv.s.x v8, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 32 x half> %v, half %elt, i32 0
- ret <vscale x 32 x half> %r
+ %r = insertelement <vscale x 32 x bfloat> %v, bfloat %elt, i32 0
+ ret <vscale x 32 x bfloat> %r
}
-define <vscale x 32 x half> @insertelt_nxv32f16_imm(<vscale x 32 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv32f16_imm:
+define <vscale x 32 x bfloat> @insertelt_nxv32bf16_imm(<vscale x 32 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv32bf16_imm:
; CHECK: # %bb.0:
+; CHECK-NEXT: fmv.x.h a0, fa0
; CHECK-NEXT: vsetivli zero, 4, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v16, fa0
+; CHECK-NEXT: vmv.s.x v16, a0
; CHECK-NEXT: vslideup.vi v8, v16, 3
; CHECK-NEXT: ret
- %r = insertelement <vscale x 32 x half> %v, half %elt, i32 3
- ret <vscale x 32 x half> %r
+ %r = insertelement <vscale x 32 x bfloat> %v, bfloat %elt, i32 3
+ ret <vscale x 32 x bfloat> %r
}
-define <vscale x 32 x half> @insertelt_nxv32f16_idx(<vscale x 32 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv32f16_idx:
+define <vscale x 32 x bfloat> @insertelt_nxv32bf16_idx(<vscale x 32 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv32bf16_idx:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a1, zero, e16, m1, ta, ma
-; CHECK-NEXT: vfmv.s.f v16, fa0
+; CHECK-NEXT: fmv.x.h a1, fa0
+; CHECK-NEXT: vsetvli a2, zero, e16, m1, ta, ma
+; CHECK-NEXT: vmv.s.x v16, a1
; CHECK-NEXT: addi a1, a0, 1
; CHECK-NEXT: vsetvli zero, a1, e16, m8, tu, ma
; CHECK-NEXT: vslideup.vx v8, v16, a0
; CHECK-NEXT: ret
+ %r = insertelement <vscale x 32 x bfloat> %v, bfloat %elt, i32 %idx
+ ret <vscale x 32 x bfloat> %r
+}
+
+define <vscale x 1 x half> @insertelt_nxv1f16_0(<vscale x 1 x half> %v, half %elt) {
+; ZVFH-LABEL: insertelt_nxv1f16_0:
+; ZVFH: # %bb.0:
+; ZVFH-NEXT: vsetvli a0, zero, e16, m1, tu, ma
+; ZVFH-NEXT: vfmv.s.f v8, fa0
+; ZVFH-NEXT: ret
+;
+; ZVFHMIN-LABEL: insertelt_nxv1f16_0:
+; ZVFHMIN: # %bb.0:
+; ZVFHMIN-NEXT: fmv.x.h a0, fa0
+; ZVFHMIN-NEXT: vsetvli a1, zero, e16, m1, tu, ma
+; ZVFHMIN-NEXT: vmv.s.x v8, a0
+; ZVFHMIN-NEXT: ret
+ %r = insertelement <vscale x 1 x half> %v, half %elt, i32 0
+ ret <vscale x 1 x half> %r
+}
+
+define <vscale x 1 x half> @insertelt_nxv1f16_imm(<vscale x 1 x half> %v, half %elt) {
+; ZVFH-LABEL: insertelt_nxv1f16_imm:
+; ZVFH: # %bb.0:
+; ZVFH-NEXT: vsetivli zero, 4, e16, mf4, tu, ma
+; ZVFH-NEXT: vfmv.s.f v9, fa0
+; ZVFH-NEXT: vslideup.vi v8, v9, 3
+; ZVFH-NEXT: ret
+;
+; ZVFHMIN-LABEL: insertelt_nxv1f16_imm:
+; ZVFHMIN: # %bb.0:
+; ZVFHMIN-NEXT: fmv.x.h a0, fa0
+; ZVFHMIN-NEXT: vsetivli zero, 4, e16, mf4, tu, ma
+; ZVFHMIN-NEXT: vmv.s.x v9, a0
+; ZVFHMIN-NEXT: vslideup.vi v8, v9, 3
+; ZVFHMIN-NEXT: ret
+ %r = insertelement <vscale x 1 x half> %v, half %elt, i32 3
+ ret <vscale x 1 x half> %r
+}
+
+define <vscale x 1 x half> @insertelt_nxv1f16_idx(<vscale x 1 x half> %v, half %elt, i32 zeroext %idx) {
+; ZVFH-LABEL: insertelt_nxv1f16_idx:
+; ZVFH: # %bb.0:
+; ZVFH-NEXT: addi a1, a0, 1
+; ZVFH-NEXT: vsetvli a2, zero, e16, m1, ta, ma
+; ZVFH-NEXT: vfmv.s.f v9, fa0
+; ZVFH-NEXT: vsetvli zero, a1, e16, mf4, tu, ma
+; ZVFH-NEXT: vslideup.vx v8, v9, a0
+; ZVFH-NEXT: ret
+;
+; ZVFHMIN-LABEL: insertelt_nxv1f16_idx:
+; ZVFHMIN: # %bb.0:
+; ZVFHMIN-NEXT: addi a1, a0, 1
+; ZVFHMIN-NEXT: fmv.x.h a2, fa0
+; ZVFHMIN-NEXT: vsetvli a3, zero, e16, m1, ta, ma
+; ZVFHMIN-NEXT: vmv.s.x v9, a2
+; ZVFHMIN-NEXT: vsetvli zero, a1, e16, mf4, tu, ma
+; ZVFHMIN-NEXT: vslideup.vx v8, v9, a0
+; ZVFHMIN-NEXT: ret
+ %r = insertelement <vscale x 1 x half> %v, half %elt, i32 %idx
+ ret <vscale x 1 x half> %r
+}
+
+define <vscale x 2 x half> @insertelt_nxv2f16_0(<vscale x 2 x half> %v, half %elt) {
+; ZVFH-LABEL: insertelt_nxv2f16_0:
+; ZVFH: # %bb.0:
+; ZVFH-NEXT: vsetvli a0, zero, e16, m1, tu, ma
+; ZVFH-NEXT: vfmv.s.f v8, fa0
+; ZVFH-NEXT: ret
+;
+; ZVFHMIN-LABEL: insertelt_nxv2f16_0:
+; ZVFHMIN: # %bb.0:
+; ZVFHMIN-NEXT: fmv.x.h a0, fa0
+; ZVFHMIN-NEXT: vsetvli a1, zero, e16, m1, tu, ma
+; ZVFHMIN-NEXT: vmv.s.x v8, a0
+; ZVFHMIN-NEXT: ret
+ %r = insertelement <vscale x 2 x half> %v, half %elt, i32 0
+ ret <vscale x 2 x half> %r
+}
+
+define <vscale x 2 x half> @insertelt_nxv2f16_imm(<vscale x 2 x half> %v, half %elt) {
+; ZVFH-LABEL: ...
[truncated]
      ValVT == MVT::bf16) {
    // If we don't have vfmv.s.f for f16/bf16, insert into fmv.x.h first
    MVT IntVT = VecVT.changeTypeToInteger();
    // SDValue IntVal = DAG.getBitcast(IntVT.getVectorElementType(), Val);
Commented out code?
Woops, removed
LGTM w/minor comment.
define <vscale x 1 x half> @insertelt_nxv1f16_0(<vscale x 1 x half> %v, half %elt) {
; CHECK-LABEL: insertelt_nxv1f16_0:
define <vscale x 1 x bfloat> @insertelt_nxv1bf16_0(<vscale x 1 x bfloat> %v, bfloat %elt) {
These test diffs are confusing, can you put the bfloat tests after the half tests just to reduce the delta?
I've moved it but unfortunately the test diffs are somewhat confusing. I think the diff is getting confused because bfloat and half share most of the same check lines.
This is the dual of llvm#110144, but doesn't handle the case when the scalar type is illegal i.e. no zfhmin/zfbfmin. It looks like softening isn't yet implemented for insert_vector_elt operands and it will crash during type legalization, so I've left that configuration out of the tests.
Force-pushed from cdeed50 to b4716f5.
LLVM Buildbot has detected a new failure on one of its builders. Full details are available at: https://lab.llvm.org/buildbot/#/builders/18/builds/4793
This is the dual of llvm#110144, but doesn't handle the case when the scalar type is illegal i.e. no zfhmin/zfbfmin. It looks like softening isn't yet implemented for insert_vector_elt operands and it will crash during type legalization, so I've left that configuration out of the tests.
…bfmin RISCVTargetLowering::lower{INSERT,EXTRACT}_VECTOR_ELT already handles f16 and bf16 scalable vectors after llvm#110221, so we can reuse it for fixed-length vectors.
…bfmin (llvm#114927) RISCVTargetLowering::lower{INSERT,EXTRACT}_VECTOR_ELT already handles f16 and bf16 scalable vectors after llvm#110221, so we can reuse it for fixed-length vectors.
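For context, a fixed-length analogue of the scalable tests above (the kind of case the #114927 follow-up reuses this lowering for) would look something like the hypothetical IR below; the function name and vector width are illustrative, not taken from that PR's tests.

; Hypothetical fixed-length example; per the llvm#114927 commit message these
; now go through the same lower{INSERT,EXTRACT}_VECTOR_ELT path as the
; scalable-vector cases in this PR.
define <4 x bfloat> @insertelt_v4bf16_0(<4 x bfloat> %v, bfloat %elt) {
  %r = insertelement <4 x bfloat> %v, bfloat %elt, i32 0
  ret <4 x bfloat> %r
}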