[RISCV] Lower insert_vector_elt on zvfhmin/zvfbfmin #110221
Conversation
@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

This is the dual of #110144, but doesn't handle the case when the scalar type is illegal, i.e. no zfhmin/zfbfmin. It looks like softening isn't yet implemented for insert_vector_elt operands and it will crash during type legalization, so I've left that configuration out of the tests.

Patch is 29.53 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110221.diff

2 Files Affected:
- llvm/lib/Target/RISCV/RISCVISelLowering.cpp
- llvm/test/CodeGen/RISCV/rvv/insertelt-fp.ll
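As a quick illustration of the new lowering (a minimal sketch in the style of the updated insertelt-fp.ll, assuming the zvfhmin/zvfbfmin RUN lines added in the patch below): without vfmv.s.f for f16/bf16, the scalar is moved to a GPR with fmv.x.h and inserted with an integer vmv.s.x into the bitcast integer vector.

; Minimal sketch mirroring the autogenerated CHECK lines in the patch below.
define <vscale x 4 x bfloat> @insertelt_nxv4bf16_0(<vscale x 4 x bfloat> %v, bfloat %elt) {
; CHECK-LABEL: insertelt_nxv4bf16_0:
; CHECK:       # %bb.0:
; CHECK-NEXT:    fmv.x.h a0, fa0
; CHECK-NEXT:    vsetvli a1, zero, e16, m1, tu, ma
; CHECK-NEXT:    vmv.s.x v8, a0
; CHECK-NEXT:    ret
  %r = insertelement <vscale x 4 x bfloat> %v, bfloat %elt, i32 0
  ret <vscale x 4 x bfloat> %r
}

The same fmv.x.h + vmv.s.x pattern appears throughout the updated test file below.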
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index bd796efd836c75..65b9012b4b3310 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -1076,9 +1076,9 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
setOperationAction({ISD::SINT_TO_FP, ISD::UINT_TO_FP, ISD::VP_SINT_TO_FP,
ISD::VP_UINT_TO_FP},
VT, Custom);
- setOperationAction({ISD::CONCAT_VECTORS, ISD::INSERT_SUBVECTOR,
- ISD::EXTRACT_SUBVECTOR, ISD::VECTOR_INTERLEAVE,
- ISD::VECTOR_DEINTERLEAVE},
+ setOperationAction({ISD::INSERT_VECTOR_ELT, ISD::CONCAT_VECTORS,
+ ISD::INSERT_SUBVECTOR, ISD::EXTRACT_SUBVECTOR,
+ ISD::VECTOR_INTERLEAVE, ISD::VECTOR_DEINTERLEAVE},
VT, Custom);
MVT EltVT = VT.getVectorElementType();
if (isTypeLegal(EltVT))
@@ -8756,8 +8756,10 @@ SDValue RISCVTargetLowering::lowerINSERT_VECTOR_ELT(SDValue Op,
SelectionDAG &DAG) const {
SDLoc DL(Op);
MVT VecVT = Op.getSimpleValueType();
+ MVT XLenVT = Subtarget.getXLenVT();
SDValue Vec = Op.getOperand(0);
SDValue Val = Op.getOperand(1);
+ MVT ValVT = Val.getSimpleValueType();
SDValue Idx = Op.getOperand(2);
if (VecVT.getVectorElementType() == MVT::i1) {
@@ -8769,6 +8771,17 @@ SDValue RISCVTargetLowering::lowerINSERT_VECTOR_ELT(SDValue Op,
return DAG.getNode(ISD::TRUNCATE, DL, VecVT, Vec);
}
+ if ((ValVT == MVT::f16 && !Subtarget.hasVInstructionsF16()) ||
+ ValVT == MVT::bf16) {
+ // If we don't have vfmv.s.f for f16/bf16, insert into fmv.x.h first
+ MVT IntVT = VecVT.changeTypeToInteger();
+ // SDValue IntVal = DAG.getBitcast(IntVT.getVectorElementType(), Val);
+ SDValue IntInsert = DAG.getNode(
+ ISD::INSERT_VECTOR_ELT, DL, IntVT, DAG.getBitcast(IntVT, Vec),
+ DAG.getNode(RISCVISD::FMV_X_ANYEXTH, DL, XLenVT, Val), Idx);
+ return DAG.getBitcast(VecVT, IntInsert);
+ }
+
MVT ContainerVT = VecVT;
// If the operand is a fixed-length vector, convert to a scalable one.
if (VecVT.isFixedLengthVector()) {
@@ -8812,8 +8825,6 @@ SDValue RISCVTargetLowering::lowerINSERT_VECTOR_ELT(SDValue Op,
AlignedIdx);
}
- MVT XLenVT = Subtarget.getXLenVT();
-
bool IsLegalInsert = Subtarget.is64Bit() || Val.getValueType() != MVT::i64;
// Even i64-element vectors on RV32 can be lowered without scalar
// legalization if the most-significant 32 bits of the value are not affected
diff --git a/llvm/test/CodeGen/RISCV/rvv/insertelt-fp.ll b/llvm/test/CodeGen/RISCV/rvv/insertelt-fp.ll
index 8cfa88e6f95697..607e0085c3f468 100644
--- a/llvm/test/CodeGen/RISCV/rvv/insertelt-fp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/insertelt-fp.ll
@@ -1,209 +1,585 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=riscv32 -mattr=+d,+zfh,+zvfh,+v -target-abi=ilp32d \
-; RUN: -verify-machineinstrs < %s | FileCheck %s
-; RUN: llc -mtriple=riscv64 -mattr=+d,+zfh,+zvfh,+v -target-abi=lp64d \
-; RUN: -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple=riscv32 -mattr=+d,+zfh,+zfbfmin,+zvfh,+zvfbfmin,+v -target-abi=ilp32d \
+; RUN: -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,ZVFH
+; RUN: llc -mtriple=riscv64 -mattr=+d,+zfh,+zfbfmin,+zvfh,+zvfbfmin,+v -target-abi=lp64d \
+; RUN: -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,ZVFH
+; RUN: llc -mtriple=riscv32 -mattr=+d,+zfh,+zfbfmin,+zvfhmin,+zvfbfmin,+v -target-abi=ilp32d \
+; RUN: -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,ZVFHMIN
+; RUN: llc -mtriple=riscv64 -mattr=+d,+zfh,+zfbfmin,+zvfhmin,+zvfbfmin,+v -target-abi=lp64d \
+; RUN: -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,ZVFHMIN
-define <vscale x 1 x half> @insertelt_nxv1f16_0(<vscale x 1 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv1f16_0:
+define <vscale x 1 x bfloat> @insertelt_nxv1bf16_0(<vscale x 1 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv1bf16_0:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v8, fa0
+; CHECK-NEXT: fmv.x.h a0, fa0
+; CHECK-NEXT: vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT: vmv.s.x v8, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 1 x half> %v, half %elt, i32 0
- ret <vscale x 1 x half> %r
+ %r = insertelement <vscale x 1 x bfloat> %v, bfloat %elt, i32 0
+ ret <vscale x 1 x bfloat> %r
}
-define <vscale x 1 x half> @insertelt_nxv1f16_imm(<vscale x 1 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv1f16_imm:
+define <vscale x 1 x bfloat> @insertelt_nxv1bf16_imm(<vscale x 1 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv1bf16_imm:
; CHECK: # %bb.0:
+; CHECK-NEXT: fmv.x.h a0, fa0
; CHECK-NEXT: vsetivli zero, 4, e16, mf4, tu, ma
-; CHECK-NEXT: vfmv.s.f v9, fa0
+; CHECK-NEXT: vmv.s.x v9, a0
; CHECK-NEXT: vslideup.vi v8, v9, 3
; CHECK-NEXT: ret
- %r = insertelement <vscale x 1 x half> %v, half %elt, i32 3
- ret <vscale x 1 x half> %r
+ %r = insertelement <vscale x 1 x bfloat> %v, bfloat %elt, i32 3
+ ret <vscale x 1 x bfloat> %r
}
-define <vscale x 1 x half> @insertelt_nxv1f16_idx(<vscale x 1 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv1f16_idx:
+define <vscale x 1 x bfloat> @insertelt_nxv1bf16_idx(<vscale x 1 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv1bf16_idx:
; CHECK: # %bb.0:
; CHECK-NEXT: addi a1, a0, 1
-; CHECK-NEXT: vsetvli a2, zero, e16, m1, ta, ma
-; CHECK-NEXT: vfmv.s.f v9, fa0
+; CHECK-NEXT: fmv.x.h a2, fa0
+; CHECK-NEXT: vsetvli a3, zero, e16, m1, ta, ma
+; CHECK-NEXT: vmv.s.x v9, a2
; CHECK-NEXT: vsetvli zero, a1, e16, mf4, tu, ma
; CHECK-NEXT: vslideup.vx v8, v9, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 1 x half> %v, half %elt, i32 %idx
- ret <vscale x 1 x half> %r
+ %r = insertelement <vscale x 1 x bfloat> %v, bfloat %elt, i32 %idx
+ ret <vscale x 1 x bfloat> %r
}
-define <vscale x 2 x half> @insertelt_nxv2f16_0(<vscale x 2 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv2f16_0:
+define <vscale x 2 x bfloat> @insertelt_nxv2bf16_0(<vscale x 2 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv2bf16_0:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v8, fa0
+; CHECK-NEXT: fmv.x.h a0, fa0
+; CHECK-NEXT: vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT: vmv.s.x v8, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 2 x half> %v, half %elt, i32 0
- ret <vscale x 2 x half> %r
+ %r = insertelement <vscale x 2 x bfloat> %v, bfloat %elt, i32 0
+ ret <vscale x 2 x bfloat> %r
}
-define <vscale x 2 x half> @insertelt_nxv2f16_imm(<vscale x 2 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv2f16_imm:
+define <vscale x 2 x bfloat> @insertelt_nxv2bf16_imm(<vscale x 2 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv2bf16_imm:
; CHECK: # %bb.0:
+; CHECK-NEXT: fmv.x.h a0, fa0
; CHECK-NEXT: vsetivli zero, 4, e16, mf2, tu, ma
-; CHECK-NEXT: vfmv.s.f v9, fa0
+; CHECK-NEXT: vmv.s.x v9, a0
; CHECK-NEXT: vslideup.vi v8, v9, 3
; CHECK-NEXT: ret
- %r = insertelement <vscale x 2 x half> %v, half %elt, i32 3
- ret <vscale x 2 x half> %r
+ %r = insertelement <vscale x 2 x bfloat> %v, bfloat %elt, i32 3
+ ret <vscale x 2 x bfloat> %r
}
-define <vscale x 2 x half> @insertelt_nxv2f16_idx(<vscale x 2 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv2f16_idx:
+define <vscale x 2 x bfloat> @insertelt_nxv2bf16_idx(<vscale x 2 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv2bf16_idx:
; CHECK: # %bb.0:
; CHECK-NEXT: addi a1, a0, 1
-; CHECK-NEXT: vsetvli a2, zero, e16, m1, ta, ma
-; CHECK-NEXT: vfmv.s.f v9, fa0
+; CHECK-NEXT: fmv.x.h a2, fa0
+; CHECK-NEXT: vsetvli a3, zero, e16, m1, ta, ma
+; CHECK-NEXT: vmv.s.x v9, a2
; CHECK-NEXT: vsetvli zero, a1, e16, mf2, tu, ma
; CHECK-NEXT: vslideup.vx v8, v9, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 2 x half> %v, half %elt, i32 %idx
- ret <vscale x 2 x half> %r
+ %r = insertelement <vscale x 2 x bfloat> %v, bfloat %elt, i32 %idx
+ ret <vscale x 2 x bfloat> %r
}
-define <vscale x 4 x half> @insertelt_nxv4f16_0(<vscale x 4 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv4f16_0:
+define <vscale x 4 x bfloat> @insertelt_nxv4bf16_0(<vscale x 4 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv4bf16_0:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v8, fa0
+; CHECK-NEXT: fmv.x.h a0, fa0
+; CHECK-NEXT: vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT: vmv.s.x v8, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 4 x half> %v, half %elt, i32 0
- ret <vscale x 4 x half> %r
+ %r = insertelement <vscale x 4 x bfloat> %v, bfloat %elt, i32 0
+ ret <vscale x 4 x bfloat> %r
}
-define <vscale x 4 x half> @insertelt_nxv4f16_imm(<vscale x 4 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv4f16_imm:
+define <vscale x 4 x bfloat> @insertelt_nxv4bf16_imm(<vscale x 4 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv4bf16_imm:
; CHECK: # %bb.0:
+; CHECK-NEXT: fmv.x.h a0, fa0
; CHECK-NEXT: vsetivli zero, 4, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v9, fa0
+; CHECK-NEXT: vmv.s.x v9, a0
; CHECK-NEXT: vslideup.vi v8, v9, 3
; CHECK-NEXT: ret
- %r = insertelement <vscale x 4 x half> %v, half %elt, i32 3
- ret <vscale x 4 x half> %r
+ %r = insertelement <vscale x 4 x bfloat> %v, bfloat %elt, i32 3
+ ret <vscale x 4 x bfloat> %r
}
-define <vscale x 4 x half> @insertelt_nxv4f16_idx(<vscale x 4 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv4f16_idx:
+define <vscale x 4 x bfloat> @insertelt_nxv4bf16_idx(<vscale x 4 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv4bf16_idx:
; CHECK: # %bb.0:
; CHECK-NEXT: addi a1, a0, 1
-; CHECK-NEXT: vsetvli a2, zero, e16, m1, ta, ma
-; CHECK-NEXT: vfmv.s.f v9, fa0
+; CHECK-NEXT: fmv.x.h a2, fa0
+; CHECK-NEXT: vsetvli a3, zero, e16, m1, ta, ma
+; CHECK-NEXT: vmv.s.x v9, a2
; CHECK-NEXT: vsetvli zero, a1, e16, m1, tu, ma
; CHECK-NEXT: vslideup.vx v8, v9, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 4 x half> %v, half %elt, i32 %idx
- ret <vscale x 4 x half> %r
+ %r = insertelement <vscale x 4 x bfloat> %v, bfloat %elt, i32 %idx
+ ret <vscale x 4 x bfloat> %r
}
-define <vscale x 8 x half> @insertelt_nxv8f16_0(<vscale x 8 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv8f16_0:
+define <vscale x 8 x bfloat> @insertelt_nxv8bf16_0(<vscale x 8 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv8bf16_0:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v8, fa0
+; CHECK-NEXT: fmv.x.h a0, fa0
+; CHECK-NEXT: vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT: vmv.s.x v8, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 8 x half> %v, half %elt, i32 0
- ret <vscale x 8 x half> %r
+ %r = insertelement <vscale x 8 x bfloat> %v, bfloat %elt, i32 0
+ ret <vscale x 8 x bfloat> %r
}
-define <vscale x 8 x half> @insertelt_nxv8f16_imm(<vscale x 8 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv8f16_imm:
+define <vscale x 8 x bfloat> @insertelt_nxv8bf16_imm(<vscale x 8 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv8bf16_imm:
; CHECK: # %bb.0:
+; CHECK-NEXT: fmv.x.h a0, fa0
; CHECK-NEXT: vsetivli zero, 4, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v10, fa0
+; CHECK-NEXT: vmv.s.x v10, a0
; CHECK-NEXT: vslideup.vi v8, v10, 3
; CHECK-NEXT: ret
- %r = insertelement <vscale x 8 x half> %v, half %elt, i32 3
- ret <vscale x 8 x half> %r
+ %r = insertelement <vscale x 8 x bfloat> %v, bfloat %elt, i32 3
+ ret <vscale x 8 x bfloat> %r
}
-define <vscale x 8 x half> @insertelt_nxv8f16_idx(<vscale x 8 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv8f16_idx:
+define <vscale x 8 x bfloat> @insertelt_nxv8bf16_idx(<vscale x 8 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv8bf16_idx:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a1, zero, e16, m1, ta, ma
-; CHECK-NEXT: vfmv.s.f v10, fa0
+; CHECK-NEXT: fmv.x.h a1, fa0
+; CHECK-NEXT: vsetvli a2, zero, e16, m1, ta, ma
+; CHECK-NEXT: vmv.s.x v10, a1
; CHECK-NEXT: addi a1, a0, 1
; CHECK-NEXT: vsetvli zero, a1, e16, m2, tu, ma
; CHECK-NEXT: vslideup.vx v8, v10, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 8 x half> %v, half %elt, i32 %idx
- ret <vscale x 8 x half> %r
+ %r = insertelement <vscale x 8 x bfloat> %v, bfloat %elt, i32 %idx
+ ret <vscale x 8 x bfloat> %r
}
-define <vscale x 16 x half> @insertelt_nxv16f16_0(<vscale x 16 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv16f16_0:
+define <vscale x 16 x bfloat> @insertelt_nxv16bf16_0(<vscale x 16 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv16bf16_0:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v8, fa0
+; CHECK-NEXT: fmv.x.h a0, fa0
+; CHECK-NEXT: vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT: vmv.s.x v8, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 16 x half> %v, half %elt, i32 0
- ret <vscale x 16 x half> %r
+ %r = insertelement <vscale x 16 x bfloat> %v, bfloat %elt, i32 0
+ ret <vscale x 16 x bfloat> %r
}
-define <vscale x 16 x half> @insertelt_nxv16f16_imm(<vscale x 16 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv16f16_imm:
+define <vscale x 16 x bfloat> @insertelt_nxv16bf16_imm(<vscale x 16 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv16bf16_imm:
; CHECK: # %bb.0:
+; CHECK-NEXT: fmv.x.h a0, fa0
; CHECK-NEXT: vsetivli zero, 4, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v12, fa0
+; CHECK-NEXT: vmv.s.x v12, a0
; CHECK-NEXT: vslideup.vi v8, v12, 3
; CHECK-NEXT: ret
- %r = insertelement <vscale x 16 x half> %v, half %elt, i32 3
- ret <vscale x 16 x half> %r
+ %r = insertelement <vscale x 16 x bfloat> %v, bfloat %elt, i32 3
+ ret <vscale x 16 x bfloat> %r
}
-define <vscale x 16 x half> @insertelt_nxv16f16_idx(<vscale x 16 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv16f16_idx:
+define <vscale x 16 x bfloat> @insertelt_nxv16bf16_idx(<vscale x 16 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv16bf16_idx:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a1, zero, e16, m1, ta, ma
-; CHECK-NEXT: vfmv.s.f v12, fa0
+; CHECK-NEXT: fmv.x.h a1, fa0
+; CHECK-NEXT: vsetvli a2, zero, e16, m1, ta, ma
+; CHECK-NEXT: vmv.s.x v12, a1
; CHECK-NEXT: addi a1, a0, 1
; CHECK-NEXT: vsetvli zero, a1, e16, m4, tu, ma
; CHECK-NEXT: vslideup.vx v8, v12, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 16 x half> %v, half %elt, i32 %idx
- ret <vscale x 16 x half> %r
+ %r = insertelement <vscale x 16 x bfloat> %v, bfloat %elt, i32 %idx
+ ret <vscale x 16 x bfloat> %r
}
-define <vscale x 32 x half> @insertelt_nxv32f16_0(<vscale x 32 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv32f16_0:
+define <vscale x 32 x bfloat> @insertelt_nxv32bf16_0(<vscale x 32 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv32bf16_0:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v8, fa0
+; CHECK-NEXT: fmv.x.h a0, fa0
+; CHECK-NEXT: vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT: vmv.s.x v8, a0
; CHECK-NEXT: ret
- %r = insertelement <vscale x 32 x half> %v, half %elt, i32 0
- ret <vscale x 32 x half> %r
+ %r = insertelement <vscale x 32 x bfloat> %v, bfloat %elt, i32 0
+ ret <vscale x 32 x bfloat> %r
}
-define <vscale x 32 x half> @insertelt_nxv32f16_imm(<vscale x 32 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv32f16_imm:
+define <vscale x 32 x bfloat> @insertelt_nxv32bf16_imm(<vscale x 32 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv32bf16_imm:
; CHECK: # %bb.0:
+; CHECK-NEXT: fmv.x.h a0, fa0
; CHECK-NEXT: vsetivli zero, 4, e16, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v16, fa0
+; CHECK-NEXT: vmv.s.x v16, a0
; CHECK-NEXT: vslideup.vi v8, v16, 3
; CHECK-NEXT: ret
- %r = insertelement <vscale x 32 x half> %v, half %elt, i32 3
- ret <vscale x 32 x half> %r
+ %r = insertelement <vscale x 32 x bfloat> %v, bfloat %elt, i32 3
+ ret <vscale x 32 x bfloat> %r
}
-define <vscale x 32 x half> @insertelt_nxv32f16_idx(<vscale x 32 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv32f16_idx:
+define <vscale x 32 x bfloat> @insertelt_nxv32bf16_idx(<vscale x 32 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv32bf16_idx:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetvli a1, zero, e16, m1, ta, ma
-; CHECK-NEXT: vfmv.s.f v16, fa0
+; CHECK-NEXT: fmv.x.h a1, fa0
+; CHECK-NEXT: vsetvli a2, zero, e16, m1, ta, ma
+; CHECK-NEXT: vmv.s.x v16, a1
; CHECK-NEXT: addi a1, a0, 1
; CHECK-NEXT: vsetvli zero, a1, e16, m8, tu, ma
; CHECK-NEXT: vslideup.vx v8, v16, a0
; CHECK-NEXT: ret
+ %r = insertelement <vscale x 32 x bfloat> %v, bfloat %elt, i32 %idx
+ ret <vscale x 32 x bfloat> %r
+}
+
+define <vscale x 1 x half> @insertelt_nxv1f16_0(<vscale x 1 x half> %v, half %elt) {
+; ZVFH-LABEL: insertelt_nxv1f16_0:
+; ZVFH: # %bb.0:
+; ZVFH-NEXT: vsetvli a0, zero, e16, m1, tu, ma
+; ZVFH-NEXT: vfmv.s.f v8, fa0
+; ZVFH-NEXT: ret
+;
+; ZVFHMIN-LABEL: insertelt_nxv1f16_0:
+; ZVFHMIN: # %bb.0:
+; ZVFHMIN-NEXT: fmv.x.h a0, fa0
+; ZVFHMIN-NEXT: vsetvli a1, zero, e16, m1, tu, ma
+; ZVFHMIN-NEXT: vmv.s.x v8, a0
+; ZVFHMIN-NEXT: ret
+ %r = insertelement <vscale x 1 x half> %v, half %elt, i32 0
+ ret <vscale x 1 x half> %r
+}
+
+define <vscale x 1 x half> @insertelt_nxv1f16_imm(<vscale x 1 x half> %v, half %elt) {
+; ZVFH-LABEL: insertelt_nxv1f16_imm:
+; ZVFH: # %bb.0:
+; ZVFH-NEXT: vsetivli zero, 4, e16, mf4, tu, ma
+; ZVFH-NEXT: vfmv.s.f v9, fa0
+; ZVFH-NEXT: vslideup.vi v8, v9, 3
+; ZVFH-NEXT: ret
+;
+; ZVFHMIN-LABEL: insertelt_nxv1f16_imm:
+; ZVFHMIN: # %bb.0:
+; ZVFHMIN-NEXT: fmv.x.h a0, fa0
+; ZVFHMIN-NEXT: vsetivli zero, 4, e16, mf4, tu, ma
+; ZVFHMIN-NEXT: vmv.s.x v9, a0
+; ZVFHMIN-NEXT: vslideup.vi v8, v9, 3
+; ZVFHMIN-NEXT: ret
+ %r = insertelement <vscale x 1 x half> %v, half %elt, i32 3
+ ret <vscale x 1 x half> %r
+}
+
+define <vscale x 1 x half> @insertelt_nxv1f16_idx(<vscale x 1 x half> %v, half %elt, i32 zeroext %idx) {
+; ZVFH-LABEL: insertelt_nxv1f16_idx:
+; ZVFH: # %bb.0:
+; ZVFH-NEXT: addi a1, a0, 1
+; ZVFH-NEXT: vsetvli a2, zero, e16, m1, ta, ma
+; ZVFH-NEXT: vfmv.s.f v9, fa0
+; ZVFH-NEXT: vsetvli zero, a1, e16, mf4, tu, ma
+; ZVFH-NEXT: vslideup.vx v8, v9, a0
+; ZVFH-NEXT: ret
+;
+; ZVFHMIN-LABEL: insertelt_nxv1f16_idx:
+; ZVFHMIN: # %bb.0:
+; ZVFHMIN-NEXT: addi a1, a0, 1
+; ZVFHMIN-NEXT: fmv.x.h a2, fa0
+; ZVFHMIN-NEXT: vsetvli a3, zero, e16, m1, ta, ma
+; ZVFHMIN-NEXT: vmv.s.x v9, a2
+; ZVFHMIN-NEXT: vsetvli zero, a1, e16, mf4, tu, ma
+; ZVFHMIN-NEXT: vslideup.vx v8, v9, a0
+; ZVFHMIN-NEXT: ret
+ %r = insertelement <vscale x 1 x half> %v, half %elt, i32 %idx
+ ret <vscale x 1 x half> %r
+}
+
+define <vscale x 2 x half> @insertelt_nxv2f16_0(<vscale x 2 x half> %v, half %elt) {
+; ZVFH-LABEL: insertelt_nxv2f16_0:
+; ZVFH: # %bb.0:
+; ZVFH-NEXT: vsetvli a0, zero, e16, m1, tu, ma
+; ZVFH-NEXT: vfmv.s.f v8, fa0
+; ZVFH-NEXT: ret
+;
+; ZVFHMIN-LABEL: insertelt_nxv2f16_0:
+; ZVFHMIN: # %bb.0:
+; ZVFHMIN-NEXT: fmv.x.h a0, fa0
+; ZVFHMIN-NEXT: vsetvli a1, zero, e16, m1, tu, ma
+; ZVFHMIN-NEXT: vmv.s.x v8, a0
+; ZVFHMIN-NEXT: ret
+ %r = insertelement <vscale x 2 x half> %v, half %elt, i32 0
+ ret <vscale x 2 x half> %r
+}
+
+define <vscale x 2 x half> @insertelt_nxv2f16_imm(<vscale x 2 x half> %v, half %elt) {
+; ZVFH-LABEL: ...
[truncated]
      ValVT == MVT::bf16) {
    // If we don't have vfmv.s.f for f16/bf16, insert into fmv.x.h first
    MVT IntVT = VecVT.changeTypeToInteger();
    // SDValue IntVal = DAG.getBitcast(IntVT.getVectorElementType(), Val);
Commented out code?
Woops, removed
LGTM w/minor comment.
define <vscale x 1 x half> @insertelt_nxv1f16_0(<vscale x 1 x half> %v, half %elt) {
; CHECK-LABEL: insertelt_nxv1f16_0:
define <vscale x 1 x bfloat> @insertelt_nxv1bf16_0(<vscale x 1 x bfloat> %v, bfloat %elt) {
These test diffs are confusing, can you put the bfloat tests after the half tests just to reduce the delta?
I've moved it but unfortunately the test diffs are somewhat confusing. I think the diff is getting confused because bfloat and half share most of the same check lines.
This is the dual of llvm#110144, but doesn't handle the case when the scalar type is illegal i.e. no zfhmin/zfbfmin. It looks like softening isn't yet implemented for insert_vector_elt operands and it will crash during type legalization, so I've left that configuration out of the tests.
Force-pushed from cdeed50 to b4716f5.
LLVM Buildbot has detected a new failure on one of its builders. Full details are available at: https://lab.llvm.org/buildbot/#/builders/18/builds/4793
This is the dual of llvm#110144, but doesn't handle the case when the scalar type is illegal i.e. no zfhmin/zfbfmin. It looks like softening isn't yet implemented for insert_vector_elt operands and it will crash during type legalization, so I've left that configuration out of the tests.
…bfmin RISCVTargetLowering::lower{INSERT,EXTRACT}_VECTOR_ELT already handles f16 and bf16 scalable vectors after llvm#110221, so we can reuse it for fixed-length vectors.
…bfmin (llvm#114927) RISCVTargetLowering::lower{INSERT,EXTRACT}_VECTOR_ELT already handles f16 and bf16 scalable vectors after llvm#110221, so we can reuse it for fixed-length vectors.
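For context, a fixed-length analogue of the scalable tests above (the kind of case the #114927 follow-up reuses this lowering for) would look something like the hypothetical IR below; the function name and vector width are illustrative, not taken from that PR's tests.

; Hypothetical fixed-length example; per the llvm#114927 commit message these
; now go through the same lower{INSERT,EXTRACT}_VECTOR_ELT path as the
; scalable-vector cases in this PR.
define <4 x bfloat> @insertelt_v4bf16_0(<4 x bfloat> %v, bfloat %elt) {
  %r = insertelement <4 x bfloat> %v, bfloat %elt, i32 0
  ret <4 x bfloat> %r
}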