Skip to content

[RISCV] Lower insert_vector_elt on zvfhmin/zvfbfmin #110221

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 2, 2024

Conversation

lukel97
Copy link
Contributor

@lukel97 lukel97 commented Sep 27, 2024

This is the dual of #110144, but doesn't handle the case when the scalar type is illegal i.e. no zfhmin/zfbfmin. It looks like softening isn't yet implemented for insert_vector_elt operands and it will crash during type legalization, so I've left that configuration out of the tests.

@llvmbot
Copy link
Member

llvmbot commented Sep 27, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

This is the dual of #110144, but doesn't handle the case when the scalar type is illegal i.e. no zfhmin/zfbfmin. It looks like softening isn't yet implemented for insert_vector_elt operands and it will crash during type legalization, so I've left that configuration out of the tests.


Patch is 29.53 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110221.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+16-5)
  • (modified) llvm/test/CodeGen/RISCV/rvv/insertelt-fp.ll (+480-104)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index bd796efd836c75..65b9012b4b3310 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -1076,9 +1076,9 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
       setOperationAction({ISD::SINT_TO_FP, ISD::UINT_TO_FP, ISD::VP_SINT_TO_FP,
                           ISD::VP_UINT_TO_FP},
                          VT, Custom);
-      setOperationAction({ISD::CONCAT_VECTORS, ISD::INSERT_SUBVECTOR,
-                          ISD::EXTRACT_SUBVECTOR, ISD::VECTOR_INTERLEAVE,
-                          ISD::VECTOR_DEINTERLEAVE},
+      setOperationAction({ISD::INSERT_VECTOR_ELT, ISD::CONCAT_VECTORS,
+                          ISD::INSERT_SUBVECTOR, ISD::EXTRACT_SUBVECTOR,
+                          ISD::VECTOR_INTERLEAVE, ISD::VECTOR_DEINTERLEAVE},
                          VT, Custom);
       MVT EltVT = VT.getVectorElementType();
       if (isTypeLegal(EltVT))
@@ -8756,8 +8756,10 @@ SDValue RISCVTargetLowering::lowerINSERT_VECTOR_ELT(SDValue Op,
                                                     SelectionDAG &DAG) const {
   SDLoc DL(Op);
   MVT VecVT = Op.getSimpleValueType();
+  MVT XLenVT = Subtarget.getXLenVT();
   SDValue Vec = Op.getOperand(0);
   SDValue Val = Op.getOperand(1);
+  MVT ValVT = Val.getSimpleValueType();
   SDValue Idx = Op.getOperand(2);
 
   if (VecVT.getVectorElementType() == MVT::i1) {
@@ -8769,6 +8771,17 @@ SDValue RISCVTargetLowering::lowerINSERT_VECTOR_ELT(SDValue Op,
     return DAG.getNode(ISD::TRUNCATE, DL, VecVT, Vec);
   }
 
+  if ((ValVT == MVT::f16 && !Subtarget.hasVInstructionsF16()) ||
+      ValVT == MVT::bf16) {
+    // If we don't have vfmv.s.f for f16/bf16, insert into fmv.x.h first
+    MVT IntVT = VecVT.changeTypeToInteger();
+    // SDValue IntVal = DAG.getBitcast(IntVT.getVectorElementType(), Val);
+    SDValue IntInsert = DAG.getNode(
+        ISD::INSERT_VECTOR_ELT, DL, IntVT, DAG.getBitcast(IntVT, Vec),
+        DAG.getNode(RISCVISD::FMV_X_ANYEXTH, DL, XLenVT, Val), Idx);
+    return DAG.getBitcast(VecVT, IntInsert);
+  }
+
   MVT ContainerVT = VecVT;
   // If the operand is a fixed-length vector, convert to a scalable one.
   if (VecVT.isFixedLengthVector()) {
@@ -8812,8 +8825,6 @@ SDValue RISCVTargetLowering::lowerINSERT_VECTOR_ELT(SDValue Op,
                         AlignedIdx);
   }
 
-  MVT XLenVT = Subtarget.getXLenVT();
-
   bool IsLegalInsert = Subtarget.is64Bit() || Val.getValueType() != MVT::i64;
   // Even i64-element vectors on RV32 can be lowered without scalar
   // legalization if the most-significant 32 bits of the value are not affected
diff --git a/llvm/test/CodeGen/RISCV/rvv/insertelt-fp.ll b/llvm/test/CodeGen/RISCV/rvv/insertelt-fp.ll
index 8cfa88e6f95697..607e0085c3f468 100644
--- a/llvm/test/CodeGen/RISCV/rvv/insertelt-fp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/insertelt-fp.ll
@@ -1,209 +1,585 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=riscv32 -mattr=+d,+zfh,+zvfh,+v -target-abi=ilp32d \
-; RUN:     -verify-machineinstrs < %s | FileCheck %s
-; RUN: llc -mtriple=riscv64 -mattr=+d,+zfh,+zvfh,+v -target-abi=lp64d \
-; RUN:     -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple=riscv32 -mattr=+d,+zfh,+zfbfmin,+zvfh,+zvfbfmin,+v -target-abi=ilp32d \
+; RUN:     -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,ZVFH
+; RUN: llc -mtriple=riscv64 -mattr=+d,+zfh,+zfbfmin,+zvfh,+zvfbfmin,+v -target-abi=lp64d \
+; RUN:     -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,ZVFH
+; RUN: llc -mtriple=riscv32 -mattr=+d,+zfh,+zfbfmin,+zvfhmin,+zvfbfmin,+v -target-abi=ilp32d \
+; RUN:     -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,ZVFHMIN
+; RUN: llc -mtriple=riscv64 -mattr=+d,+zfh,+zfbfmin,+zvfhmin,+zvfbfmin,+v -target-abi=lp64d \
+; RUN:     -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,ZVFHMIN
 
-define <vscale x 1 x half> @insertelt_nxv1f16_0(<vscale x 1 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv1f16_0:
+define <vscale x 1 x bfloat> @insertelt_nxv1bf16_0(<vscale x 1 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv1bf16_0:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT:    vfmv.s.f v8, fa0
+; CHECK-NEXT:    fmv.x.h a0, fa0
+; CHECK-NEXT:    vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT:    vmv.s.x v8, a0
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 1 x half> %v, half %elt, i32 0
-  ret <vscale x 1 x half> %r
+  %r = insertelement <vscale x 1 x bfloat> %v, bfloat %elt, i32 0
+  ret <vscale x 1 x bfloat> %r
 }
 
-define <vscale x 1 x half> @insertelt_nxv1f16_imm(<vscale x 1 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv1f16_imm:
+define <vscale x 1 x bfloat> @insertelt_nxv1bf16_imm(<vscale x 1 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv1bf16_imm:
 ; CHECK:       # %bb.0:
+; CHECK-NEXT:    fmv.x.h a0, fa0
 ; CHECK-NEXT:    vsetivli zero, 4, e16, mf4, tu, ma
-; CHECK-NEXT:    vfmv.s.f v9, fa0
+; CHECK-NEXT:    vmv.s.x v9, a0
 ; CHECK-NEXT:    vslideup.vi v8, v9, 3
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 1 x half> %v, half %elt, i32 3
-  ret <vscale x 1 x half> %r
+  %r = insertelement <vscale x 1 x bfloat> %v, bfloat %elt, i32 3
+  ret <vscale x 1 x bfloat> %r
 }
 
-define <vscale x 1 x half> @insertelt_nxv1f16_idx(<vscale x 1 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv1f16_idx:
+define <vscale x 1 x bfloat> @insertelt_nxv1bf16_idx(<vscale x 1 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv1bf16_idx:
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    addi a1, a0, 1
-; CHECK-NEXT:    vsetvli a2, zero, e16, m1, ta, ma
-; CHECK-NEXT:    vfmv.s.f v9, fa0
+; CHECK-NEXT:    fmv.x.h a2, fa0
+; CHECK-NEXT:    vsetvli a3, zero, e16, m1, ta, ma
+; CHECK-NEXT:    vmv.s.x v9, a2
 ; CHECK-NEXT:    vsetvli zero, a1, e16, mf4, tu, ma
 ; CHECK-NEXT:    vslideup.vx v8, v9, a0
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 1 x half> %v, half %elt, i32 %idx
-  ret <vscale x 1 x half> %r
+  %r = insertelement <vscale x 1 x bfloat> %v, bfloat %elt, i32 %idx
+  ret <vscale x 1 x bfloat> %r
 }
 
-define <vscale x 2 x half> @insertelt_nxv2f16_0(<vscale x 2 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv2f16_0:
+define <vscale x 2 x bfloat> @insertelt_nxv2bf16_0(<vscale x 2 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv2bf16_0:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT:    vfmv.s.f v8, fa0
+; CHECK-NEXT:    fmv.x.h a0, fa0
+; CHECK-NEXT:    vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT:    vmv.s.x v8, a0
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 2 x half> %v, half %elt, i32 0
-  ret <vscale x 2 x half> %r
+  %r = insertelement <vscale x 2 x bfloat> %v, bfloat %elt, i32 0
+  ret <vscale x 2 x bfloat> %r
 }
 
-define <vscale x 2 x half> @insertelt_nxv2f16_imm(<vscale x 2 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv2f16_imm:
+define <vscale x 2 x bfloat> @insertelt_nxv2bf16_imm(<vscale x 2 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv2bf16_imm:
 ; CHECK:       # %bb.0:
+; CHECK-NEXT:    fmv.x.h a0, fa0
 ; CHECK-NEXT:    vsetivli zero, 4, e16, mf2, tu, ma
-; CHECK-NEXT:    vfmv.s.f v9, fa0
+; CHECK-NEXT:    vmv.s.x v9, a0
 ; CHECK-NEXT:    vslideup.vi v8, v9, 3
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 2 x half> %v, half %elt, i32 3
-  ret <vscale x 2 x half> %r
+  %r = insertelement <vscale x 2 x bfloat> %v, bfloat %elt, i32 3
+  ret <vscale x 2 x bfloat> %r
 }
 
-define <vscale x 2 x half> @insertelt_nxv2f16_idx(<vscale x 2 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv2f16_idx:
+define <vscale x 2 x bfloat> @insertelt_nxv2bf16_idx(<vscale x 2 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv2bf16_idx:
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    addi a1, a0, 1
-; CHECK-NEXT:    vsetvli a2, zero, e16, m1, ta, ma
-; CHECK-NEXT:    vfmv.s.f v9, fa0
+; CHECK-NEXT:    fmv.x.h a2, fa0
+; CHECK-NEXT:    vsetvli a3, zero, e16, m1, ta, ma
+; CHECK-NEXT:    vmv.s.x v9, a2
 ; CHECK-NEXT:    vsetvli zero, a1, e16, mf2, tu, ma
 ; CHECK-NEXT:    vslideup.vx v8, v9, a0
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 2 x half> %v, half %elt, i32 %idx
-  ret <vscale x 2 x half> %r
+  %r = insertelement <vscale x 2 x bfloat> %v, bfloat %elt, i32 %idx
+  ret <vscale x 2 x bfloat> %r
 }
 
-define <vscale x 4 x half> @insertelt_nxv4f16_0(<vscale x 4 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv4f16_0:
+define <vscale x 4 x bfloat> @insertelt_nxv4bf16_0(<vscale x 4 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv4bf16_0:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT:    vfmv.s.f v8, fa0
+; CHECK-NEXT:    fmv.x.h a0, fa0
+; CHECK-NEXT:    vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT:    vmv.s.x v8, a0
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 4 x half> %v, half %elt, i32 0
-  ret <vscale x 4 x half> %r
+  %r = insertelement <vscale x 4 x bfloat> %v, bfloat %elt, i32 0
+  ret <vscale x 4 x bfloat> %r
 }
 
-define <vscale x 4 x half> @insertelt_nxv4f16_imm(<vscale x 4 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv4f16_imm:
+define <vscale x 4 x bfloat> @insertelt_nxv4bf16_imm(<vscale x 4 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv4bf16_imm:
 ; CHECK:       # %bb.0:
+; CHECK-NEXT:    fmv.x.h a0, fa0
 ; CHECK-NEXT:    vsetivli zero, 4, e16, m1, tu, ma
-; CHECK-NEXT:    vfmv.s.f v9, fa0
+; CHECK-NEXT:    vmv.s.x v9, a0
 ; CHECK-NEXT:    vslideup.vi v8, v9, 3
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 4 x half> %v, half %elt, i32 3
-  ret <vscale x 4 x half> %r
+  %r = insertelement <vscale x 4 x bfloat> %v, bfloat %elt, i32 3
+  ret <vscale x 4 x bfloat> %r
 }
 
-define <vscale x 4 x half> @insertelt_nxv4f16_idx(<vscale x 4 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv4f16_idx:
+define <vscale x 4 x bfloat> @insertelt_nxv4bf16_idx(<vscale x 4 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv4bf16_idx:
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    addi a1, a0, 1
-; CHECK-NEXT:    vsetvli a2, zero, e16, m1, ta, ma
-; CHECK-NEXT:    vfmv.s.f v9, fa0
+; CHECK-NEXT:    fmv.x.h a2, fa0
+; CHECK-NEXT:    vsetvli a3, zero, e16, m1, ta, ma
+; CHECK-NEXT:    vmv.s.x v9, a2
 ; CHECK-NEXT:    vsetvli zero, a1, e16, m1, tu, ma
 ; CHECK-NEXT:    vslideup.vx v8, v9, a0
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 4 x half> %v, half %elt, i32 %idx
-  ret <vscale x 4 x half> %r
+  %r = insertelement <vscale x 4 x bfloat> %v, bfloat %elt, i32 %idx
+  ret <vscale x 4 x bfloat> %r
 }
 
-define <vscale x 8 x half> @insertelt_nxv8f16_0(<vscale x 8 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv8f16_0:
+define <vscale x 8 x bfloat> @insertelt_nxv8bf16_0(<vscale x 8 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv8bf16_0:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT:    vfmv.s.f v8, fa0
+; CHECK-NEXT:    fmv.x.h a0, fa0
+; CHECK-NEXT:    vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT:    vmv.s.x v8, a0
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 8 x half> %v, half %elt, i32 0
-  ret <vscale x 8 x half> %r
+  %r = insertelement <vscale x 8 x bfloat> %v, bfloat %elt, i32 0
+  ret <vscale x 8 x bfloat> %r
 }
 
-define <vscale x 8 x half> @insertelt_nxv8f16_imm(<vscale x 8 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv8f16_imm:
+define <vscale x 8 x bfloat> @insertelt_nxv8bf16_imm(<vscale x 8 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv8bf16_imm:
 ; CHECK:       # %bb.0:
+; CHECK-NEXT:    fmv.x.h a0, fa0
 ; CHECK-NEXT:    vsetivli zero, 4, e16, m1, tu, ma
-; CHECK-NEXT:    vfmv.s.f v10, fa0
+; CHECK-NEXT:    vmv.s.x v10, a0
 ; CHECK-NEXT:    vslideup.vi v8, v10, 3
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 8 x half> %v, half %elt, i32 3
-  ret <vscale x 8 x half> %r
+  %r = insertelement <vscale x 8 x bfloat> %v, bfloat %elt, i32 3
+  ret <vscale x 8 x bfloat> %r
 }
 
-define <vscale x 8 x half> @insertelt_nxv8f16_idx(<vscale x 8 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv8f16_idx:
+define <vscale x 8 x bfloat> @insertelt_nxv8bf16_idx(<vscale x 8 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv8bf16_idx:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetvli a1, zero, e16, m1, ta, ma
-; CHECK-NEXT:    vfmv.s.f v10, fa0
+; CHECK-NEXT:    fmv.x.h a1, fa0
+; CHECK-NEXT:    vsetvli a2, zero, e16, m1, ta, ma
+; CHECK-NEXT:    vmv.s.x v10, a1
 ; CHECK-NEXT:    addi a1, a0, 1
 ; CHECK-NEXT:    vsetvli zero, a1, e16, m2, tu, ma
 ; CHECK-NEXT:    vslideup.vx v8, v10, a0
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 8 x half> %v, half %elt, i32 %idx
-  ret <vscale x 8 x half> %r
+  %r = insertelement <vscale x 8 x bfloat> %v, bfloat %elt, i32 %idx
+  ret <vscale x 8 x bfloat> %r
 }
 
-define <vscale x 16 x half> @insertelt_nxv16f16_0(<vscale x 16 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv16f16_0:
+define <vscale x 16 x bfloat> @insertelt_nxv16bf16_0(<vscale x 16 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv16bf16_0:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT:    vfmv.s.f v8, fa0
+; CHECK-NEXT:    fmv.x.h a0, fa0
+; CHECK-NEXT:    vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT:    vmv.s.x v8, a0
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 16 x half> %v, half %elt, i32 0
-  ret <vscale x 16 x half> %r
+  %r = insertelement <vscale x 16 x bfloat> %v, bfloat %elt, i32 0
+  ret <vscale x 16 x bfloat> %r
 }
 
-define <vscale x 16 x half> @insertelt_nxv16f16_imm(<vscale x 16 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv16f16_imm:
+define <vscale x 16 x bfloat> @insertelt_nxv16bf16_imm(<vscale x 16 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv16bf16_imm:
 ; CHECK:       # %bb.0:
+; CHECK-NEXT:    fmv.x.h a0, fa0
 ; CHECK-NEXT:    vsetivli zero, 4, e16, m1, tu, ma
-; CHECK-NEXT:    vfmv.s.f v12, fa0
+; CHECK-NEXT:    vmv.s.x v12, a0
 ; CHECK-NEXT:    vslideup.vi v8, v12, 3
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 16 x half> %v, half %elt, i32 3
-  ret <vscale x 16 x half> %r
+  %r = insertelement <vscale x 16 x bfloat> %v, bfloat %elt, i32 3
+  ret <vscale x 16 x bfloat> %r
 }
 
-define <vscale x 16 x half> @insertelt_nxv16f16_idx(<vscale x 16 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv16f16_idx:
+define <vscale x 16 x bfloat> @insertelt_nxv16bf16_idx(<vscale x 16 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv16bf16_idx:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetvli a1, zero, e16, m1, ta, ma
-; CHECK-NEXT:    vfmv.s.f v12, fa0
+; CHECK-NEXT:    fmv.x.h a1, fa0
+; CHECK-NEXT:    vsetvli a2, zero, e16, m1, ta, ma
+; CHECK-NEXT:    vmv.s.x v12, a1
 ; CHECK-NEXT:    addi a1, a0, 1
 ; CHECK-NEXT:    vsetvli zero, a1, e16, m4, tu, ma
 ; CHECK-NEXT:    vslideup.vx v8, v12, a0
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 16 x half> %v, half %elt, i32 %idx
-  ret <vscale x 16 x half> %r
+  %r = insertelement <vscale x 16 x bfloat> %v, bfloat %elt, i32 %idx
+  ret <vscale x 16 x bfloat> %r
 }
 
-define <vscale x 32 x half> @insertelt_nxv32f16_0(<vscale x 32 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv32f16_0:
+define <vscale x 32 x bfloat> @insertelt_nxv32bf16_0(<vscale x 32 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv32bf16_0:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetvli a0, zero, e16, m1, tu, ma
-; CHECK-NEXT:    vfmv.s.f v8, fa0
+; CHECK-NEXT:    fmv.x.h a0, fa0
+; CHECK-NEXT:    vsetvli a1, zero, e16, m1, tu, ma
+; CHECK-NEXT:    vmv.s.x v8, a0
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 32 x half> %v, half %elt, i32 0
-  ret <vscale x 32 x half> %r
+  %r = insertelement <vscale x 32 x bfloat> %v, bfloat %elt, i32 0
+  ret <vscale x 32 x bfloat> %r
 }
 
-define <vscale x 32 x half> @insertelt_nxv32f16_imm(<vscale x 32 x half> %v, half %elt) {
-; CHECK-LABEL: insertelt_nxv32f16_imm:
+define <vscale x 32 x bfloat> @insertelt_nxv32bf16_imm(<vscale x 32 x bfloat> %v, bfloat %elt) {
+; CHECK-LABEL: insertelt_nxv32bf16_imm:
 ; CHECK:       # %bb.0:
+; CHECK-NEXT:    fmv.x.h a0, fa0
 ; CHECK-NEXT:    vsetivli zero, 4, e16, m1, tu, ma
-; CHECK-NEXT:    vfmv.s.f v16, fa0
+; CHECK-NEXT:    vmv.s.x v16, a0
 ; CHECK-NEXT:    vslideup.vi v8, v16, 3
 ; CHECK-NEXT:    ret
-  %r = insertelement <vscale x 32 x half> %v, half %elt, i32 3
-  ret <vscale x 32 x half> %r
+  %r = insertelement <vscale x 32 x bfloat> %v, bfloat %elt, i32 3
+  ret <vscale x 32 x bfloat> %r
 }
 
-define <vscale x 32 x half> @insertelt_nxv32f16_idx(<vscale x 32 x half> %v, half %elt, i32 zeroext %idx) {
-; CHECK-LABEL: insertelt_nxv32f16_idx:
+define <vscale x 32 x bfloat> @insertelt_nxv32bf16_idx(<vscale x 32 x bfloat> %v, bfloat %elt, i32 zeroext %idx) {
+; CHECK-LABEL: insertelt_nxv32bf16_idx:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetvli a1, zero, e16, m1, ta, ma
-; CHECK-NEXT:    vfmv.s.f v16, fa0
+; CHECK-NEXT:    fmv.x.h a1, fa0
+; CHECK-NEXT:    vsetvli a2, zero, e16, m1, ta, ma
+; CHECK-NEXT:    vmv.s.x v16, a1
 ; CHECK-NEXT:    addi a1, a0, 1
 ; CHECK-NEXT:    vsetvli zero, a1, e16, m8, tu, ma
 ; CHECK-NEXT:    vslideup.vx v8, v16, a0
 ; CHECK-NEXT:    ret
+  %r = insertelement <vscale x 32 x bfloat> %v, bfloat %elt, i32 %idx
+  ret <vscale x 32 x bfloat> %r
+}
+
+define <vscale x 1 x half> @insertelt_nxv1f16_0(<vscale x 1 x half> %v, half %elt) {
+; ZVFH-LABEL: insertelt_nxv1f16_0:
+; ZVFH:       # %bb.0:
+; ZVFH-NEXT:    vsetvli a0, zero, e16, m1, tu, ma
+; ZVFH-NEXT:    vfmv.s.f v8, fa0
+; ZVFH-NEXT:    ret
+;
+; ZVFHMIN-LABEL: insertelt_nxv1f16_0:
+; ZVFHMIN:       # %bb.0:
+; ZVFHMIN-NEXT:    fmv.x.h a0, fa0
+; ZVFHMIN-NEXT:    vsetvli a1, zero, e16, m1, tu, ma
+; ZVFHMIN-NEXT:    vmv.s.x v8, a0
+; ZVFHMIN-NEXT:    ret
+  %r = insertelement <vscale x 1 x half> %v, half %elt, i32 0
+  ret <vscale x 1 x half> %r
+}
+
+define <vscale x 1 x half> @insertelt_nxv1f16_imm(<vscale x 1 x half> %v, half %elt) {
+; ZVFH-LABEL: insertelt_nxv1f16_imm:
+; ZVFH:       # %bb.0:
+; ZVFH-NEXT:    vsetivli zero, 4, e16, mf4, tu, ma
+; ZVFH-NEXT:    vfmv.s.f v9, fa0
+; ZVFH-NEXT:    vslideup.vi v8, v9, 3
+; ZVFH-NEXT:    ret
+;
+; ZVFHMIN-LABEL: insertelt_nxv1f16_imm:
+; ZVFHMIN:       # %bb.0:
+; ZVFHMIN-NEXT:    fmv.x.h a0, fa0
+; ZVFHMIN-NEXT:    vsetivli zero, 4, e16, mf4, tu, ma
+; ZVFHMIN-NEXT:    vmv.s.x v9, a0
+; ZVFHMIN-NEXT:    vslideup.vi v8, v9, 3
+; ZVFHMIN-NEXT:    ret
+  %r = insertelement <vscale x 1 x half> %v, half %elt, i32 3
+  ret <vscale x 1 x half> %r
+}
+
+define <vscale x 1 x half> @insertelt_nxv1f16_idx(<vscale x 1 x half> %v, half %elt, i32 zeroext %idx) {
+; ZVFH-LABEL: insertelt_nxv1f16_idx:
+; ZVFH:       # %bb.0:
+; ZVFH-NEXT:    addi a1, a0, 1
+; ZVFH-NEXT:    vsetvli a2, zero, e16, m1, ta, ma
+; ZVFH-NEXT:    vfmv.s.f v9, fa0
+; ZVFH-NEXT:    vsetvli zero, a1, e16, mf4, tu, ma
+; ZVFH-NEXT:    vslideup.vx v8, v9, a0
+; ZVFH-NEXT:    ret
+;
+; ZVFHMIN-LABEL: insertelt_nxv1f16_idx:
+; ZVFHMIN:       # %bb.0:
+; ZVFHMIN-NEXT:    addi a1, a0, 1
+; ZVFHMIN-NEXT:    fmv.x.h a2, fa0
+; ZVFHMIN-NEXT:    vsetvli a3, zero, e16, m1, ta, ma
+; ZVFHMIN-NEXT:    vmv.s.x v9, a2
+; ZVFHMIN-NEXT:    vsetvli zero, a1, e16, mf4, tu, ma
+; ZVFHMIN-NEXT:    vslideup.vx v8, v9, a0
+; ZVFHMIN-NEXT:    ret
+  %r = insertelement <vscale x 1 x half> %v, half %elt, i32 %idx
+  ret <vscale x 1 x half> %r
+}
+
+define <vscale x 2 x half> @insertelt_nxv2f16_0(<vscale x 2 x half> %v, half %elt) {
+; ZVFH-LABEL: insertelt_nxv2f16_0:
+; ZVFH:       # %bb.0:
+; ZVFH-NEXT:    vsetvli a0, zero, e16, m1, tu, ma
+; ZVFH-NEXT:    vfmv.s.f v8, fa0
+; ZVFH-NEXT:    ret
+;
+; ZVFHMIN-LABEL: insertelt_nxv2f16_0:
+; ZVFHMIN:       # %bb.0:
+; ZVFHMIN-NEXT:    fmv.x.h a0, fa0
+; ZVFHMIN-NEXT:    vsetvli a1, zero, e16, m1, tu, ma
+; ZVFHMIN-NEXT:    vmv.s.x v8, a0
+; ZVFHMIN-NEXT:    ret
+  %r = insertelement <vscale x 2 x half> %v, half %elt, i32 0
+  ret <vscale x 2 x half> %r
+}
+
+define <vscale x 2 x half> @insertelt_nxv2f16_imm(<vscale x 2 x half> %v, half %elt) {
+; ZVFH-LABEL: ...
[truncated]

ValVT == MVT::bf16) {
// If we don't have vfmv.s.f for f16/bf16, insert into fmv.x.h first
MVT IntVT = VecVT.changeTypeToInteger();
// SDValue IntVal = DAG.getBitcast(IntVT.getVectorElementType(), Val);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented out code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woops, removed

Copy link
Collaborator

@preames preames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM w/minor comment.


define <vscale x 1 x half> @insertelt_nxv1f16_0(<vscale x 1 x half> %v, half %elt) {
; CHECK-LABEL: insertelt_nxv1f16_0:
define <vscale x 1 x bfloat> @insertelt_nxv1bf16_0(<vscale x 1 x bfloat> %v, bfloat %elt) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These test diffs are confusing, can you put the bfloat tests after the half tests just to reduce the delta?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved it but unfortunately the test diffs are somewhat confusing. I think the diff is getting confused because bfloat and half share most of the same check lines.

This is the dual of llvm#110144, but doesn't handle the case when the scalar type is illegal i.e. no zfhmin/zfbfmin. It looks like softening isn't yet implemented for insert_vector_elt operands and it will crash during type legalization, so I've left that configuration out of the tests.
@lukel97 lukel97 force-pushed the zvfhmin-zvfbfmin/insert_vector_elt branch from cdeed50 to b4716f5 Compare October 2, 2024 06:36
@lukel97 lukel97 merged commit 1fa4a74 into llvm:main Oct 2, 2024
5 of 8 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Oct 2, 2024

LLVM Buildbot has detected a new failure on builder lldb-arm-ubuntu running on linaro-lldb-arm-ubuntu while building llvm at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/18/builds/4793

Here is the relevant piece of the build log for the reference
Step 6 (test) failure: build (failure)
...
PASS: lldb-api :: lang/cpp/namespace_conflicts/TestNamespaceConflicts.py (804 of 2813)
PASS: lldb-api :: lang/cpp/namespace/TestNamespace.py (805 of 2813)
PASS: lldb-api :: lang/cpp/namespace_definitions/TestNamespaceDefinitions.py (806 of 2813)
PASS: lldb-api :: lang/cpp/nested-class-other-compilation-unit/TestNestedClassWithParentInAnotherCU.py (807 of 2813)
PASS: lldb-api :: lang/cpp/nested-template/TestNestedTemplate.py (808 of 2813)
PASS: lldb-api :: lang/cpp/non-type-template-param/TestCppNonTypeTemplateParam.py (809 of 2813)
PASS: lldb-api :: lang/cpp/no_unique_address/TestNoUniqueAddress.py (810 of 2813)
PASS: lldb-api :: lang/cpp/nsimport/TestCppNsImport.py (811 of 2813)
PASS: lldb-api :: lang/cpp/offsetof/TestOffsetofCpp.py (812 of 2813)
PASS: lldb-api :: lang/cpp/operator-overload/TestOperatorOverload.py (813 of 2813)
FAIL: lldb-api :: lang/c/shared_lib_stripped_symbols/TestSharedLibStrippedSymbols.py (814 of 2813)
******************** TEST 'lldb-api :: lang/c/shared_lib_stripped_symbols/TestSharedLibStrippedSymbols.py' FAILED ********************
Script:
--
/usr/bin/python3.10 /home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env ARCHIVER=/usr/local/bin/llvm-ar --env OBJCOPY=/usr/bin/llvm-objcopy --env LLVM_LIBS_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin --arch armv8l --build-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/lldb --compiler /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/clang --dsymutil /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/dsymutil --llvm-tools-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin --lldb-obj-root /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/tools/lldb --lldb-libs-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./lib /home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/lang/c/shared_lib_stripped_symbols -p TestSharedLibStrippedSymbols.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 20.0.0git (https://github.com/llvm/llvm-project.git revision 1fa4a74d53184df0ea3dcb7eb6faf66d6974e7bb)
  clang revision 1fa4a74d53184df0ea3dcb7eb6faf66d6974e7bb
  llvm revision 1fa4a74d53184df0ea3dcb7eb6faf66d6974e7bb
Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc']

--
Command Output (stderr):
--
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: test_expr_dsym (TestSharedLibStrippedSymbols.SharedLibStrippedTestCase) (test case does not fall in any category of interest for this run) 
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: test_expr_dwarf (TestSharedLibStrippedSymbols.SharedLibStrippedTestCase)
FAIL: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: test_expr_dwo (TestSharedLibStrippedSymbols.SharedLibStrippedTestCase)
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: test_frame_variable_dsym (TestSharedLibStrippedSymbols.SharedLibStrippedTestCase) (test case does not fall in any category of interest for this run) 
XFAIL: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: test_frame_variable_dwarf (TestSharedLibStrippedSymbols.SharedLibStrippedTestCase)
XFAIL: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: test_frame_variable_dwo (TestSharedLibStrippedSymbols.SharedLibStrippedTestCase)
======================================================================
FAIL: test_expr_dwo (TestSharedLibStrippedSymbols.SharedLibStrippedTestCase)
   Test that types work when defined in a shared library and forwa/d-declared in the main executable
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 1769, in test_method
    return attrvalue(self)
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/lang/c/shared_lib_stripped_symbols/TestSharedLibStrippedSymbols.py", line 24, in test_expr
    self.expect(
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2370, in expect
    self.runCmd(
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 1000, in runCmd
    self.assertTrue(self.res.Succeeded(), msg + output)
AssertionError: False is not true : Variable(s) displayed correctly
Error output:

Sterling-Augustine pushed a commit to Sterling-Augustine/llvm-project that referenced this pull request Oct 3, 2024
This is the dual of llvm#110144, but doesn't handle the case when the scalar
type is illegal i.e. no zfhmin/zfbfmin. It looks like softening isn't
yet implemented for insert_vector_elt operands and it will crash during
type legalization, so I've left that configuration out of the tests.
lukel97 added a commit to lukel97/llvm-project that referenced this pull request Nov 5, 2024
…bfmin

RISCVTargetLowering::lower{INSERT,EXTRACT}_VECTOR_ELT already handles f16 and bf16 scalable vectors after llvm#110221, so we can reuse it for fixed-length vectors.
lukel97 added a commit that referenced this pull request Nov 5, 2024
…bfmin (#114927)

RISCVTargetLowering::lower{INSERT,EXTRACT}_VECTOR_ELT already handles
f16 and bf16 scalable vectors after #110221, so we can reuse it for
fixed-length vectors.
PhilippRados pushed a commit to PhilippRados/llvm-project that referenced this pull request Nov 6, 2024
…bfmin (llvm#114927)

RISCVTargetLowering::lower{INSERT,EXTRACT}_VECTOR_ELT already handles
f16 and bf16 scalable vectors after llvm#110221, so we can reuse it for
fixed-length vectors.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants