[RISCV] Handle f16/bf16 extract_vector_elt when scalar type is legal #110144

Merged

lukel97 commented Sep 26, 2024

When the scalar type is illegal, an f16/bf16 extract_vector_elt gets softened during type legalization and is lowered as an integer.

However, with zfhmin/zfbfmin the scalar type is legal, so the node passes through type legalization untouched and then crashes because we didn't have any custom lowering or patterns for it.

This handles that case via the existing custom lowering to a vslidedown and vfmv.f.s.
It also handles the case where we only have zvfhmin/zvfbfmin and don't have vfmv.f.s, in which case we need to extract the element to a GPR and then use fmv.h.x.
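
For anyone skimming the cases, here is a rough standalone model of which path an f16/bf16 extract ends up on, as I read the description above. It is only a sketch: `Features`, `pickLowering` and the enum names are made up for illustration and are not LLVM APIs. It also assumes that zvfh always comes with zfhmin.

```cpp
#include <cassert>

// Hypothetical model of the cases in this PR; none of these names exist in LLVM.
enum class Elt { F16, BF16 };

enum class Lowering {
  SoftenedInteger,  // scalar type illegal: softened in type legalization
  SlidedownVfmvFS,  // vslidedown + vfmv.f.s
  IntExtractFmvHX   // integer extract to a GPR, then fmv.h.x
};

struct Features {
  bool zfhmin = false;   // scalar f16 is legal
  bool zfbfmin = false;  // scalar bf16 is legal
  bool zvfh = false;     // vfmv.f.s available for f16 (assumed to imply zfhmin)
};

// Is the scalar element type legal for this feature set?
inline bool scalarLegal(Elt e, const Features &f) {
  return e == Elt::F16 ? (f.zfhmin || f.zvfh) : f.zfbfmin;
}

inline Lowering pickLowering(Elt e, const Features &f) {
  if (!scalarLegal(e, f))
    return Lowering::SoftenedInteger;
  // Mirrors the new check: f16 without zvfh, or any bf16, has no vfmv.f.s.
  if (e == Elt::BF16 || !f.zvfh)
    return Lowering::IntExtractFmvHX;
  return Lowering::SlidedownVfmvFS;
}
```

With only `zfhmin` set, `pickLowering(Elt::F16, f)` returns `IntExtractFmvHX`; adding `zvfh` moves f16 to `SlidedownVfmvFS`, while bf16 stays on the integer path because it has no vfmv.f.s.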

Fixes #110126

llvmbot commented Sep 26, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)


Patch is 36.34 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110144.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+13-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/extractelt-fp.ll (+810-112)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 7a19a879ca3420..d52b802bdd52be 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -1082,8 +1082,9 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
                          VT, Custom);
       MVT EltVT = VT.getVectorElementType();
       if (isTypeLegal(EltVT))
-        setOperationAction({ISD::SPLAT_VECTOR, ISD::EXPERIMENTAL_VP_SPLAT}, VT,
-                           Custom);
+        setOperationAction({ISD::SPLAT_VECTOR, ISD::EXPERIMENTAL_VP_SPLAT,
+                            ISD::EXTRACT_VECTOR_ELT},
+                           VT, Custom);
       else
         setOperationAction({ISD::SPLAT_VECTOR, ISD::EXPERIMENTAL_VP_SPLAT},
                            EltVT, Custom);
@@ -8990,6 +8991,16 @@ SDValue RISCVTargetLowering::lowerEXTRACT_VECTOR_ELT(SDValue Op,
     return DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL, EltVT, Vec, Idx);
   }
 
+  if ((EltVT == MVT::f16 && !Subtarget.hasVInstructionsF16()) ||
+      EltVT == MVT::bf16) {
+    // If we don't have vfmv.f.s for f16/bf16, extract to a gpr then use fmv.h.x
+    MVT IntVT = VecVT.changeTypeToInteger();
+    SDValue IntVec = DAG.getBitcast(IntVT, Vec);
+    SDValue IntExtract =
+        DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL, XLenVT, IntVec, Idx);
+    return DAG.getNode(RISCVISD::FMV_H_X, DL, EltVT, IntExtract);
+  }
+
   // If this is a fixed vector, we need to convert it to a scalable vector.
   MVT ContainerVT = VecVT;
   if (VecVT.isFixedLengthVector()) {
diff --git a/llvm/test/CodeGen/RISCV/rvv/extractelt-fp.ll b/llvm/test/CodeGen/RISCV/rvv/extractelt-fp.ll
index 209a37bf66ae34..86ef78be97afb0 100644
--- a/llvm/test/CodeGen/RISCV/rvv/extractelt-fp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/extractelt-fp.ll
@@ -1,197 +1,895 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=riscv32 -mattr=+d,+zfh,+zvfh,+v -target-abi=ilp32d \
-; RUN:     -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32
-; RUN: llc -mtriple=riscv64 -mattr=+d,+zfh,+zvfh,+v -target-abi=lp64d \
-; RUN:     -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64
+; RUN: llc -mtriple=riscv32 -mattr=+v,+d,+zvfh,+zvfbfmin -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,NOZFMIN,ZVFH
+; RUN: llc -mtriple=riscv64 -mattr=+v,+d,+zvfh,+zvfbfmin -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,NOZFMIN,ZVFH
+; RUN: llc -mtriple=riscv32 -mattr=+v,+d,+zvfhmin,+zvfbfmin -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,NOZFMIN,ZVFHMIN
+; RUN: llc -mtriple=riscv64 -mattr=+v,+d,+zvfhmin,+zvfbfmin -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,NOZFMIN,ZVFHMIN
+; RUN: llc -mtriple=riscv32 -mattr=+v,+d,+zfhmin,+zfbfmin,+zvfhmin,+zvfbfmin -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV32,ZFMIN
+; RUN: llc -mtriple=riscv64 -mattr=+v,+d,+zfhmin,+zfbfmin,+zvfhmin,+zvfbfmin -verify-machineinstrs < %s | FileCheck %s --check-prefixes=CHECK,RV64,ZFMIN
+
+define bfloat @extractelt_nxv1bf16_0(<vscale x 1 x bfloat> %v) {
+; NOZFMIN-LABEL: extractelt_nxv1bf16_0:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv1bf16_0:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 1 x bfloat> %v, i32 0
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv1bf16_imm(<vscale x 1 x bfloat> %v) {
+; NOZFMIN-LABEL: extractelt_nxv1bf16_imm:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, mf4, ta, ma
+; NOZFMIN-NEXT:    vslidedown.vi v8, v8, 2
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv1bf16_imm:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, mf4, ta, ma
+; ZFMIN-NEXT:    vslidedown.vi v8, v8, 2
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 1 x bfloat> %v, i32 2
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv1bf16_idx(<vscale x 1 x bfloat> %v, i32 zeroext %idx) {
+; NOZFMIN-LABEL: extractelt_nxv1bf16_idx:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, mf4, ta, ma
+; NOZFMIN-NEXT:    vslidedown.vx v8, v8, a0
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv1bf16_idx:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, mf4, ta, ma
+; ZFMIN-NEXT:    vslidedown.vx v8, v8, a0
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 1 x bfloat> %v, i32 %idx
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv2bf16_0(<vscale x 2 x bfloat> %v) {
+; NOZFMIN-LABEL: extractelt_nxv2bf16_0:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv2bf16_0:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 2 x bfloat> %v, i32 0
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv2bf16_imm(<vscale x 2 x bfloat> %v) {
+; NOZFMIN-LABEL: extractelt_nxv2bf16_imm:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, mf2, ta, ma
+; NOZFMIN-NEXT:    vslidedown.vi v8, v8, 2
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv2bf16_imm:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, mf2, ta, ma
+; ZFMIN-NEXT:    vslidedown.vi v8, v8, 2
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 2 x bfloat> %v, i32 2
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv2bf16_idx(<vscale x 2 x bfloat> %v, i32 zeroext %idx) {
+; NOZFMIN-LABEL: extractelt_nxv2bf16_idx:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, mf2, ta, ma
+; NOZFMIN-NEXT:    vslidedown.vx v8, v8, a0
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv2bf16_idx:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, mf2, ta, ma
+; ZFMIN-NEXT:    vslidedown.vx v8, v8, a0
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 2 x bfloat> %v, i32 %idx
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv4bf16_0(<vscale x 4 x bfloat> %v) {
+; NOZFMIN-LABEL: extractelt_nxv4bf16_0:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv4bf16_0:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 4 x bfloat> %v, i32 0
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv4bf16_imm(<vscale x 4 x bfloat> %v) {
+; NOZFMIN-LABEL: extractelt_nxv4bf16_imm:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; NOZFMIN-NEXT:    vslidedown.vi v8, v8, 2
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv4bf16_imm:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZFMIN-NEXT:    vslidedown.vi v8, v8, 2
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 4 x bfloat> %v, i32 2
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv4bf16_idx(<vscale x 4 x bfloat> %v, i32 zeroext %idx) {
+; NOZFMIN-LABEL: extractelt_nxv4bf16_idx:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; NOZFMIN-NEXT:    vslidedown.vx v8, v8, a0
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv4bf16_idx:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZFMIN-NEXT:    vslidedown.vx v8, v8, a0
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 4 x bfloat> %v, i32 %idx
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv8bf16_0(<vscale x 8 x bfloat> %v) {
+; NOZFMIN-LABEL: extractelt_nxv8bf16_0:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv8bf16_0:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 8 x bfloat> %v, i32 0
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv8bf16_imm(<vscale x 8 x bfloat> %v) {
+; NOZFMIN-LABEL: extractelt_nxv8bf16_imm:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; NOZFMIN-NEXT:    vslidedown.vi v8, v8, 2
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv8bf16_imm:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZFMIN-NEXT:    vslidedown.vi v8, v8, 2
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 8 x bfloat> %v, i32 2
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv8bf16_idx(<vscale x 8 x bfloat> %v, i32 zeroext %idx) {
+; NOZFMIN-LABEL: extractelt_nxv8bf16_idx:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, m2, ta, ma
+; NOZFMIN-NEXT:    vslidedown.vx v8, v8, a0
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv8bf16_idx:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, m2, ta, ma
+; ZFMIN-NEXT:    vslidedown.vx v8, v8, a0
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 8 x bfloat> %v, i32 %idx
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv16bf16_0(<vscale x 16 x bfloat> %v) {
+; NOZFMIN-LABEL: extractelt_nxv16bf16_0:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv16bf16_0:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 16 x bfloat> %v, i32 0
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv16bf16_imm(<vscale x 16 x bfloat> %v) {
+; NOZFMIN-LABEL: extractelt_nxv16bf16_imm:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; NOZFMIN-NEXT:    vslidedown.vi v8, v8, 2
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv16bf16_imm:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZFMIN-NEXT:    vslidedown.vi v8, v8, 2
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 16 x bfloat> %v, i32 2
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv16bf16_idx(<vscale x 16 x bfloat> %v, i32 zeroext %idx) {
+; NOZFMIN-LABEL: extractelt_nxv16bf16_idx:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, m4, ta, ma
+; NOZFMIN-NEXT:    vslidedown.vx v8, v8, a0
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv16bf16_idx:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, m4, ta, ma
+; ZFMIN-NEXT:    vslidedown.vx v8, v8, a0
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 16 x bfloat> %v, i32 %idx
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv32bf16_0(<vscale x 32 x bfloat> %v) {
+; NOZFMIN-LABEL: extractelt_nxv32bf16_0:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv32bf16_0:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 32 x bfloat> %v, i32 0
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv32bf16_imm(<vscale x 32 x bfloat> %v) {
+; NOZFMIN-LABEL: extractelt_nxv32bf16_imm:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; NOZFMIN-NEXT:    vslidedown.vi v8, v8, 2
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv32bf16_imm:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZFMIN-NEXT:    vslidedown.vi v8, v8, 2
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 32 x bfloat> %v, i32 2
+  ret bfloat %r
+}
+
+define bfloat @extractelt_nxv32bf16_idx(<vscale x 32 x bfloat> %v, i32 zeroext %idx) {
+; NOZFMIN-LABEL: extractelt_nxv32bf16_idx:
+; NOZFMIN:       # %bb.0:
+; NOZFMIN-NEXT:    vsetivli zero, 1, e16, m8, ta, ma
+; NOZFMIN-NEXT:    vslidedown.vx v8, v8, a0
+; NOZFMIN-NEXT:    vmv.x.s a0, v8
+; NOZFMIN-NEXT:    lui a1, 1048560
+; NOZFMIN-NEXT:    or a0, a0, a1
+; NOZFMIN-NEXT:    fmv.w.x fa0, a0
+; NOZFMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv32bf16_idx:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, m8, ta, ma
+; ZFMIN-NEXT:    vslidedown.vx v8, v8, a0
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
+  %r = extractelement <vscale x 32 x bfloat> %v, i32 %idx
+  ret bfloat %r
+}
 
 define half @extractelt_nxv1f16_0(<vscale x 1 x half> %v) {
-; CHECK-LABEL: extractelt_nxv1f16_0:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
-; CHECK-NEXT:    vfmv.f.s fa0, v8
-; CHECK-NEXT:    ret
+; ZVFH-LABEL: extractelt_nxv1f16_0:
+; ZVFH:       # %bb.0:
+; ZVFH-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZVFH-NEXT:    vfmv.f.s fa0, v8
+; ZVFH-NEXT:    ret
+;
+; ZVFHMIN-LABEL: extractelt_nxv1f16_0:
+; ZVFHMIN:       # %bb.0:
+; ZVFHMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZVFHMIN-NEXT:    vmv.x.s a0, v8
+; ZVFHMIN-NEXT:    lui a1, 1048560
+; ZVFHMIN-NEXT:    or a0, a0, a1
+; ZVFHMIN-NEXT:    fmv.w.x fa0, a0
+; ZVFHMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv1f16_0:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
   %r = extractelement <vscale x 1 x half> %v, i32 0
   ret half %r
 }
 
 define half @extractelt_nxv1f16_imm(<vscale x 1 x half> %v) {
-; CHECK-LABEL: extractelt_nxv1f16_imm:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetivli zero, 1, e16, mf4, ta, ma
-; CHECK-NEXT:    vslidedown.vi v8, v8, 2
-; CHECK-NEXT:    vfmv.f.s fa0, v8
-; CHECK-NEXT:    ret
+; ZVFH-LABEL: extractelt_nxv1f16_imm:
+; ZVFH:       # %bb.0:
+; ZVFH-NEXT:    vsetivli zero, 1, e16, mf4, ta, ma
+; ZVFH-NEXT:    vslidedown.vi v8, v8, 2
+; ZVFH-NEXT:    vfmv.f.s fa0, v8
+; ZVFH-NEXT:    ret
+;
+; ZVFHMIN-LABEL: extractelt_nxv1f16_imm:
+; ZVFHMIN:       # %bb.0:
+; ZVFHMIN-NEXT:    vsetivli zero, 1, e16, mf4, ta, ma
+; ZVFHMIN-NEXT:    vslidedown.vi v8, v8, 2
+; ZVFHMIN-NEXT:    vmv.x.s a0, v8
+; ZVFHMIN-NEXT:    lui a1, 1048560
+; ZVFHMIN-NEXT:    or a0, a0, a1
+; ZVFHMIN-NEXT:    fmv.w.x fa0, a0
+; ZVFHMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv1f16_imm:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, mf4, ta, ma
+; ZFMIN-NEXT:    vslidedown.vi v8, v8, 2
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
   %r = extractelement <vscale x 1 x half> %v, i32 2
   ret half %r
 }
 
 define half @extractelt_nxv1f16_idx(<vscale x 1 x half> %v, i32 zeroext %idx) {
-; CHECK-LABEL: extractelt_nxv1f16_idx:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetivli zero, 1, e16, mf4, ta, ma
-; CHECK-NEXT:    vslidedown.vx v8, v8, a0
-; CHECK-NEXT:    vfmv.f.s fa0, v8
-; CHECK-NEXT:    ret
+; ZVFH-LABEL: extractelt_nxv1f16_idx:
+; ZVFH:       # %bb.0:
+; ZVFH-NEXT:    vsetivli zero, 1, e16, mf4, ta, ma
+; ZVFH-NEXT:    vslidedown.vx v8, v8, a0
+; ZVFH-NEXT:    vfmv.f.s fa0, v8
+; ZVFH-NEXT:    ret
+;
+; ZVFHMIN-LABEL: extractelt_nxv1f16_idx:
+; ZVFHMIN:       # %bb.0:
+; ZVFHMIN-NEXT:    vsetivli zero, 1, e16, mf4, ta, ma
+; ZVFHMIN-NEXT:    vslidedown.vx v8, v8, a0
+; ZVFHMIN-NEXT:    vmv.x.s a0, v8
+; ZVFHMIN-NEXT:    lui a1, 1048560
+; ZVFHMIN-NEXT:    or a0, a0, a1
+; ZVFHMIN-NEXT:    fmv.w.x fa0, a0
+; ZVFHMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv1f16_idx:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, mf4, ta, ma
+; ZFMIN-NEXT:    vslidedown.vx v8, v8, a0
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
   %r = extractelement <vscale x 1 x half> %v, i32 %idx
   ret half %r
 }
 
 define half @extractelt_nxv2f16_0(<vscale x 2 x half> %v) {
-; CHECK-LABEL: extractelt_nxv2f16_0:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
-; CHECK-NEXT:    vfmv.f.s fa0, v8
-; CHECK-NEXT:    ret
+; ZVFH-LABEL: extractelt_nxv2f16_0:
+; ZVFH:       # %bb.0:
+; ZVFH-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZVFH-NEXT:    vfmv.f.s fa0, v8
+; ZVFH-NEXT:    ret
+;
+; ZVFHMIN-LABEL: extractelt_nxv2f16_0:
+; ZVFHMIN:       # %bb.0:
+; ZVFHMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZVFHMIN-NEXT:    vmv.x.s a0, v8
+; ZVFHMIN-NEXT:    lui a1, 1048560
+; ZVFHMIN-NEXT:    or a0, a0, a1
+; ZVFHMIN-NEXT:    fmv.w.x fa0, a0
+; ZVFHMIN-NEXT:    ret
+;
+; ZFMIN-LABEL: extractelt_nxv2f16_0:
+; ZFMIN:       # %bb.0:
+; ZFMIN-NEXT:    vsetivli zero, 1, e16, m1, ta, ma
+; ZFMIN-NEXT:    vmv.x.s a0, v8
+; ZFMIN-NEXT:    fmv.h.x fa0, a0
+; ZFMIN-NEXT:    ret
   %r = extractelement <vscale x 2 x half> %v, i32 0
   ret half %r
 }
 
 define half @extractelt_nxv2f16_imm(<vscale x 2 x half> %v) {
-; CHECK-LABEL: extractelt_nxv2f16_imm:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetivli zero, 1, e16, mf2, ta, ma
-; CHECK-NEXT:    vslidedown.vi v8, v8, 2
-; CHECK-NEXT:    vfmv.f.s fa0, v8
-; CHECK-NEXT:  ...
[truncated]
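
A note on the NOZFMIN and ZVFHMIN check lines above: the `lui a1, 1048560` / `or a0, a0, a1` / `fmv.w.x fa0, a0` tail looks like NaN-boxing of the 16-bit value into a 32-bit FP register, which is what the softened path has to do when the scalar half/bfloat type is not legal. The arithmetic can be checked with a small standalone sketch. The helper names below are invented for illustration, and the model only tracks the low 32 bits of the registers.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model only; these helpers are not part of LLVM.

// lui rd, imm20 writes imm20 into bits 31:12 and clears bits 11:0
// (only the low 32 bits of the register are modelled here).
inline uint32_t lui(uint32_t imm20) { return imm20 << 12; }

// Mirrors `lui a1, 1048560; or a0, a0, a1` applied to the low 32 bits of a0.
inline uint32_t nanBox16(uint32_t a0) { return a0 | lui(1048560); }

// A boxed 16-bit value has an all-ones upper half.
inline bool isNanBoxed16(uint32_t w) { return (w >> 16) == 0xFFFFu; }

// Recover the original 16-bit pattern from a boxed word.
inline uint16_t unbox16(uint32_t w) {
  return static_cast<uint16_t>(w & 0xFFFFu);
}

// Small self-check: 0x3C00 is 1.0 in f16, 0x3F80 is 1.0 in bf16.
inline void checkBoxingExample() {
  assert(lui(1048560) == 0xFFFF0000u);
  assert(nanBox16(0x3C00u) == 0xFFFF3C00u);
  assert(unbox16(nanBox16(0x3F80u)) == 0x3F80u);
}
```

Since 1048560 is 0xFFFF0, the `lui` materialises 0xFFFF0000, so the `or` sets the upper 16 bits regardless of what vmv.x.s left there. The ZFMIN lines skip this because fmv.h.x moves the 16-bit pattern into the FP register directly.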

preames left a comment

LGTM

topperc left a comment

LGTM

lukel97 merged commit 2b84ef0 into llvm:main Sep 27, 2024
10 checks passed
lukel97 added a commit to lukel97/llvm-project that referenced this pull request Sep 27, 2024
After llvm#110144, we can finish off llvm#110129 and fold f16 vfmv.f.s into a flh.
vfmv.f.s is only available for f16 with zvfh, which in turn requires zfhmin so we can use flh.

bf16 has no vfmv.f.s so the extract_vector_elt is lowered as an integer in llvm#110144, and gets the existing integer vmv.x.s fold.
lukel97 added a commit to lukel97/llvm-project that referenced this pull request Sep 27, 2024
This is the dual of llvm#110144, but it doesn't handle the case where the scalar type is illegal, i.e. no zfhmin/zfbfmin. It looks like softening isn't yet implemented for insert_vector_elt operands, so it will crash during type legalization, and I've left that configuration out of the tests.
lukel97 added a commit that referenced this pull request Oct 1, 2024
After #110144, we can finish off #110129 and fold f16 vfmv.f.s into a
flh.
vfmv.f.s is only available for f16 with zvfh, which in turn requires
zfhmin so we can use flh.

bf16 has no vfmv.f.s so the extract_vector_elt is lowered as an integer
in #110144, and gets the existing integer vmv.x.s fold.
lukel97 added a commit that referenced this pull request Oct 2, 2024
This is the dual of #110144, but it doesn't handle the case where the
scalar type is illegal, i.e. no zfhmin/zfbfmin. It looks like softening
isn't yet implemented for insert_vector_elt operands, so it will crash
during type legalization, and I've left that configuration out of the tests.
Successfully merging this pull request may close these issues.

[RISCV] Can't select f16 extract_vector_elt with zfhmin