-
Notifications
You must be signed in to change notification settings - Fork 15k
[VP][RISCV] Add vp.reduce.fmaximum/fminimum and its RISC-V codegen #91782
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
`vp.reduce.fmaximum/fminimum` are the VP version of `vector.reduce.fmaximum/fminimum`.
@llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-llvm-ir Author: Min-Yih Hsu (mshockwave) Changes
Patch is 42.66 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/91782.diff 13 Files Affected:
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 6f5a4644ffc2b..562841101b75a 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -22182,6 +22182,146 @@ Examples:
%also.r = call float @llvm.minnum.f32(float %reduction, float %start)
+.. _int_vp_reduce_fmaximum:
+
+'``llvm.vp.reduce.fmaximum.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+ declare float @llvm.vp.reduce.fmaximum.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, float <vector_length>)
+ declare double @llvm.vp.reduce.fmaximum.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
+
+Overview:
+"""""""""
+
+Predicated floating-point ``MAX`` reduction of a vector and a scalar starting
+value, returning the result as a scalar.
+
+
+Arguments:
+""""""""""
+
+The first operand is the start value of the reduction, which must be a scalar
+floating-point type equal to the result type. The second operand is the vector
+on which the reduction is performed and must be a vector of floating-point
+values whose element type is the result/start type. The third operand is the
+vector mask and is a vector of boolean values with the same number of elements
+as the vector operand. The fourth operand is the explicit vector length of the
+operation.
+
+Semantics:
+""""""""""
+
+The '``llvm.vp.reduce.fmaximum``' intrinsic performs the floating-point ``MAX``
+reduction (:ref:`llvm.vector.reduce.fmaximum <int_vector_reduce_fmaximum>`) of
+the vector operand ``val`` on each enabled lane, taking the maximum of that and
+the scalar ``start_value``. Disabled lanes are treated as containing the
+neutral value (i.e. having no effect on the reduction operation). If the vector
+length is zero, the result is the start value.
+
+The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
+flags are set or only the ``nnan`` is set, the neutral value is ``-Infinity``.
+If ``ninf`` is set, then the neutral value is the smallest floating-point value
+for the result type.
+
+This instruction has the same comparison semantics as the
+:ref:`llvm.vector.reduce.fmaximum <int_vector_reduce_fmaximum>` intrinsic (and
+thus the '``llvm.maximum.*``' intrinsic). That is, the result will always be a
+number unless any of the elements in the vector or the starting value is
+``NaN``. Namely, this intrinsic propagates ``NaN``. Also, -0.0 is considered
+less than +0.0.
+
+To ignore the start value, the neutral value can be used.
+
+Examples:
+"""""""""
+
+.. code-block:: llvm
+
+ %r = call float @llvm.vp.reduce.fmaximum.v4f32(float %float, <4 x float> %a, <4 x i1> %mask, i32 %evl)
+ ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
+ ; are treated as though %mask were false for those lanes.
+
+ %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -infinity, float -infinity, float -infinity, float -infinity>
+ %reduction = call float @llvm.vector.reduce.fmaximum.v4f32(<4 x float> %masked.a)
+ %also.r = call float @llvm.maximum.f32(float %reduction, float %start)
+
+
+.. _int_vp_reduce_fmin:
+
+'``llvm.vp.reduce.fminimum.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+ declare float @llvm.vp.reduce.fminimum.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, float <vector_length>)
+ declare double @llvm.vp.reduce.fminimum.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
+
+Overview:
+"""""""""
+
+Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
+value, returning the result as a scalar.
+
+
+Arguments:
+""""""""""
+
+The first operand is the start value of the reduction, which must be a scalar
+floating-point type equal to the result type. The second operand is the vector
+on which the reduction is performed and must be a vector of floating-point
+values whose element type is the result/start type. The third operand is the
+vector mask and is a vector of boolean values with the same number of elements
+as the vector operand. The fourth operand is the explicit vector length of the
+operation.
+
+Semantics:
+""""""""""
+
+The '``llvm.vp.reduce.fminimum``' intrinsic performs the floating-point ``MIN``
+reduction (:ref:`llvm.vector.reduce.fminimum <int_vector_reduce_fminimum>`) of
+the vector operand ``val`` on each enabled lane, taking the minimum of that and
+the scalar ``start_value``. Disabled lanes are treated as containing the neutral
+value (i.e. having no effect on the reduction operation). If the vector length
+is zero, the result is the start value.
+
+The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
+flags are set or only the ``nnan`` is set, the neutral value is ``+Infinity``.
+If ``ninf`` is set, then the neutral value is the largest floating-point value
+for the result type.
+
+This instruction has the same comparison semantics as the
+:ref:`llvm.vector.reduce.fminimum <int_vector_reduce_fminimum>` intrinsic (and
+thus the '``llvm.minimum.*``' intrinsic). That is, the result will always be a
+number unless any of the elements in the vector or the starting value is
+``NaN``. Namely, this intrinsic propagates ``NaN``. Also, -0.0 is considered
+less than +0.0.
+
+To ignore the start value, the neutral value can be used.
+
+Examples:
+"""""""""
+
+.. code-block:: llvm
+
+ %r = call float @llvm.vp.reduce.fminimum.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
+ ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
+ ; are treated as though %mask were false for those lanes.
+
+ %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float infinity, float infinity, float infinity, float infinity>
+ %reduction = call float @llvm.vector.reduce.fminimum.v4f32(<4 x float> %masked.a)
+ %also.r = call float @llvm.minimum.f32(float %reduction, float %start)
+
+
.. _int_get_active_lane_mask:
'``llvm.get.active.lane.mask.*``' Intrinsics
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
index 29143123193b9..42192d472ba6e 100644
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -2243,6 +2243,16 @@ let IntrProperties = [IntrNoMem, IntrNoSync, IntrWillReturn] in {
llvm_anyvector_ty,
LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
llvm_i32_ty]>;
+ def int_vp_reduce_fmaximum : DefaultAttrsIntrinsic<[LLVMVectorElementType<0>],
+ [ LLVMVectorElementType<0>,
+ llvm_anyvector_ty,
+ LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
+ llvm_i32_ty]>;
+ def int_vp_reduce_fminimum : DefaultAttrsIntrinsic<[LLVMVectorElementType<0>],
+ [ LLVMVectorElementType<0>,
+ llvm_anyvector_ty,
+ LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
+ llvm_i32_ty]>;
}
let IntrProperties = [IntrNoMem, IntrNoSync, IntrWillReturn, ImmArg<ArgIndex<1>>] in {
diff --git a/llvm/include/llvm/IR/VPIntrinsics.def b/llvm/include/llvm/IR/VPIntrinsics.def
index f1cc8bcae467b..20f5bb2b531d3 100644
--- a/llvm/include/llvm/IR/VPIntrinsics.def
+++ b/llvm/include/llvm/IR/VPIntrinsics.def
@@ -701,6 +701,14 @@ HELPER_REGISTER_REDUCTION_VP(vp_reduce_fmax, VP_REDUCE_FMAX,
HELPER_REGISTER_REDUCTION_VP(vp_reduce_fmin, VP_REDUCE_FMIN,
vector_reduce_fmin)
+// llvm.vp.reduce.fmaximum(start,x,mask,vlen)
+HELPER_REGISTER_REDUCTION_VP(vp_reduce_fmaximum, VP_REDUCE_FMAXIMUM,
+ vector_reduce_fmaximum)
+
+// llvm.vp.reduce.fminimum(start,x,mask,vlen)
+HELPER_REGISTER_REDUCTION_VP(vp_reduce_fminimum, VP_REDUCE_FMINIMUM,
+ vector_reduce_fminimum)
+
#undef HELPER_REGISTER_REDUCTION_VP
// Specialized helper macro for VP reductions as above but with two forms:
diff --git a/llvm/lib/CodeGen/ExpandVectorPredication.cpp b/llvm/lib/CodeGen/ExpandVectorPredication.cpp
index 8e623c85b737b..dc35f33a3a059 100644
--- a/llvm/lib/CodeGen/ExpandVectorPredication.cpp
+++ b/llvm/lib/CodeGen/ExpandVectorPredication.cpp
@@ -367,7 +367,8 @@ static Value *getNeutralReductionElement(const VPReductionIntrinsic &VPI,
Type *EltTy) {
bool Negative = false;
unsigned EltBits = EltTy->getScalarSizeInBits();
- switch (VPI.getIntrinsicID()) {
+ Intrinsic::ID VID = VPI.getIntrinsicID();
+ switch (VID) {
default:
llvm_unreachable("Expecting a VP reduction intrinsic");
case Intrinsic::vp_reduce_add:
@@ -387,12 +388,17 @@ static Value *getNeutralReductionElement(const VPReductionIntrinsic &VPI,
return ConstantInt::get(EltTy->getContext(),
APInt::getSignedMinValue(EltBits));
case Intrinsic::vp_reduce_fmax:
+ case Intrinsic::vp_reduce_fmaximum:
Negative = true;
[[fallthrough]];
- case Intrinsic::vp_reduce_fmin: {
+ case Intrinsic::vp_reduce_fmin:
+ case Intrinsic::vp_reduce_fminimum: {
+ bool PropagatesNaN = VID == Intrinsic::vp_reduce_fminimum ||
+ VID == Intrinsic::vp_reduce_fmaximum;
FastMathFlags Flags = VPI.getFastMathFlags();
const fltSemantics &Semantics = EltTy->getFltSemantics();
- return !Flags.noNaNs() ? ConstantFP::getQNaN(EltTy, Negative)
+ return (!Flags.noNaNs() && !PropagatesNaN)
+ ? ConstantFP::getQNaN(EltTy, Negative)
: !Flags.noInfs()
? ConstantFP::getInfinity(EltTy, Negative)
: ConstantFP::get(EltTy,
@@ -480,6 +486,18 @@ CachingVPExpander::expandPredicationInReduction(IRBuilder<> &Builder,
Reduction =
Builder.CreateBinaryIntrinsic(Intrinsic::minnum, Reduction, Start);
break;
+ case Intrinsic::vp_reduce_fmaximum:
+ Reduction = Builder.CreateFPMaximumReduce(RedOp);
+ transferDecorations(*Reduction, VPI);
+ Reduction =
+ Builder.CreateBinaryIntrinsic(Intrinsic::maximum, Reduction, Start);
+ break;
+ case Intrinsic::vp_reduce_fminimum:
+ Reduction = Builder.CreateFPMinimumReduce(RedOp);
+ transferDecorations(*Reduction, VPI);
+ Reduction =
+ Builder.CreateBinaryIntrinsic(Intrinsic::minimum, Reduction, Start);
+ break;
case Intrinsic::vp_reduce_fadd:
Reduction = Builder.CreateFAddReduce(Start, RedOp);
break;
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
index b3ae419b20fec..fd97a1283b65a 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
@@ -1222,6 +1222,8 @@ void SelectionDAGLegalize::LegalizeOp(SDNode *Node) {
case ISD::VP_REDUCE_UMIN:
case ISD::VP_REDUCE_FMAX:
case ISD::VP_REDUCE_FMIN:
+ case ISD::VP_REDUCE_FMAXIMUM:
+ case ISD::VP_REDUCE_FMINIMUM:
case ISD::VP_REDUCE_SEQ_FADD:
case ISD::VP_REDUCE_SEQ_FMUL:
Action = TLI.getOperationAction(
@@ -5015,6 +5017,8 @@ void SelectionDAGLegalize::PromoteNode(SDNode *Node) {
Node->getOpcode() == ISD::VP_REDUCE_FMUL ||
Node->getOpcode() == ISD::VP_REDUCE_FMAX ||
Node->getOpcode() == ISD::VP_REDUCE_FMIN ||
+ Node->getOpcode() == ISD::VP_REDUCE_FMAXIMUM ||
+ Node->getOpcode() == ISD::VP_REDUCE_FMINIMUM ||
Node->getOpcode() == ISD::VP_REDUCE_SEQ_FADD)
OVT = Node->getOperand(1).getSimpleValueType();
if (Node->getOpcode() == ISD::BR_CC ||
@@ -5687,6 +5691,8 @@ void SelectionDAGLegalize::PromoteNode(SDNode *Node) {
case ISD::VP_REDUCE_FMUL:
case ISD::VP_REDUCE_FMAX:
case ISD::VP_REDUCE_FMIN:
+ case ISD::VP_REDUCE_FMAXIMUM:
+ case ISD::VP_REDUCE_FMINIMUM:
case ISD::VP_REDUCE_SEQ_FADD:
Results.push_back(PromoteReduction(Node));
break;
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
index 43db9b8e6be9e..cd858003cf03b 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
@@ -3148,6 +3148,8 @@ bool DAGTypeLegalizer::SplitVectorOperand(SDNode *N, unsigned OpNo) {
case ISD::VP_REDUCE_UMIN:
case ISD::VP_REDUCE_FMAX:
case ISD::VP_REDUCE_FMIN:
+ case ISD::VP_REDUCE_FMAXIMUM:
+ case ISD::VP_REDUCE_FMINIMUM:
Res = SplitVecOp_VP_REDUCE(N, OpNo);
break;
case ISD::VP_CTTZ_ELTS:
@@ -6251,6 +6253,8 @@ bool DAGTypeLegalizer::WidenVectorOperand(SDNode *N, unsigned OpNo) {
case ISD::VP_REDUCE_UMIN:
case ISD::VP_REDUCE_FMAX:
case ISD::VP_REDUCE_FMIN:
+ case ISD::VP_REDUCE_FMAXIMUM:
+ case ISD::VP_REDUCE_FMINIMUM:
Res = WidenVecOp_VP_REDUCE(N);
break;
case ISD::VP_CTTZ_ELTS:
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 9c1f3c1e34318..0a258350c68a5 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -470,8 +470,10 @@ ISD::NodeType ISD::getVecReduceBaseOpcode(unsigned VecReduceOpcode) {
case ISD::VP_REDUCE_FMIN:
return ISD::FMINNUM;
case ISD::VECREDUCE_FMAXIMUM:
+ case ISD::VP_REDUCE_FMAXIMUM:
return ISD::FMAXIMUM;
case ISD::VECREDUCE_FMINIMUM:
+ case ISD::VP_REDUCE_FMINIMUM:
return ISD::FMINIMUM;
}
}
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index e0937989a6a41..ead14c6945025 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -713,7 +713,8 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
ISD::VP_FRINT, ISD::VP_FNEARBYINT, ISD::VP_IS_FPCLASS,
ISD::VP_FMINIMUM, ISD::VP_FMAXIMUM, ISD::VP_LRINT,
ISD::VP_LLRINT, ISD::EXPERIMENTAL_VP_REVERSE,
- ISD::EXPERIMENTAL_VP_SPLICE};
+ ISD::EXPERIMENTAL_VP_SPLICE, ISD::VP_REDUCE_FMINIMUM,
+ ISD::VP_REDUCE_FMAXIMUM};
static const unsigned IntegerVecReduceOps[] = {
ISD::VECREDUCE_ADD, ISD::VECREDUCE_AND, ISD::VECREDUCE_OR,
@@ -958,7 +959,7 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
ISD::VP_FFLOOR, ISD::VP_FROUND, ISD::VP_FROUNDEVEN,
ISD::VP_FCOPYSIGN, ISD::VP_FROUNDTOZERO, ISD::VP_FRINT,
ISD::VP_FNEARBYINT, ISD::VP_SETCC, ISD::VP_FMINIMUM,
- ISD::VP_FMAXIMUM};
+ ISD::VP_FMAXIMUM, ISD::VP_REDUCE_FMINIMUM, ISD::VP_REDUCE_FMAXIMUM};
// Sets common operation actions on RVV floating-point vector types.
const auto SetCommonVFPActions = [&](MVT VT) {
@@ -6661,6 +6662,8 @@ SDValue RISCVTargetLowering::LowerOperation(SDValue Op,
case ISD::VP_REDUCE_SEQ_FADD:
case ISD::VP_REDUCE_FMIN:
case ISD::VP_REDUCE_FMAX:
+ case ISD::VP_REDUCE_FMINIMUM:
+ case ISD::VP_REDUCE_FMAXIMUM:
if (Op.getOperand(1).getValueType() == MVT::nxv32f16 &&
(Subtarget.hasVInstructionsF16Minimal() &&
!Subtarget.hasVInstructionsF16()))
@@ -9526,8 +9529,10 @@ static unsigned getRVVReductionOp(unsigned ISDOpcode) {
case ISD::VP_REDUCE_SEQ_FADD:
return RISCVISD::VECREDUCE_SEQ_FADD_VL;
case ISD::VP_REDUCE_FMAX:
+ case ISD::VP_REDUCE_FMAXIMUM:
return RISCVISD::VECREDUCE_FMAX_VL;
case ISD::VP_REDUCE_FMIN:
+ case ISD::VP_REDUCE_FMINIMUM:
return RISCVISD::VECREDUCE_FMIN_VL;
}
@@ -9786,8 +9791,10 @@ SDValue RISCVTargetLowering::lowerFPVECREDUCE(SDValue Op,
SDValue RISCVTargetLowering::lowerVPREDUCE(SDValue Op,
SelectionDAG &DAG) const {
SDLoc DL(Op);
+ unsigned Opc = Op.getOpcode();
SDValue Vec = Op.getOperand(1);
EVT VecEVT = Vec.getValueType();
+ MVT XLenVT = Subtarget.getXLenVT();
// TODO: The type may need to be widened rather than split. Or widened before
// it can be split.
@@ -9795,7 +9802,7 @@ SDValue RISCVTargetLowering::lowerVPREDUCE(SDValue Op,
return SDValue();
MVT VecVT = VecEVT.getSimpleVT();
- unsigned RVVOpcode = getRVVReductionOp(Op.getOpcode());
+ unsigned RVVOpcode = getRVVReductionOp(Opc);
if (VecVT.isFixedLengthVector()) {
auto ContainerVT = getContainerForFixedLengthVector(VecVT);
@@ -9804,8 +9811,25 @@ SDValue RISCVTargetLowering::lowerVPREDUCE(SDValue Op,
SDValue VL = Op.getOperand(3);
SDValue Mask = Op.getOperand(2);
- return lowerReductionSeq(RVVOpcode, Op.getSimpleValueType(), Op.getOperand(0),
- Vec, Mask, VL, DL, DAG, Subtarget);
+ SDValue Res =
+ lowerReductionSeq(RVVOpcode, Op.getSimpleValueType(), Op.getOperand(0),
+ Vec, Mask, VL, DL, DAG, Subtarget);
+ if ((Opc != ISD::VP_REDUCE_FMINIMUM && Opc != ISD::VP_REDUCE_FMAXIMUM) ||
+ Op->getFlags().hasNoNaNs())
+ return Res;
+
+ // Propagate NaNs.
+ MVT PredVT = getMaskTypeFor(Vec.getSimpleValueType());
+ SDValue IsNaN = DAG.getNode(
+ RISCVISD::SETCC_VL, DL, PredVT,
+ {Vec, Vec, DAG.getCondCode(ISD::SETNE), DAG.getUNDEF(PredVT), Mask, VL});
+ SDValue VCPop = DAG.getNode(RISCVISD::VCPOP_VL, DL, XLenVT, IsNaN, Mask, VL);
+ SDValue NoNaNs = DAG.getSetCC(DL, XLenVT, VCPop,
+ DAG.getConstant(0, DL, XLenVT), ISD::SETEQ);
+ MVT ResVT = Res.getSimpleValueType();
+ return DAG.getSelect(
+ DL, ResVT, NoNaNs, Res,
+ DAG.getConstantFP(DAG.EVTToAPFloatSemantics(ResVT), DL, ResVT));
}
SDValue RISCVTargetLowering::lowerINSERT_SUBVECTOR(SDValue Op,
diff --git a/llvm/test/CodeGen/Generic/expand-vp.ll b/llvm/test/CodeGen/Generic/expand-vp.ll
index 40d183273b86d..4fee9a533b947 100644
--- a/llvm/test/CodeGen/Generic/expand-vp.ll
+++ b/llvm/test/CodeGen/Generic/expand-vp.ll
@@ -41,6 +41,8 @@ declare i32 @llvm.vp.reduce.umin.v4i32(i32, <4 x i32>, <4 x i1>, i32)
declare i32 @llvm.vp.reduce.umax.v4i32(i32, <4 x i32>, <4 x i1>, i32)
declare float @llvm.vp.reduce.fmin.v4f32(float, <4 x float>, <4 x i1>, i32)
declare float @llvm.vp.reduce.fmax.v4f32(float, <4 x float>, <4 x i1>, i32)
+declare float @llvm.vp.reduce.fminimum.v4f32(float, <4 x float>, <4 x i1>, i32)
+declare float @llvm.vp.reduce.fmaximum.v4f32(float, <4 x float>, <4 x i1>, i32)
declare float @llvm.vp.reduce.fadd.v4f32(float, <4 x float>, <4 x i1>, i32)
declare float @llvm.vp.reduce.fmul.v4f32(float, <4 x float>, <4 x i1>, i32)
; Comparisons
@@ -133,10 +135,16 @@ define void @test_vp_reduce_fp_v4(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n
%r3 = call float @llvm.vp.reduce.fmax.v4f32(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n)
%r4 = call nnan float @llvm.vp.reduce.fmax.v4f32(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n)
%r5 = call nnan ninf float @llvm.vp.reduce.fmax.v4f32(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n)
- %r6 = call float @llvm.vp.reduce.fadd.v4f32(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n)
- %r7 = call reassoc float @llvm.vp.reduce.fadd.v4f32(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n)
- %r8 = call float @llvm.vp.reduce.fmul.v4f32(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n)
- %r9 = call reassoc float @llvm.vp.reduce.fmul.v4f32(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n)
+ %r6 = call float @llvm.vp.reduce.fminimum.v4f32(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n)
+ %r7 = call nnan float @llvm.vp.reduce.fminimum.v4f32(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n)
+ %r8 = call nnan ninf float @llvm.vp.reduce.fminimum.v4f32(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n)
+ %r9 = call float @llvm.vp.reduce.fmaximum.v4f32(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n)
+ %r10 = call nnan float @llvm.vp.reduce.fmaximum.v4f32(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n)
+ %r11 = call nnan ninf float @llvm.vp.reduce.fmaximum.v4f32(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n)
+ %r12 = call float @llvm.vp.reduce.fadd.v4f32(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n)
+ %r13 = call reassoc float @llvm.vp.reduce.fadd.v4f32(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n)
+ %r14 = call float @llvm.vp.reduce.fmul.v4f32(float %f, <4 x float> %vf, <4 x i1> %m, i32 %n)
+ %r15 = call reassoc f...
[truncated]
|
You can test this locally with the following command:git-clang-format --diff 514d80b4feea3c788c1b0821959e96f63c8b8f3d 9e9510933dd398d8a312a543ed6c2ec30dd82c6d -- llvm/lib/CodeGen/ExpandVectorPredication.cpp llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp llvm/lib/Target/RISCV/RISCVISelLowering.cpp llvm/unittests/IR/VPIntrinsicTest.cpp View the diff from clang-format here.diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 60985edd94..3bc9a38098 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -700,21 +700,44 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
ISD::VP_SADDSAT, ISD::VP_UADDSAT, ISD::VP_SSUBSAT,
ISD::VP_USUBSAT, ISD::VP_CTTZ_ELTS, ISD::VP_CTTZ_ELTS_ZERO_UNDEF};
- static const unsigned FloatingPointVPOps[] = {
- ISD::VP_FADD, ISD::VP_FSUB, ISD::VP_FMUL,
- ISD::VP_FDIV, ISD::VP_FNEG, ISD::VP_FABS,
- ISD::VP_FMA, ISD::VP_REDUCE_FADD, ISD::VP_REDUCE_SEQ_FADD,
- ISD::VP_REDUCE_FMIN, ISD::VP_REDUCE_FMAX, ISD::VP_MERGE,
- ISD::VP_SELECT, ISD::VP_SINT_TO_FP, ISD::VP_UINT_TO_FP,
- ISD::VP_SETCC, ISD::VP_FP_ROUND, ISD::VP_FP_EXTEND,
- ISD::VP_SQRT, ISD::VP_FMINNUM, ISD::VP_FMAXNUM,
- ISD::VP_FCEIL, ISD::VP_FFLOOR, ISD::VP_FROUND,
- ISD::VP_FROUNDEVEN, ISD::VP_FCOPYSIGN, ISD::VP_FROUNDTOZERO,
- ISD::VP_FRINT, ISD::VP_FNEARBYINT, ISD::VP_IS_FPCLASS,
- ISD::VP_FMINIMUM, ISD::VP_FMAXIMUM, ISD::VP_LRINT,
- ISD::VP_LLRINT, ISD::EXPERIMENTAL_VP_REVERSE,
- ISD::EXPERIMENTAL_VP_SPLICE, ISD::VP_REDUCE_FMINIMUM,
- ISD::VP_REDUCE_FMAXIMUM};
+ static const unsigned FloatingPointVPOps[] = {ISD::VP_FADD,
+ ISD::VP_FSUB,
+ ISD::VP_FMUL,
+ ISD::VP_FDIV,
+ ISD::VP_FNEG,
+ ISD::VP_FABS,
+ ISD::VP_FMA,
+ ISD::VP_REDUCE_FADD,
+ ISD::VP_REDUCE_SEQ_FADD,
+ ISD::VP_REDUCE_FMIN,
+ ISD::VP_REDUCE_FMAX,
+ ISD::VP_MERGE,
+ ISD::VP_SELECT,
+ ISD::VP_SINT_TO_FP,
+ ISD::VP_UINT_TO_FP,
+ ISD::VP_SETCC,
+ ISD::VP_FP_ROUND,
+ ISD::VP_FP_EXTEND,
+ ISD::VP_SQRT,
+ ISD::VP_FMINNUM,
+ ISD::VP_FMAXNUM,
+ ISD::VP_FCEIL,
+ ISD::VP_FFLOOR,
+ ISD::VP_FROUND,
+ ISD::VP_FROUNDEVEN,
+ ISD::VP_FCOPYSIGN,
+ ISD::VP_FROUNDTOZERO,
+ ISD::VP_FRINT,
+ ISD::VP_FNEARBYINT,
+ ISD::VP_IS_FPCLASS,
+ ISD::VP_FMINIMUM,
+ ISD::VP_FMAXIMUM,
+ ISD::VP_LRINT,
+ ISD::VP_LLRINT,
+ ISD::EXPERIMENTAL_VP_REVERSE,
+ ISD::EXPERIMENTAL_VP_SPLICE,
+ ISD::VP_REDUCE_FMINIMUM,
+ ISD::VP_REDUCE_FMAXIMUM};
static const unsigned IntegerVecReduceOps[] = {
ISD::VECREDUCE_ADD, ISD::VECREDUCE_AND, ISD::VECREDUCE_OR,
@@ -950,16 +973,33 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
ISD::STRICT_FDIV, ISD::STRICT_FSQRT, ISD::STRICT_FMA};
// TODO: support more vp ops.
- static const unsigned ZvfhminPromoteVPOps[] = {
- ISD::VP_FADD, ISD::VP_FSUB, ISD::VP_FMUL,
- ISD::VP_FDIV, ISD::VP_FNEG, ISD::VP_FABS,
- ISD::VP_FMA, ISD::VP_REDUCE_FADD, ISD::VP_REDUCE_SEQ_FADD,
- ISD::VP_REDUCE_FMIN, ISD::VP_REDUCE_FMAX, ISD::VP_SQRT,
- ISD::VP_FMINNUM, ISD::VP_FMAXNUM, ISD::VP_FCEIL,
- ISD::VP_FFLOOR, ISD::VP_FROUND, ISD::VP_FROUNDEVEN,
- ISD::VP_FCOPYSIGN, ISD::VP_FROUNDTOZERO, ISD::VP_FRINT,
- ISD::VP_FNEARBYINT, ISD::VP_SETCC, ISD::VP_FMINIMUM,
- ISD::VP_FMAXIMUM, ISD::VP_REDUCE_FMINIMUM, ISD::VP_REDUCE_FMAXIMUM};
+ static const unsigned ZvfhminPromoteVPOps[] = {ISD::VP_FADD,
+ ISD::VP_FSUB,
+ ISD::VP_FMUL,
+ ISD::VP_FDIV,
+ ISD::VP_FNEG,
+ ISD::VP_FABS,
+ ISD::VP_FMA,
+ ISD::VP_REDUCE_FADD,
+ ISD::VP_REDUCE_SEQ_FADD,
+ ISD::VP_REDUCE_FMIN,
+ ISD::VP_REDUCE_FMAX,
+ ISD::VP_SQRT,
+ ISD::VP_FMINNUM,
+ ISD::VP_FMAXNUM,
+ ISD::VP_FCEIL,
+ ISD::VP_FFLOOR,
+ ISD::VP_FROUND,
+ ISD::VP_FROUNDEVEN,
+ ISD::VP_FCOPYSIGN,
+ ISD::VP_FROUNDTOZERO,
+ ISD::VP_FRINT,
+ ISD::VP_FNEARBYINT,
+ ISD::VP_SETCC,
+ ISD::VP_FMINIMUM,
+ ISD::VP_FMAXIMUM,
+ ISD::VP_REDUCE_FMINIMUM,
+ ISD::VP_REDUCE_FMAXIMUM};
// Sets common operation actions on RVV floating-point vector types.
const auto SetCommonVFPActions = [&](MVT VT) {
|
@@ -0,0 +1,205 @@ | |||
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an overly generic test name. How are the other vp.reduce tests structured?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd consolidates the tests with other floating point vp.reduce tests.
- Check the start value of vp.reduce for NaN as well. - Consolidate the tests with other vp.reduce.* test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
vp.reduce.fmaximum/fminimum
are the VP version ofvector.reduce.fmaximum/fminimum
.