Commit e3534d3

[RISCV] Shrink vslidedown when lowering fixed extract_subvector
As noted in llvm#65392 (comment), when lowering an extract of a fixed length vector from another vector, we don't need to perform the vslidedown on the full vector type. Instead we can extract the smallest subregister that contains the subvector to be extracted and perform the vslidedown with a smaller LMUL. E.g., with +Zvl128b:

    v2i64 = extract_subvector nxv4i64, 2

is currently lowered as

    vsetivli zero, 2, e64, m4, ta, ma
    vslidedown.vi v8, v8, 2

This patch shrinks the vslidedown to LMUL=2:

    vsetivli zero, 2, e64, m2, ta, ma
    vslidedown.vi v8, v8, 2

Because we know that there are at least 128*2=256 bits in v8 at LMUL=2, and we only need the first 256 bits to extract a v2i64 at index 2.

lowerEXTRACT_VECTOR_ELT already has this logic, so this extracts it out and reuses it.

I've split this out into a separate PR rather than include it in llvm#65392, with the hope that we'll be able to generalize it later.
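To make the arithmetic in the example above concrete, here is a minimal standalone sketch. It is not LLVM code: `prefixBits` and `smallestLMULForPrefix` are hypothetical helper names, and fractional LMUL is deliberately ignored to keep the illustration short.

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLVM): number of bits spanned by elements
// [0, Idx + NumElts), i.e. the prefix of the source vector that must be
// available to extract NumElts elements starting at index Idx.
constexpr uint64_t prefixBits(uint64_t Idx, uint64_t NumElts,
                              uint64_t EltBits) {
  return (Idx + NumElts) * EltBits;
}

// Hypothetical helper (not part of LLVM): smallest integer LMUL in
// {1, 2, 4, 8} whose register group is guaranteed to hold BitsNeeded bits,
// given a guaranteed minimum VLEN. Returns 0 if even LMUL=8 is too small.
// Assumption: fractional LMUL (mf2, mf4, mf8) is not considered here.
constexpr unsigned smallestLMULForPrefix(uint64_t MinVLenBits,
                                         uint64_t BitsNeeded) {
  for (unsigned LMUL = 1; LMUL <= 8; LMUL *= 2)
    if (uint64_t(LMUL) * MinVLenBits >= BitsNeeded)
      return LMUL;
  return 0;
}
```

With the numbers from the commit message, `prefixBits(2, 2, 64)` is 256 and `smallestLMULForPrefix(128, 256)` is 2, which corresponds to the `m2` in the shrunk `vsetivli`.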
1 parent b46d701 commit e3534d3

1 file changed, +33 -0 lines changed

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

Lines changed: 33 additions & 0 deletions
@@ -8751,6 +8751,39 @@ SDValue RISCVTargetLowering::lowerEXTRACT_SUBVECTOR(SDValue Op,
       ContainerVT = getContainerForFixedLengthVector(VecVT);
       Vec = convertToScalableVector(ContainerVT, Vec, DAG, Subtarget);
     }
+
+    // The minimum number of elements for a scalable vector type, e.g. nxv1i32
+    // is not legal on Zve32x.
+    const uint64_t MinLegalNumElts =
+        RISCV::RVVBitsPerBlock / Subtarget.getELen();
+    const uint64_t MinVscale =
+        Subtarget.getRealMinVLen() / RISCV::RVVBitsPerBlock;
+
+    // Even if we don't know the exact subregister the subvector is going to
+    // reside in, we know that the subvector is located within the first N bits
+    // of Vec:
+    //
+    //  N = (OrigIdx + SubVecVT.getVectorNumElements()) * EltSizeInBits
+    //    = MinVscale * MinEltsNeeded * EltSizeInBits
+    //
+    // From this we can work out the smallest type that contains everything we
+    // need to extract, <vscale x MinEltsNeeded x Elt>
+    uint64_t MinEltsNeeded =
+        (OrigIdx + SubVecVT.getVectorNumElements()) / MinVscale;
+
+    // Round up the number of elements so it's a valid power of 2 scalable
+    // vector type, and make sure it's not less than smallest legal vector type.
+    MinEltsNeeded = std::max(MinLegalNumElts, PowerOf2Ceil(MinEltsNeeded));
+
+    assert(MinEltsNeeded <= ContainerVT.getVectorMinNumElements());
+
+    // Shrink down Vec so we're performing the slidedown on the smallest
+    // possible type.
+    ContainerVT = MVT::getScalableVectorVT(ContainerVT.getVectorElementType(),
+                                           MinEltsNeeded);
+    Vec = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, ContainerVT, Vec,
+                      DAG.getVectorIdxConstant(0, DL));
+
     SDValue Mask =
         getDefaultVLOps(VecVT, ContainerVT, DL, DAG, Subtarget).first;
     // Set the vector length to only the number of elements we care about. This
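The element-count arithmetic in the hunk above can be tried outside of LLVM with a small sketch. Everything here is a stand-in: `powerOf2Ceil` approximates `llvm::PowerOf2Ceil`, `minEltsNeeded` is a hypothetical free function, and the subtarget queries (`getELen()`, `getRealMinVLen()`) are replaced by plain parameters. The constant 64 is assumed to match `RISCV::RVVBitsPerBlock`.

```cpp
#include <cstdint>

// Assumed to match RISCV::RVVBitsPerBlock in LLVM.
constexpr uint64_t RVVBitsPerBlock = 64;

// Stand-in for llvm::PowerOf2Ceil: smallest power of two >= A (0 for A == 0).
// Not overflow-safe for A > 2^63; good enough for an illustration.
constexpr uint64_t powerOf2Ceil(uint64_t A) {
  if (A == 0)
    return 0;
  uint64_t P = 1;
  while (P < A)
    P <<= 1;
  return P;
}

// Hypothetical free-function version of the patch's computation.
// ELen and RealMinVLen stand in for Subtarget.getELen() and
// Subtarget.getRealMinVLen(); RealMinVLen must be at least RVVBitsPerBlock
// so that MinVscale is non-zero.
constexpr uint64_t minEltsNeeded(uint64_t OrigIdx, uint64_t SubVecNumElts,
                                 uint64_t ELen, uint64_t RealMinVLen) {
  const uint64_t MinLegalNumElts = RVVBitsPerBlock / ELen;
  const uint64_t MinVscale = RealMinVLen / RVVBitsPerBlock;
  // Same truncating integer division as in the patch.
  const uint64_t Needed = (OrigIdx + SubVecNumElts) / MinVscale;
  const uint64_t Rounded = powerOf2Ceil(Needed);
  // Clamp to the smallest legal scalable element count.
  return Rounded < MinLegalNumElts ? MinLegalNumElts : Rounded;
}
```

For the commit message's case (v2i64 at index 2, ELEN=64, VLEN>=128), `minEltsNeeded(2, 2, 64, 128)` evaluates to 2, i.e. nxv2i64 at LMUL=2. With ELEN=32 and VLEN>=128, `minEltsNeeded(0, 2, 32, 128)` also evaluates to 2, but only because of the `MinLegalNumElts` clamp. The inputs in these examples divide evenly by `MinVscale`; the sketch keeps the patch's truncating division as-is.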
