-
Notifications
You must be signed in to change notification settings - Fork 15k
[TTI] Fix value-based BasicTTIImpl vp.{gather,scatter} costing #148020
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
After llvm#147677 we now preserve value based costing for vp intrinsics instead of switching it to type based costing. However for vp.gather and vp.scatter, even though they fallback to their functionally equivalent masked.gather and masked.scatter, the number of arguments are different due to the alignment being a dedicated argument. This caused a crash detected at https://lab.llvm.org/staging/#/builders/210/builds/988 Thix fixes it by explicitly handling the two intrinsics and adding test coverage. Note that the type based costing isn't yet implemented for masked.gather/masked.scatter so it doesn't show up correctly.
@llvm/pr-subscribers-llvm-analysis Author: Luke Lau (lukel97) ChangesAfter #147677 we now preserve value based costing for vp intrinsics instead of switching it to type based costing. However for vp.gather and vp.scatter, even though they fallback to their functionally equivalent masked.gather and masked.scatter, the number of arguments are different due to the alignment being a dedicated argument. This caused a crash detected at https://lab.llvm.org/staging/#/builders/210/builds/988 Thix fixes it by explicitly handling the two intrinsics and adding test coverage. Note that the type based costing isn't yet implemented for masked.gather/masked.scatter so it doesn't show up correctly. Full diff: https://github.com/llvm/llvm-project/pull/148020.diff 2 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
index 39955558fdd0a..6aee7fc299add 100644
--- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
+++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -1773,6 +1773,39 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
}
}
+ if (ICA.getID() == Intrinsic::vp_scatter) {
+ if (ICA.isTypeBasedOnly()) {
+ IntrinsicCostAttributes MaskedScatter(
+ *VPIntrinsic::getFunctionalIntrinsicIDForVP(ICA.getID()),
+ ICA.getReturnType(), ArrayRef(ICA.getArgTypes()).drop_back(1),
+ ICA.getFlags());
+ return getTypeBasedIntrinsicInstrCost(MaskedScatter, CostKind);
+ }
+ Align Alignment;
+ if (auto *VPI = dyn_cast_or_null<VPIntrinsic>(ICA.getInst()))
+ Alignment = VPI->getPointerAlignment().valueOrOne();
+ bool VarMask = isa<Constant>(ICA.getArgs()[2]);
+ return thisT()->getGatherScatterOpCost(
+ Instruction::Store, ICA.getArgTypes()[0], ICA.getArgs()[1], VarMask,
+ Alignment, CostKind, nullptr);
+ }
+ if (ICA.getID() == Intrinsic::vp_gather) {
+ if (ICA.isTypeBasedOnly()) {
+ IntrinsicCostAttributes MaskedGather(
+ *VPIntrinsic::getFunctionalIntrinsicIDForVP(ICA.getID()),
+ ICA.getReturnType(), ArrayRef(ICA.getArgTypes()).drop_back(1),
+ ICA.getFlags());
+ return getTypeBasedIntrinsicInstrCost(MaskedGather, CostKind);
+ }
+ Align Alignment;
+ if (auto *VPI = dyn_cast_or_null<VPIntrinsic>(ICA.getInst()))
+ Alignment = VPI->getPointerAlignment().valueOrOne();
+ bool VarMask = isa<Constant>(ICA.getArgs()[1]);
+ return thisT()->getGatherScatterOpCost(
+ Instruction::Load, ICA.getReturnType(), ICA.getArgs()[0], VarMask,
+ Alignment, CostKind, nullptr);
+ }
+
if (ICA.getID() == Intrinsic::vp_select ||
ICA.getID() == Intrinsic::vp_merge) {
TTI::OperandValueInfo OpInfoX, OpInfoY;
diff --git a/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll b/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll
index 3a2e7d5580ac0..3701d213b5e8b 100644
--- a/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll
@@ -978,6 +978,122 @@ define void @store() {
ret void
}
+define void @gather() {
+; ARGBASED-LABEL: 'gather'
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %1 = call <2 x i8> @llvm.vp.gather.v2i8.v2p0(<2 x ptr> poison, <2 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %2 = call <4 x i8> @llvm.vp.gather.v4i8.v4p0(<4 x ptr> poison, <4 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %3 = call <8 x i8> @llvm.vp.gather.v8i8.v8p0(<8 x ptr> poison, <8 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %4 = call <16 x i8> @llvm.vp.gather.v16i8.v16p0(<16 x ptr> poison, <16 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %5 = call <2 x i64> @llvm.vp.gather.v2i64.v2p0(<2 x ptr> poison, <2 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 33 for instruction: %6 = call <4 x i64> @llvm.vp.gather.v4i64.v4p0(<4 x ptr> poison, <4 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 69 for instruction: %7 = call <8 x i64> @llvm.vp.gather.v8i64.v8p0(<8 x ptr> poison, <8 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 141 for instruction: %8 = call <16 x i64> @llvm.vp.gather.v16i64.v16p0(<16 x ptr> poison, <16 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %9 = call <vscale x 2 x i8> @llvm.vp.gather.nxv2i8.nxv2p0(<vscale x 2 x ptr> poison, <vscale x 2 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %10 = call <vscale x 4 x i8> @llvm.vp.gather.nxv4i8.nxv4p0(<vscale x 4 x ptr> poison, <vscale x 4 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %11 = call <vscale x 8 x i8> @llvm.vp.gather.nxv8i8.nxv8p0(<vscale x 8 x ptr> poison, <vscale x 8 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %12 = call <vscale x 16 x i8> @llvm.vp.gather.nxv16i8.nxv16p0(<vscale x 16 x ptr> poison, <vscale x 16 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %13 = call <vscale x 2 x i64> @llvm.vp.gather.nxv2i64.nxv2p0(<vscale x 2 x ptr> poison, <vscale x 2 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %14 = call <vscale x 4 x i64> @llvm.vp.gather.nxv4i64.nxv4p0(<vscale x 4 x ptr> poison, <vscale x 4 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %15 = call <vscale x 8 x i64> @llvm.vp.gather.nxv8i64.nxv8p0(<vscale x 8 x ptr> poison, <vscale x 8 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %16 = call <vscale x 16 x i64> @llvm.vp.gather.nxv16i64.nxv16p0(<vscale x 16 x ptr> poison, <vscale x 16 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'gather'
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %1 = call <2 x i8> @llvm.vp.gather.v2i8.v2p0(<2 x ptr> poison, <2 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 30 for instruction: %2 = call <4 x i8> @llvm.vp.gather.v4i8.v4p0(<4 x ptr> poison, <4 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 62 for instruction: %3 = call <8 x i8> @llvm.vp.gather.v8i8.v8p0(<8 x ptr> poison, <8 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 126 for instruction: %4 = call <16 x i8> @llvm.vp.gather.v16i8.v16p0(<16 x ptr> poison, <16 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %5 = call <2 x i64> @llvm.vp.gather.v2i64.v2p0(<2 x ptr> poison, <2 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 33 for instruction: %6 = call <4 x i64> @llvm.vp.gather.v4i64.v4p0(<4 x ptr> poison, <4 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 69 for instruction: %7 = call <8 x i64> @llvm.vp.gather.v8i64.v8p0(<8 x ptr> poison, <8 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 141 for instruction: %8 = call <16 x i64> @llvm.vp.gather.v16i64.v16p0(<16 x ptr> poison, <16 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %9 = call <vscale x 2 x i8> @llvm.vp.gather.nxv2i8.nxv2p0(<vscale x 2 x ptr> poison, <vscale x 2 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %10 = call <vscale x 4 x i8> @llvm.vp.gather.nxv4i8.nxv4p0(<vscale x 4 x ptr> poison, <vscale x 4 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %11 = call <vscale x 8 x i8> @llvm.vp.gather.nxv8i8.nxv8p0(<vscale x 8 x ptr> poison, <vscale x 8 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %12 = call <vscale x 16 x i8> @llvm.vp.gather.nxv16i8.nxv16p0(<vscale x 16 x ptr> poison, <vscale x 16 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %13 = call <vscale x 2 x i64> @llvm.vp.gather.nxv2i64.nxv2p0(<vscale x 2 x ptr> poison, <vscale x 2 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %14 = call <vscale x 4 x i64> @llvm.vp.gather.nxv4i64.nxv4p0(<vscale x 4 x ptr> poison, <vscale x 4 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %15 = call <vscale x 8 x i64> @llvm.vp.gather.nxv8i64.nxv8p0(<vscale x 8 x ptr> poison, <vscale x 8 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %16 = call <vscale x 16 x i64> @llvm.vp.gather.nxv16i64.nxv16p0(<vscale x 16 x ptr> poison, <vscale x 16 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+ call <2 x i8> @llvm.vp.gather(<2 x ptr> poison, <2 x i1> poison, i32 poison)
+ call <4 x i8> @llvm.vp.gather(<4 x ptr> poison, <4 x i1> poison, i32 poison)
+ call <8 x i8> @llvm.vp.gather(<8 x ptr> poison, <8 x i1> poison, i32 poison)
+ call <16 x i8> @llvm.vp.gather(<16 x ptr> poison, <16 x i1> poison, i32 poison)
+ call <2 x i64> @llvm.vp.gather(<2 x ptr> poison, <2 x i1> poison, i32 poison)
+ call <4 x i64> @llvm.vp.gather(<4 x ptr> poison, <4 x i1> poison, i32 poison)
+ call <8 x i64> @llvm.vp.gather(<8 x ptr> poison, <8 x i1> poison, i32 poison)
+ call <16 x i64> @llvm.vp.gather(<16 x ptr> poison, <16 x i1> poison, i32 poison)
+ call <vscale x 2 x i8> @llvm.vp.gather(<vscale x 2 x ptr> poison, <vscale x 2 x i1> poison, i32 poison)
+ call <vscale x 4 x i8> @llvm.vp.gather(<vscale x 4 x ptr> poison, <vscale x 4 x i1> poison, i32 poison)
+ call <vscale x 8 x i8> @llvm.vp.gather(<vscale x 8 x ptr> poison, <vscale x 8 x i1> poison, i32 poison)
+ call <vscale x 16 x i8> @llvm.vp.gather(<vscale x 16 x ptr> poison, <vscale x 16 x i1> poison, i32 poison)
+ call <vscale x 2 x i64> @llvm.vp.gather(<vscale x 2 x ptr> poison, <vscale x 2 x i1> poison, i32 poison)
+ call <vscale x 4 x i64> @llvm.vp.gather(<vscale x 4 x ptr> poison, <vscale x 4 x i1> poison, i32 poison)
+ call <vscale x 8 x i64> @llvm.vp.gather(<vscale x 8 x ptr> poison, <vscale x 8 x i1> poison, i32 poison)
+ call <vscale x 16 x i64> @llvm.vp.gather(<vscale x 16 x ptr> poison, <vscale x 16 x i1> poison, i32 poison)
+ ret void
+}
+
+define void @scatter() {
+; ARGBASED-LABEL: 'scatter'
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: call void @llvm.vp.scatter.v2i8.v2p0(<2 x i8> poison, <2 x ptr> poison, <2 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.vp.scatter.v4i8.v4p0(<4 x i8> poison, <4 x ptr> poison, <4 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: call void @llvm.vp.scatter.v8i8.v8p0(<8 x i8> poison, <8 x ptr> poison, <8 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.vp.scatter.v16i8.v16p0(<16 x i8> poison, <16 x ptr> poison, <16 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 15 for instruction: call void @llvm.vp.scatter.v2i64.v2p0(<2 x i64> poison, <2 x ptr> poison, <2 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 33 for instruction: call void @llvm.vp.scatter.v4i64.v4p0(<4 x i64> poison, <4 x ptr> poison, <4 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 69 for instruction: call void @llvm.vp.scatter.v8i64.v8p0(<8 x i64> poison, <8 x ptr> poison, <8 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 141 for instruction: call void @llvm.vp.scatter.v16i64.v16p0(<16 x i64> poison, <16 x ptr> poison, <16 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.vp.scatter.nxv2i8.nxv2p0(<vscale x 2 x i8> poison, <vscale x 2 x ptr> poison, <vscale x 2 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: call void @llvm.vp.scatter.nxv4i8.nxv4p0(<vscale x 4 x i8> poison, <vscale x 4 x ptr> poison, <vscale x 4 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.vp.scatter.nxv8i8.nxv8p0(<vscale x 8 x i8> poison, <vscale x 8 x ptr> poison, <vscale x 8 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 32 for instruction: call void @llvm.vp.scatter.nxv16i8.nxv16p0(<vscale x 16 x i8> poison, <vscale x 16 x ptr> poison, <vscale x 16 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: call void @llvm.vp.scatter.nxv2i64.nxv2p0(<vscale x 2 x i64> poison, <vscale x 2 x ptr> poison, <vscale x 2 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: call void @llvm.vp.scatter.nxv4i64.nxv4p0(<vscale x 4 x i64> poison, <vscale x 4 x ptr> poison, <vscale x 4 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: call void @llvm.vp.scatter.nxv8i64.nxv8p0(<vscale x 8 x i64> poison, <vscale x 8 x ptr> poison, <vscale x 8 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: call void @llvm.vp.scatter.nxv16i64.nxv16p0(<vscale x 16 x i64> poison, <vscale x 16 x ptr> poison, <vscale x 16 x i1> poison, i32 poison)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'scatter'
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 15 for instruction: call void @llvm.vp.scatter.v2i8.v2p0(<2 x i8> poison, <2 x ptr> poison, <2 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 33 for instruction: call void @llvm.vp.scatter.v4i8.v4p0(<4 x i8> poison, <4 x ptr> poison, <4 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 69 for instruction: call void @llvm.vp.scatter.v8i8.v8p0(<8 x i8> poison, <8 x ptr> poison, <8 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 141 for instruction: call void @llvm.vp.scatter.v16i8.v16p0(<16 x i8> poison, <16 x ptr> poison, <16 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 15 for instruction: call void @llvm.vp.scatter.v2i64.v2p0(<2 x i64> poison, <2 x ptr> poison, <2 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 33 for instruction: call void @llvm.vp.scatter.v4i64.v4p0(<4 x i64> poison, <4 x ptr> poison, <4 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 69 for instruction: call void @llvm.vp.scatter.v8i64.v8p0(<8 x i64> poison, <8 x ptr> poison, <8 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 141 for instruction: call void @llvm.vp.scatter.v16i64.v16p0(<16 x i64> poison, <16 x ptr> poison, <16 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: call void @llvm.vp.scatter.nxv2i8.nxv2p0(<vscale x 2 x i8> poison, <vscale x 2 x ptr> poison, <vscale x 2 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: call void @llvm.vp.scatter.nxv4i8.nxv4p0(<vscale x 4 x i8> poison, <vscale x 4 x ptr> poison, <vscale x 4 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: call void @llvm.vp.scatter.nxv8i8.nxv8p0(<vscale x 8 x i8> poison, <vscale x 8 x ptr> poison, <vscale x 8 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: call void @llvm.vp.scatter.nxv16i8.nxv16p0(<vscale x 16 x i8> poison, <vscale x 16 x ptr> poison, <vscale x 16 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: call void @llvm.vp.scatter.nxv2i64.nxv2p0(<vscale x 2 x i64> poison, <vscale x 2 x ptr> poison, <vscale x 2 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: call void @llvm.vp.scatter.nxv4i64.nxv4p0(<vscale x 4 x i64> poison, <vscale x 4 x ptr> poison, <vscale x 4 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: call void @llvm.vp.scatter.nxv8i64.nxv8p0(<vscale x 8 x i64> poison, <vscale x 8 x ptr> poison, <vscale x 8 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: call void @llvm.vp.scatter.nxv16i64.nxv16p0(<vscale x 16 x i64> poison, <vscale x 16 x ptr> poison, <vscale x 16 x i1> poison, i32 poison)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+ call void @llvm.vp.scatter(<2 x i8> poison, <2 x ptr> poison, <2 x i1> poison, i32 poison)
+ call void @llvm.vp.scatter(<4 x i8> poison, <4 x ptr> poison, <4 x i1> poison, i32 poison)
+ call void @llvm.vp.scatter(<8 x i8> poison, <8 x ptr> poison, <8 x i1> poison, i32 poison)
+ call void @llvm.vp.scatter(<16 x i8> poison, <16 x ptr> poison, <16 x i1> poison, i32 poison)
+ call void @llvm.vp.scatter(<2 x i64> poison, <2 x ptr> poison, <2 x i1> poison, i32 poison)
+ call void @llvm.vp.scatter(<4 x i64> poison, <4 x ptr> poison, <4 x i1> poison, i32 poison)
+ call void @llvm.vp.scatter(<8 x i64> poison, <8 x ptr> poison, <8 x i1> poison, i32 poison)
+ call void @llvm.vp.scatter(<16 x i64> poison, <16 x ptr> poison, <16 x i1> poison, i32 poison)
+ call void @llvm.vp.scatter(<vscale x 2 x i8> poison, <vscale x 2 x ptr> poison, <vscale x 2 x i1> poison, i32 poison)
+ call void @llvm.vp.scatter(<vscale x 4 x i8> poison, <vscale x 4 x ptr> poison, <vscale x 4 x i1> poison, i32 poison)
+ call void @llvm.vp.scatter(<vscale x 8 x i8> poison, <vscale x 8 x ptr> poison, <vscale x 8 x i1> poison, i32 poison)
+ call void @llvm.vp.scatter(<vscale x 16 x i8> poison, <vscale x 16 x ptr> poison, <vscale x 16 x i1> poison, i32 poison)
+ call void @llvm.vp.scatter(<vscale x 2 x i64> poison, <vscale x 2 x ptr> poison, <vscale x 2 x i1> poison, i32 poison)
+ call void @llvm.vp.scatter(<vscale x 4 x i64> poison, <vscale x 4 x ptr> poison, <vscale x 4 x i1> poison, i32 poison)
+ call void @llvm.vp.scatter(<vscale x 8 x i64> poison, <vscale x 8 x ptr> poison, <vscale x 8 x i1> poison, i32 poison)
+ call void @llvm.vp.scatter(<vscale x 16 x i64> poison, <vscale x 16 x ptr> poison, <vscale x 16 x i1> poison, i32 poison)
+ ret void
+}
+
define void @strided_load() {
; ARGBASED-LABEL: 'strided_load'
; ARGBASED-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %ti1_2 = call <2 x i1> @llvm.experimental.vp.strided.load.v2i1.p0.i64(ptr undef, i64 undef, <2 x i1> undef, i32 undef)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not the most qualified reviewer for this specific change, but the reasoning makes sense to me and I'm keen to unbreak the bots (and it doesn't seem complex enough that revert and a less time pressured review is really worthwhile).
After #147677 we now preserve value based costing for vp intrinsics instead of switching it to type based costing.
However for vp.gather and vp.scatter, even though they fallback to their functionally equivalent masked.gather and masked.scatter, the number of arguments are different due to the alignment being a dedicated argument.
This caused a crash detected at https://lab.llvm.org/staging/#/builders/210/builds/988
Thix fixes it by explicitly handling the two intrinsics and adding test coverage.
Note that the type based costing isn't yet implemented for masked.gather/masked.scatter so it doesn't show up correctly.