[VPlan] Add VPPhiAccessors to provide interface for phi recipes (NFC) #129388

Merged
merged 12 commits into from
May 4, 2025
Changes from 7 commits
5 changes: 1 addition & 4 deletions llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -2883,11 +2883,8 @@ void InnerLoopVectorizer::fixNonInductionPHIs(VPTransformState &State) {
       PHINode *NewPhi = cast<PHINode>(State.get(VPPhi));
       // Make sure the builder has a valid insert point.
       Builder.SetInsertPoint(NewPhi);
-      for (unsigned Idx = 0; Idx < VPPhi->getNumOperands(); ++Idx) {
-        VPValue *Inc = VPPhi->getIncomingValue(Idx);
-        VPBasicBlock *VPBB = VPPhi->getIncomingBlock(Idx);
+      for (const auto &[Inc, VPBB] : VPPhi->incoming_values_and_blocks())
Contributor

nit: Perhaps better to rename Inc to Idx? It sounds like a short form of increment or incoming.

Contributor Author

It is a short form of incoming value, while Idx would refer to an Index?

         NewPhi->addIncoming(State.get(Inc), State.CFG.VPBB2IRBB[VPBB]);
-      }
Collaborator

Is this return (in simplification?) worth the investment? Would an API of getIncomingValue(Idx), getIncomingBlock(Idx), and possibly getNumIncomings(), common to all phi recipes, suffice?

Contributor Author

There will be additional users when updating more recipes to use the accessors. The iterator ranges are more convenient, but getIncomingValue and getIncomingBlock alone would also suffice.

Collaborator

How about first wrapping getIncomingValue and getIncomingBlock within VPPhiAccessors, and introducing the additional zipped iterator API as a separate follow-up? That would help clarify the convenience versus investment.

Contributor Author

Ah yes, done for now, thanks

     }
   }
 }
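For readers weighing the two API shapes discussed in this thread, here is a minimal sketch of index-based versus paired traversal. `Phi`, `describeByIndex`, `describeByPairs` and `incomingValuesAndBlocks` are hypothetical stand-ins written for illustration; they are not VPlan classes, and the paired range here is built eagerly rather than with LLVM's `zip`.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a phi-like recipe; NOT a VPlan class.
struct Phi {
  std::vector<int> Values;         // incoming values, one per predecessor
  std::vector<std::string> Blocks; // incoming blocks, in the same order

  unsigned getNumIncoming() const {
    return static_cast<unsigned>(Values.size());
  }
  int getIncomingValue(unsigned Idx) const { return Values[Idx]; }
  const std::string &getIncomingBlock(unsigned Idx) const {
    return Blocks[Idx];
  }

  // Convenience layer built purely on top of the two index-based accessors.
  std::vector<std::pair<int, std::string>> incomingValuesAndBlocks() const {
    std::vector<std::pair<int, std::string>> Result;
    for (unsigned Idx = 0; Idx != getNumIncoming(); ++Idx)
      Result.emplace_back(getIncomingValue(Idx), getIncomingBlock(Idx));
    return Result;
  }
};

// Index-based traversal: only needs the minimal accessor API.
inline std::string describeByIndex(const Phi &P) {
  std::string Out;
  for (unsigned Idx = 0; Idx != P.getNumIncoming(); ++Idx)
    Out += "[" + std::to_string(P.getIncomingValue(Idx)) + ", " +
           P.getIncomingBlock(Idx) + "]";
  return Out;
}

// Paired traversal: shorter at the call site, but needs the extra range API.
inline std::string describeByPairs(const Phi &P) {
  std::string Out;
  for (const auto &[Inc, BB] : P.incomingValuesAndBlocks())
    Out += "[" + std::to_string(Inc) + ", " + BB + "]";
  return Out;
}
```

Both functions produce the same string, which is the point of the thread: the paired form is a convenience over the index-based accessors, not additional capability.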
62 changes: 55 additions & 7 deletions llvm/lib/Transforms/Vectorize/VPlan.h
@@ -1171,6 +1171,59 @@ class VPIRInstruction : public VPRecipeBase {
void extractLastLaneOfFirstOperand(VPBuilder &Builder);
};

+/// Helper type to provide functions to access incoming values and blocks for
+/// phi-like recipes. RecipeTy must be a sub-class of VPRecipeBase.
+template <typename RecipeTy> class VPPhiAccessors {
+  /// Return a VPRecipeBase* to the current object.
+  const VPRecipeBase *getAsRecipe() const {
+    return static_cast<const RecipeTy *>(this);
Collaborator

Is the templating really needed, can getAsRecipe() simply down cast this to VPRecipeBase?

Contributor Author

Yep, it seems so. The problem is that without the template there's no inheritance relation here, so static_casts are rejected (hence the Curiously Recurring Template Pattern). For some reason, we need to cast to exactly the derived type.

But the template parameter may cause problems in the future, so I updated getAsRecipe to be pure virtual to be implemented by the derived classes. The single implementation could also be used for other trait classes in the future. WDYT?

Contributor Author

The use case enabled by getting rid of the template argument is supporting dyn_cast from VPRecipeBase -> VPPhiAccessors, as used in #124838 ( https://github.com/llvm/llvm-project/pull/124838/files#diff-a69094b5fcfce6b2bf9e957e2ac7011e5492e81c885129506c90874375e621fbR210) by implementing CastInfo<VPPhiAccessors, const VPRecipeBase *>

Collaborator

Hmm, would having getAsRecipe() do a dyn_cast to VPRecipeBase be ok? Possibly followed by an assert...

Contributor Author

I don't think we could dyn_cast from VPPhiAccessors to VPRecipeBase because there's no direct relationship between them, and the this pointer for the VPPhiAccessors (sub-)object may not be the same as that of the VPRecipeBase (base-)object. The other way works, because we can down-cast to the concrete types inheriting from VPPhiAccessors.
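To make the two options in this thread concrete, here is a minimal sketch in plain C++. All names (`RecipeBase`, `PhiAccessorsCRTP`, `PhiAccessors`, `WidenPhi`, `ScalarPhi`, `subobjectAddressesDiffer`) are hypothetical stand-ins, not the VPlan classes. Variant 1 reaches the recipe through a CRTP `static_cast`; variant 2 asks the derived class through a pure-virtual hook. The last helper shows why the mix-in's `this` cannot simply be reinterpreted as the recipe pointer.

```cpp
#include <cassert>

// Hypothetical stand-in for VPRecipeBase; NOT the real class.
struct RecipeBase {
  unsigned NumOperands;
  explicit RecipeBase(unsigned N) : NumOperands(N) {}
  virtual ~RecipeBase() = default;
};

// Variant 1: CRTP. The mix-in knows the most-derived type, so it can
// static_cast down to it and then convert implicitly to RecipeBase.
template <typename RecipeTy> struct PhiAccessorsCRTP {
  unsigned getNumIncoming() const { return getAsRecipe()->NumOperands; }

protected:
  const RecipeBase *getAsRecipe() const {
    return static_cast<const RecipeTy *>(this);
  }
};

struct WidenPhi : RecipeBase, PhiAccessorsCRTP<WidenPhi> {
  using RecipeBase::RecipeBase;
};

// Variant 2: no template parameter. Each derived class implements the hook.
struct PhiAccessors {
  virtual ~PhiAccessors() = default;
  unsigned getNumIncoming() const { return getAsRecipe()->NumOperands; }

protected:
  virtual const RecipeBase *getAsRecipe() const = 0;
};

struct ScalarPhi : RecipeBase, PhiAccessors {
  using RecipeBase::RecipeBase;

protected:
  // The derived class sees both bases, so the conversion is well-formed.
  const RecipeBase *getAsRecipe() const override { return this; }
};

// The two base subobjects are distinct, non-empty objects, so their addresses
// differ; the mix-in pointer is not a valid RecipeBase pointer as-is.
inline bool subobjectAddressesDiffer(const ScalarPhi &P) {
  const void *AsRecipe = static_cast<const RecipeBase *>(&P);
  const void *AsAccessors = static_cast<const PhiAccessors *>(&P);
  return AsRecipe != AsAccessors;
}
```

The sketch assumes nothing about how LLVM lays out the real classes; it only illustrates the general C++ behaviour the comment above refers to.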

+  }
+
+public:
+  /// Returns the incoming VPValue with index \p Idx.
+  VPValue *getIncomingValue(unsigned Idx) const {
+    return getAsRecipe()->getOperand(Idx);
+  }
Comment on lines +1180 to +1182
Collaborator

How about making both getIncomingBlock(Idx) and getIncomingValue(Idx) pure virtual in VPPhiAccessors, and having an interim VPSingleDefPhiRecipe, inheriting from both VPSingleDefRecipe and VPPhiAccessors, implement getIncomingValue(Idx) for all (single-def) phi recipes, instead of the getAsRecipe() subclass casting? Are all recipes that would inherit from it single-def?

Contributor Author

All recipes are single defs, but we now unfortunately have some recipes (e.g. VPIRPhi) where the base class VPIRInstruction inherits from VPSingleDefRecipe, but inheriting from VPSingleDefPhiRecipe would not be appropriate, hence the trait/mix-in. Down the road, we could also support casting any recipe that supports it to VPPhiAccessors, e.g. for verifying all phi-like nodes that implement the trait.

Alternatively we could manually add definitions of getIncomingBlock and getIncomingValue to all relevant classes?

Collaborator

OK, which are all the relevant classes expected to inherit directly from VPPhiAccessors, in addition to VPWidenPHIRecipe (which could use VPSingleDefPhiRecipe with other potential partners) and VPIRPhi?

If VPIRPhi inherits directly from VPPhiAccessors, could it implement getIncomingBlock based on the direct predecessors of its VPBasicBlock, as it is not used to represent header phi's of HCFG regions? I.e., assert it has direct predecessors.

In any case, good to implement both getIncomingBlock and getIncomingValue by the mix-in, as done here, or neither (and have both defined by all derived classes instead).

Contributor Author

We need it for VPWidenPHIRecipe, VPHeaderPHIRecipe, VPIRPhi, VPEVLBasedPhi and VPPhi (scalar phis via VPInstruction, probably via a new specialization).

Define both getIncomingBlock and getIncomingValue there.

Contributor Author

Another one is VPPredInstPhi

Collaborator

VPWidenPHIRecipe, VPHeaderPHIRecipe (base class of VPEVLBasedPhi), and VPPredInstPhi all inherit from VPSingleDefRecipe. So could inherit from VPSingleDefPhiRecipe instead, which could take care of implementing these pure virtual methods for them.

Contributor Author

Made the getAsRecipe pure-virtual to avoid the template argument + static cast. WDYT?
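As an illustration of the trait/mix-in argument in this thread, the sketch below shows a phi-like class whose direct base is a general-purpose recipe, opting into a non-templated mix-in, plus a pass that finds every recipe implementing it. All names are hypothetical stand-ins, and standard `dynamic_cast` is used here only as a substitute for LLVM's `dyn_cast`/`CastInfo` machinery, which works differently and does not rely on RTTI.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins; these are NOT the VPlan classes.
struct RecipeBase {
  std::vector<int> Operands;
  virtual ~RecipeBase() = default;
};

struct PhiAccessors {
  virtual ~PhiAccessors() = default;
  unsigned getNumIncoming() const {
    return static_cast<unsigned>(getAsRecipe()->Operands.size());
  }
  int getIncomingValue(unsigned Idx) const {
    return getAsRecipe()->Operands[Idx];
  }

protected:
  // Implemented by each class that opts into the mix-in.
  virtual const RecipeBase *getAsRecipe() const = 0;
};

// A general-purpose recipe that is not itself phi-like.
struct IRInstruction : RecipeBase {};

// A phi-like subclass keeps its existing base and adds the mix-in on the side.
struct IRPhi : IRInstruction, PhiAccessors {
protected:
  const RecipeBase *getAsRecipe() const override { return this; }
};

struct WidenPhi : RecipeBase, PhiAccessors {
protected:
  const RecipeBase *getAsRecipe() const override { return this; }
};

// Counts the recipes that implement the mix-in, e.g. as a first step of a
// verifier that wants to check every phi-like node.
inline unsigned
countPhiLike(const std::vector<std::unique_ptr<RecipeBase>> &Recipes) {
  unsigned N = 0;
  for (const auto &R : Recipes)
    if (dynamic_cast<const PhiAccessors *>(R.get()))
      ++N;
  return N;
}
```

Because the mix-in carries no template parameter, a single cast target covers every phi-like class, which is the use case the author mentions for dropping the template argument.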


+  /// Returns an iterator range over the incoming values.
+  VPUser::const_operand_range incoming_values() const {
+    return getAsRecipe()->operands();
+  }
+
+  /// Returns the incoming block with index \p Idx.
+  const VPBasicBlock *getIncomingBlock(unsigned Idx) const;
Collaborator

Inline as well? Good to see its implementation next to that of getIncomingValue() and getNumIncoming().

Contributor Author

It needs access to VPBasicBlock's definition, which isn't available here. That could possibly be resolved by moving the VPBlock* definitions to a separate header.

Collaborator

Ok, probably better done separately, if at all.


+  using const_incoming_block_iterator =
+      mapped_iterator<detail::index_iterator,
+                      std::function<const VPBasicBlock *(size_t)>>;
+  using const_incoming_blocks_range =
+      iterator_range<const_incoming_block_iterator>;

+  const_incoming_block_iterator incoming_block_begin() const {
+    return const_incoming_block_iterator(
+        detail::index_iterator(0),
+        [this](size_t Idx) { return getIncomingBlock(Idx); });
+  }
+  const_incoming_block_iterator incoming_block_end() const {
+    return const_incoming_block_iterator(
+        detail::index_iterator(getAsRecipe()->getNumOperands()),
+        [this](size_t Idx) { return getIncomingBlock(Idx); });
+  }

+  /// Returns an iterator range over the incoming blocks.
+  const_incoming_blocks_range incoming_blocks() const {
+    return make_range(incoming_block_begin(), incoming_block_end());
+  }
Collaborator

If/when predecessors already have a natural iterator, would building an index_iterator from retrieving each block get folded into the former?

Contributor Author

I can check once we bring those iterators back
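For context on what the index-mapped iterator in the diff does, here is a minimal sketch of the same idea in standard C++17: a range that walks indices `0..Size` and maps each one through a callback on dereference. `IndexMappedRange`, `Phi`, `incomingBlocks` and `joinIncomingBlocks` are hypothetical names; this is not LLVM's `mapped_iterator`, only an approximation of the technique.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical range that maps indices [0, Size) through a callback.
class IndexMappedRange {
public:
  using MapFn = std::function<std::string(std::size_t)>;

  class iterator {
    std::size_t Idx;
    const MapFn *Fn;

  public:
    iterator(std::size_t Idx, const MapFn *Fn) : Idx(Idx), Fn(Fn) {}
    // Dereferencing computes the element on demand from the index.
    std::string operator*() const { return (*Fn)(Idx); }
    iterator &operator++() {
      ++Idx;
      return *this;
    }
    bool operator!=(const iterator &RHS) const { return Idx != RHS.Idx; }
  };

  IndexMappedRange(std::size_t Size, MapFn Fn)
      : Size(Size), Fn(std::move(Fn)) {}
  iterator begin() const { return iterator(0, &Fn); }
  iterator end() const { return iterator(Size, &Fn); }

private:
  std::size_t Size;
  MapFn Fn;
};

// Hypothetical phi stand-in exposing its incoming blocks as such a range.
struct Phi {
  std::vector<std::string> Preds;

  std::string getIncomingBlock(std::size_t Idx) const { return Preds[Idx]; }

  IndexMappedRange incomingBlocks() const {
    return IndexMappedRange(Preds.size(), [this](std::size_t Idx) {
      return getIncomingBlock(Idx);
    });
  }
};

inline std::string joinIncomingBlocks(const Phi &P) {
  std::string Out;
  for (const std::string &BB : P.incomingBlocks()) {
    Out += BB;
    Out += ';';
  }
  return Out;
}
```

If the underlying container already offers an iterator, as the reviewer asks about for predecessors, the callback indirection could presumably be replaced by iterating that container directly; the sketch does not try to answer whether a compiler would fold the two.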


+  /// Returns an iterator range over pairs of incoming values and corresponding
+  /// incoming blocks.
+  detail::zippy<llvm::detail::zip_shortest, VPUser::const_operand_range,
+                const_incoming_blocks_range>
+  incoming_values_and_blocks() const {
Collaborator

Suffice to keep this method public, as getIncomingValuesAndBlocks(), complementing getIncomingValue(Idx) and getIncomingBlock(Idx)? Trying to reduce non_/camelCase inconsistency.

Contributor Author

For now, let's just add getIncomingValue and getIncomingBlock.

+    return zip(incoming_values(), incoming_blocks());
+  }
+};

/// An overlay for VPIRInstructions wrapping PHI nodes enabling convenient use
/// cast/dyn_cast/isa and execute() implementation. A single VPValue operand is
/// allowed, and it is used to add a new incoming value for the single
@@ -1976,7 +2029,8 @@ class VPWidenPointerInductionRecipe : public VPWidenInductionRecipe,
/// recipe is placed in an entry block to a (non-replicate) region, it must have
/// exactly 2 incoming values, the first from the predecessor of the region and
/// the second from the exiting block of the region.
-class VPWidenPHIRecipe : public VPSingleDefRecipe {
+class VPWidenPHIRecipe : public VPSingleDefRecipe,
+                         public VPPhiAccessors<VPWidenPHIRecipe> {
/// Name to use for the generated IR instruction for the widened phi.
std::string Name;

@@ -2007,12 +2061,6 @@ class VPWidenPHIRecipe : public VPSingleDefRecipe {
void print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const override;
#endif

-  /// Returns the \p I th incoming VPBasicBlock.
-  VPBasicBlock *getIncomingBlock(unsigned I);

-  /// Returns the \p I th incoming VPValue.
-  VPValue *getIncomingValue(unsigned I) { return getOperand(I); }
};

/// A recipe for handling first-order recurrence phis. The start value is the
49 changes: 30 additions & 19 deletions llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -1205,6 +1205,32 @@ void VPIRPhi::print(raw_ostream &O, const Twine &Indent,
}
#endif

+/// Returns the incoming block at index \p Idx for \p R. This handles both
+/// recipes placed in entry blocks of loop regions (incoming blocks are the
+/// region's predecessor and the region's exit) and other locations (incoming
+/// blocks are the direct predecessors).
+static const VPBasicBlock *getIncomingBlockForRecipe(const VPRecipeBase *R,
Contributor

Is it worth using the same Idx variable name as getIncomingBlock for consistency? Or vice-versa - rename all instances of Idx to I?

Contributor Author

Updated to use Idx consistently.

Collaborator

Should this be getCFGPredecessors(VPBasicBlock) or VPBasicBlock::getCFGPredecessors(), as iterators and/or single element by index? Complementing getHierarchicalPredecessors() and getPredecessors().

OTOH, if used only by VPPhiAccessors::getIncomingBlock(Idx), better inline it there. But by templating VPPhiAccessors, this would need to be replicated per instance, as in VPPhiAccessors<VPWidenPHIRecipe>::getIncomingBlock(Idx)?

This provides the CFG predecessor basic-blocks of a given block (rather than recipe), which could be in CFG mode (in which case they are held explicitly and can be cast from block to basic-block) or HCFG mode (in which case region header blocks need to collect the exiting basic-block of their region's predecessor and the region's own exiting basic-block).

Contributor Author

Yep, had a similar thought, but wasn't sure what a good name would be. Added as getCFGPredecessor(Idx) to start with. Could add an iterator version separately when other iterators are added.

+                                                     unsigned Idx) {
+  const VPBasicBlock *Parent = R->getParent();
+  const VPBlockBase *Pred = nullptr;
+  if (Parent->getNumPredecessors() > 0) {
+    Pred = Parent->getPredecessors()[Idx];
+  } else {
+    auto *Region = Parent->getParent();
+    assert(Region && !Region->isReplicator() && Region->getEntry() == Parent &&
+           "must be in the entry block of a non-replicate region");
+    assert(
+        Idx < 2 && R->getNumOperands() == 2 &&
+        "when placed in an entry block, only 2 incoming blocks are available");

+    // Idx == 0 selects the predecessor of the region, Idx == 1 selects the
+    // region itself whose exiting block feeds the phi across the backedge.
+    Pred = Idx == 0 ? Region->getSinglePredecessor() : Region;
+  }

+  return Pred->getExitingBasicBlock();
+}

void VPIRMetadata::applyMetadata(Instruction &I) const {
for (const auto &[Kind, Node] : Metadata)
I.setMetadata(Kind, Node);
@@ -3694,25 +3720,10 @@ void VPReductionPHIRecipe::print(raw_ostream &O, const Twine &Indent,
}
#endif

-VPBasicBlock *VPWidenPHIRecipe::getIncomingBlock(unsigned I) {
Collaborator

Better keep delegating the implementation of getIncomingBlock(Idx) to recipes that inherit from VPPhiAccessors?

Contributor Author

This is now providing a specialized definition of VPPhiAccessors<VPWidenPHIRecipe>::getIncomingBlock, which is needed to instantiate the template for the type and use the generic getIncomingBlockForRecipe

-  VPBasicBlock *Parent = getParent();
-  VPBlockBase *Pred = nullptr;
-  if (Parent->getNumPredecessors() > 0) {
-    Pred = Parent->getPredecessors()[I];
-  } else {
-    auto *Region = Parent->getParent();
-    assert(Region && !Region->isReplicator() && Region->getEntry() == Parent &&
-           "must be in the entry block of a non-replicate region");
-    assert(
-        I < 2 && getNumOperands() == 2 &&
-        "when placed in an entry block, only 2 incoming blocks are available");

-    // I == 0 selects the predecessor of the region, I == 1 selects the region
-    // itself whose exiting block feeds the phi across the backedge.
-    Pred = I == 0 ? Region->getSinglePredecessor() : Region;
-  }

-  return Pred->getExitingBasicBlock();
+template <>
+const VPBasicBlock *
+VPPhiAccessors<VPWidenPHIRecipe>::getIncomingBlock(unsigned Idx) const {
+  return getIncomingBlockForRecipe(getAsRecipe(), Idx);
}
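The `template <>` definition above relies on explicitly specializing one member function of a class template. A minimal sketch of that language feature, using hypothetical names (`Accessors`, `WidenPhi`, `describeIncoming`, `describeForRecipe`) unrelated to VPlan, could look like this:

```cpp
#include <cassert>
#include <string>

// Hypothetical mix-in: the member is declared for every instantiation but has
// no generic definition, so each used type must supply a specialization.
template <typename RecipeTy> struct Accessors {
  std::string describeIncoming(unsigned Idx) const;
};

struct WidenPhi : Accessors<WidenPhi> {};

// Shared, non-templated helper that every specialization can forward to.
static std::string describeForRecipe(const std::string &Kind, unsigned Idx) {
  return Kind + "#" + std::to_string(Idx);
}

// Explicit specialization of the member for one concrete type. It must be
// visible before the first call that would instantiate the member.
template <>
std::string Accessors<WidenPhi>::describeIncoming(unsigned Idx) const {
  return describeForRecipe("widen-phi", Idx);
}
```

One consequence, also raised by the reviewer, is that each additional type using the templated mix-in needs its own specialization, even when all of them forward to the same helper.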

void VPWidenPHIRecipe::execute(VPTransformState &State) {