-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Handling tied operands in ternary pseudos in RISCVVLOptimizer #123760
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@llvm/issue-subscribers-backend-risc-v Author: Luke Lau (lukel97)
TL;DR: we can't express tail undefined with ternary pseudos
I've been exploring using RISCVVLOptimizer to reduce VL toggles with EVL tail folding. In some of the benchmarks we fail to reduce the AVL of a ternary instruction, causing a vl toggle e.g.: vsetvli a4, zero, e32, m2, ta, ma
vsext.vf2 v14, v10
vsetvli zero, a1, e32, m2, ta, ma
vmadd.vv v14, v12, v8 Although we're able to reduce the AVL of the vmadd.vv itself, we can't reduce the AVL of the vsext.vf2 because its user, vmadd.vv, is a tied operand:
We don't allow users that are passthrus because the tail elements past std::optional<MachineOperand> RISCVVLOptimizer::checkUsers(MachineInstr &MI) {
for (auto &UserOp : MRI->use_operands(MI.getOperand(0).getReg())) {
...
// Tied operands might pass through.
if (UserOp.isTied()) {
LLVM_DEBUG(dbgs() << " Abort because user used as tied operand\n");
return std::nullopt;
} At first glance we might think that we can relax this restriction if the user's policy is ta/ma. However I don't think this is correct because tail agnostic/mask agnostic != tail undefined/mask undefined, so we still need to preserve the passthru to get "previous or -1" for each element. I think the problem comes down to how for binary pseudos we represent tail undefined/mask undefined by using NoRegister for the passthru. For ternary pseudos the third operand is simultaneously also the passthru, so it's not possible to represent a tail undefined/mask undefined pseudo with a non-undefined third operand. This means that we lose the information that the inactive elements are undefined during instruction selection. One possible solution could be to add new _INACTIVE_UNDEF pseudos with a TSFlag that marks this fact, and use these for regular SDNode patterns. Then RISCVVLOptimizer (and RISCVInsertVSETVLI) could pick up the TSFlag and allow it as a user. |
… demanded The motivation for this to allow reducing the vl when a user is a ternary pseudo, where the third operand is tied and also acts as a passthru. When checking the users of an instruction, we currently bail if the user is used as a passthru because all of its elements past vl will be used for the tail. We can allow passthru users if we know the tail of their result isn't used, and we can reuse checkUsers to check this. It's worth noting that this is all irrelevant of the tail policy, because tail agnostic still ends up using the passthru. I've checked that SPEC CPU 2017 + llvm-test-suite pass with this. Fixes llvm#123760
… demanded (#124066) The motivation for this to allow reducing the vl when a user is a ternary pseudo, where the third operand is tied and also acts as a passthru. When checking the users of an instruction, we currently bail if the user is used as a passthru because all of its elements past vl will be used for the tail. We can allow passthru users if we know the tail of their result isn't used, which we will have computed beforehand after #124530 It's worth noting that this is all irrelevant of the tail policy, because tail agnostic still ends up using the passthru. I've checked that SPEC CPU 2017 + llvm-test-suite pass with this (on qemu with rvv_ta_all_1s=true) Fixes #123760
TL;DR: we can't express tail undefined with ternary pseudos
I've been exploring using RISCVVLOptimizer to reduce VL toggles with EVL tail folding. In some of the benchmarks we fail to reduce the AVL of a ternary instruction, causing a vl toggle e.g.:
Although we're able to reduce the AVL of the vmadd.vv itself, we can't reduce the AVL of the vsext.vf2 because its user, vmadd.vv, is a tied operand:
We don't allow users that are passthrus because the tail elements past
vl
will depend on them.At first glance we might think that we can relax this restriction if the user's policy is ta/ma. However I don't think this is correct because tail agnostic/mask agnostic != tail undefined/mask undefined, so we still need to preserve the passthru to get "previous or -1" for each element.
I think the problem comes down to how for binary pseudos we represent tail undefined/mask undefined by using NoRegister for the passthru.
For ternary pseudos the third operand is simultaneously also the passthru, so it's not possible to represent a tail undefined/mask undefined pseudo with a non-undefined third operand.
This means that we lose the information that the inactive elements are undefined during instruction selection. One possible solution could be to add new _INACTIVE_UNDEF pseudos with a TSFlag that marks this fact, and use these for regular SDNode patterns. Then RISCVVLOptimizer (and RISCVInsertVSETVLI) could pick up the TSFlag and allow it as a user.
The text was updated successfully, but these errors were encountered: