Skip to content

Handling tied operands in ternary pseudos in RISCVVLOptimizer #123760

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
lukel97 opened this issue Jan 21, 2025 · 1 comment · Fixed by #124066
Closed

Handling tied operands in ternary pseudos in RISCVVLOptimizer #123760

lukel97 opened this issue Jan 21, 2025 · 1 comment · Fixed by #124066
Assignees

Comments

@lukel97
Copy link
Contributor

lukel97 commented Jan 21, 2025

TL;DR: we can't express tail undefined with ternary pseudos

I've been exploring using RISCVVLOptimizer to reduce VL toggles with EVL tail folding. In some of the benchmarks we fail to reduce the AVL of a ternary instruction, causing a vl toggle e.g.:

        vsetvli a4, zero, e32, m2, ta, ma
        vsext.vf2       v14, v10
        vsetvli zero, a1, e32, m2, ta, ma
        vmadd.vv        v14, v12, v8

Although we're able to reduce the AVL of the vmadd.vv itself, we can't reduce the AVL of the vsext.vf2 because its user, vmadd.vv, is a tied operand:

Trying to reduce VL for early-clobber %43:vrm2 = PseudoVSEXT_VF2_M2 $noreg(tied-
def 0), killed %42:vr, -1, 5, 3

  Checking user: %49:vrm2 = nsw PseudoVMADD_VV_M2 %43:vrm2(tied-def 0), killed %
47:vrm2, %48:vrm2, %40:gprnox0, 5, 1

    Abort because user used as tied operand

We don't allow users that are passthrus because the tail elements past vl will depend on them.

std::optional<MachineOperand> RISCVVLOptimizer::checkUsers(MachineInstr &MI) {
  for (auto &UserOp : MRI->use_operands(MI.getOperand(0).getReg())) {
    ...
    // Tied operands might pass through.
    if (UserOp.isTied()) {
      LLVM_DEBUG(dbgs() << "    Abort because user used as tied operand\n");
      return std::nullopt;
    }

At first glance we might think that we can relax this restriction if the user's policy is ta/ma. However I don't think this is correct because tail agnostic/mask agnostic != tail undefined/mask undefined, so we still need to preserve the passthru to get "previous or -1" for each element.

I think the problem comes down to how for binary pseudos we represent tail undefined/mask undefined by using NoRegister for the passthru.

For ternary pseudos the third operand is simultaneously also the passthru, so it's not possible to represent a tail undefined/mask undefined pseudo with a non-undefined third operand.

This means that we lose the information that the inactive elements are undefined during instruction selection. One possible solution could be to add new _INACTIVE_UNDEF pseudos with a TSFlag that marks this fact, and use these for regular SDNode patterns. Then RISCVVLOptimizer (and RISCVInsertVSETVLI) could pick up the TSFlag and allow it as a user.

@llvmbot
Copy link
Member

llvmbot commented Jan 21, 2025

@llvm/issue-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

TL;DR: we can't express tail undefined with ternary pseudos

I've been exploring using RISCVVLOptimizer to reduce VL toggles with EVL tail folding. In some of the benchmarks we fail to reduce the AVL of a ternary instruction, causing a vl toggle e.g.:

        vsetvli a4, zero, e32, m2, ta, ma
        vsext.vf2       v14, v10
        vsetvli zero, a1, e32, m2, ta, ma
        vmadd.vv        v14, v12, v8

Although we're able to reduce the AVL of the vmadd.vv itself, we can't reduce the AVL of the vsext.vf2 because its user, vmadd.vv, is a tied operand:

Trying to reduce VL for early-clobber %43:vrm2 = PseudoVSEXT_VF2_M2 $noreg(tied-
def 0), killed %42:vr, -1, 5, 3

  Checking user: %49:vrm2 = nsw PseudoVMADD_VV_M2 %43:vrm2(tied-def 0), killed %
47:vrm2, %48:vrm2, %40:gprnox0, 5, 1

    Abort because user used as tied operand

We don't allow users that are passthrus because the tail elements past vl will depend on them.

std::optional&lt;MachineOperand&gt; RISCVVLOptimizer::checkUsers(MachineInstr &amp;MI) {
  for (auto &amp;UserOp : MRI-&gt;use_operands(MI.getOperand(0).getReg())) {
    ...
    // Tied operands might pass through.
    if (UserOp.isTied()) {
      LLVM_DEBUG(dbgs() &lt;&lt; "    Abort because user used as tied operand\n");
      return std::nullopt;
    }

At first glance we might think that we can relax this restriction if the user's policy is ta/ma. However I don't think this is correct because tail agnostic/mask agnostic != tail undefined/mask undefined, so we still need to preserve the passthru to get "previous or -1" for each element.

I think the problem comes down to how for binary pseudos we represent tail undefined/mask undefined by using NoRegister for the passthru.

For ternary pseudos the third operand is simultaneously also the passthru, so it's not possible to represent a tail undefined/mask undefined pseudo with a non-undefined third operand.

This means that we lose the information that the inactive elements are undefined during instruction selection. One possible solution could be to add new _INACTIVE_UNDEF pseudos with a TSFlag that marks this fact, and use these for regular SDNode patterns. Then RISCVVLOptimizer (and RISCVInsertVSETVLI) could pick up the TSFlag and allow it as a user.

@lukel97 lukel97 changed the title Handling tied operands in RISCVVLOptimizer Handling tied operands in ternary pseudos in RISCVVLOptimizer Jan 21, 2025
@lukel97 lukel97 self-assigned this Jan 22, 2025
lukel97 added a commit to lukel97/llvm-project that referenced this issue Jan 23, 2025
… demanded

The motivation for this to allow reducing the vl when a user is a ternary pseudo, where the third operand is tied and also acts as a passthru.

When checking the users of an instruction, we currently bail if the user is used as a passthru because all of its elements past vl will be used for the tail.

We can allow passthru users if we know the tail of their result isn't used, and we can reuse checkUsers to check this.

It's worth noting that this is all irrelevant of the tail policy, because tail agnostic still ends up using the passthru.

I've checked that SPEC CPU 2017 + llvm-test-suite pass with this.

Fixes llvm#123760
lukel97 added a commit that referenced this issue Jan 30, 2025
… demanded (#124066)

The motivation for this to allow reducing the vl when a user is a
ternary pseudo, where the third operand is tied and also acts as a
passthru.

When checking the users of an instruction, we currently bail if the user
is used as a passthru because all of its elements past vl will be used
for the tail.

We can allow passthru users if we know the tail of their result isn't
used, which we will have computed beforehand after #124530

It's worth noting that this is all irrelevant of the tail policy,
because tail agnostic still ends up using the passthru.

I've checked that SPEC CPU 2017 + llvm-test-suite pass with this (on
qemu with rvv_ta_all_1s=true)

Fixes #123760
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants