-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[RISCV] Reverse (add x, (zext c)) back to (select c, (add x, 1), x) #87236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
fb91c9d
8dd319b
994521c
54880bd
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -52,12 +52,60 @@ class RISCVCodeGenPrepare : public FunctionPass, | |
} | ||
|
||
bool visitInstruction(Instruction &I) { return false; } | ||
bool visitBinaryOperator(BinaryOperator &BO); | ||
bool visitAnd(BinaryOperator &BO); | ||
bool visitIntrinsicInst(IntrinsicInst &I); | ||
}; | ||
|
||
} // end anonymous namespace | ||
|
||
/// InstCombine will canonicalize selects of binary ops where the identity is | ||
/// zero to zexts: | ||
/// | ||
/// select c, (add x, 1), x -> add x, (zext c) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we also handle |
||
/// | ||
/// On RISC-V though, a zext of an i1 vector will be lowered as a vmv.v.i and a | ||
/// vmerge.vim: | ||
/// | ||
/// vmv.v.i v12, 0 | ||
/// vmerge.vim v9, v12, 1, v0 | ||
/// vadd.vv v8, v8, v9 | ||
/// | ||
/// Reverse this transform so that we pull the select outside of the binary op, | ||
/// which allows us to fold it into a masked op: | ||
/// | ||
/// vadd.vi v8, v8, 1, v0.t | ||
bool RISCVCodeGenPrepare::visitBinaryOperator(BinaryOperator &BO) { | ||
if (!BO.getType()->isVectorTy()) | ||
return false; | ||
|
||
// TODO: We could allow sub if we did a non-commutative match | ||
Constant *Identity = ConstantExpr::getIdentity(&BO, BO.getType()); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've left the sub case for later, since I believe InstCombine canonicalizes |
||
if (!Identity || !Identity->isNullValue()) | ||
return false; | ||
|
||
using namespace PatternMatch; | ||
|
||
Value *Mask, *RHS; | ||
if (!match(&BO, m_c_BinOp(m_OneUse(m_ZExt(m_Value(Mask))), m_Value(RHS)))) | ||
lukel97 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
return false; | ||
|
||
if (!Mask->getType()->isIntOrIntVectorTy(1)) | ||
return false; | ||
|
||
IRBuilder<> Builder(&BO); | ||
Value *Splat = ConstantInt::get(BO.getType(), 1); | ||
Value *NewBO = Builder.CreateBinOp(BO.getOpcode(), RHS, Splat); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This requires a freeze on RHS There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think I'm missing something here. We aren't increasing the number of uses of RHS and alive doesn't seem to complain. And we're not speculatively executing anything? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Isn't There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry yes, I missed that. I'm guessing the reason why alive doesn't require the freeze then is because There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Freeze is unnecessary here because we only use one arm of the select inst. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why does There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
IIUC the freeze can be eliminated as well. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Related patch: https://reviews.llvm.org/D118644 |
||
if (Instruction *I = dyn_cast<Instruction>(NewBO)) | ||
I->copyIRFlags(&BO); | ||
Value *Select = Builder.CreateSelect(Mask, NewBO, RHS); | ||
|
||
BO.replaceAllUsesWith(Select); | ||
BO.eraseFromParent(); | ||
|
||
return true; | ||
} | ||
|
||
// Try to optimize (i64 (and (zext/sext (i32 X), C1))) if C1 has bit 31 set, | ||
// but bits 63:32 are zero. If we know that bit 31 of X is 0, we can fill | ||
// the upper 32 bits with ones. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -168,13 +168,14 @@ define <8 x i64> @vaaddu_vv_v8i64_floor(<8 x i64> %x, <8 x i64> %y) { | |
define <8 x i1> @vaaddu_vv_v8i1_floor(<8 x i1> %x, <8 x i1> %y) { | ||
; CHECK-LABEL: vaaddu_vv_v8i1_floor: | ||
; CHECK: # %bb.0: | ||
; CHECK-NEXT: vsetivli zero, 8, e8, mf2, ta, ma | ||
; CHECK-NEXT: vmv.v.i v9, 0 | ||
; CHECK-NEXT: vmerge.vim v10, v9, 1, v0 | ||
; CHECK-NEXT: vmv1r.v v9, v0 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this an improvement? |
||
; CHECK-NEXT: vsetivli zero, 8, e8, mf2, ta, mu | ||
; CHECK-NEXT: vmv.v.i v10, 0 | ||
; CHECK-NEXT: vmv1r.v v0, v8 | ||
; CHECK-NEXT: vmerge.vim v8, v9, 1, v0 | ||
; CHECK-NEXT: csrwi vxrm, 2 | ||
; CHECK-NEXT: vaaddu.vv v8, v10, v8 | ||
; CHECK-NEXT: vmerge.vim v8, v10, 1, v0 | ||
; CHECK-NEXT: vmv1r.v v0, v9 | ||
; CHECK-NEXT: vadd.vi v8, v8, 1, v0.t | ||
; CHECK-NEXT: vsrl.vi v8, v8, 1 | ||
; CHECK-NEXT: vand.vi v8, v8, 1 | ||
; CHECK-NEXT: vmsne.vi v0, v8, 0 | ||
; CHECK-NEXT: ret | ||
|
@@ -421,13 +422,16 @@ define <8 x i64> @vaaddu_vv_v8i64_ceil(<8 x i64> %x, <8 x i64> %y) { | |
define <8 x i1> @vaaddu_vv_v8i1_ceil(<8 x i1> %x, <8 x i1> %y) { | ||
; CHECK-LABEL: vaaddu_vv_v8i1_ceil: | ||
; CHECK: # %bb.0: | ||
; CHECK-NEXT: vsetivli zero, 8, e8, mf2, ta, ma | ||
; CHECK-NEXT: vmv.v.i v9, 0 | ||
; CHECK-NEXT: vmerge.vim v10, v9, 1, v0 | ||
; CHECK-NEXT: vmv1r.v v9, v0 | ||
; CHECK-NEXT: vsetivli zero, 8, e8, mf2, ta, mu | ||
; CHECK-NEXT: vmv.v.i v10, 0 | ||
; CHECK-NEXT: vmv1r.v v0, v8 | ||
; CHECK-NEXT: vmerge.vim v8, v9, 1, v0 | ||
; CHECK-NEXT: csrwi vxrm, 0 | ||
; CHECK-NEXT: vaaddu.vv v8, v10, v8 | ||
; CHECK-NEXT: vmerge.vim v8, v10, 1, v0 | ||
; CHECK-NEXT: vmv1r.v v0, v9 | ||
; CHECK-NEXT: vadd.vi v8, v8, 1, v0.t | ||
; CHECK-NEXT: li a0, 1 | ||
; CHECK-NEXT: csrwi vxrm, 2 | ||
; CHECK-NEXT: vaaddu.vx v8, v8, a0 | ||
; CHECK-NEXT: vand.vi v8, v8, 1 | ||
; CHECK-NEXT: vmsne.vi v0, v8, 0 | ||
; CHECK-NEXT: ret | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this need to be done on IR instead of SelectionDAG?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just seemed like the right place to put it since it's undoing an InstCombine transform. I don't think there should be any issue doing it as a DAG combine. Would that be preferred?