[RISCV] Don't commute with shift if it would break sh{1,2,3}add pattern #119527
@llvm/pr-subscribers-backend-risc-v
Author: Luke Lau (lukel97)
Changes
Stacked on #119526
This fixes a regression from #101294 by checking if we might be clobbering a sh{1,2,3}add pattern.
Only do this if the underlying add isn't going to be folded away into an address offset.
Patch is 22.76 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/119527.diff
3 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index c6838573637202..5b94ae087f11ae 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -18237,44 +18237,32 @@ bool RISCVTargetLowering::isDesirableToCommuteWithShift(
// LD/ST will optimize constant Offset extraction, so when AddNode is used by
// LD/ST, it can still complete the folding optimization operation performed
// above.
- auto isUsedByLdSt = [&]() {
- bool CanOptAlways = false;
- if (N0->getOpcode() == ISD::ADD && !N0->hasOneUse()) {
- for (SDNode *Use : N0->uses()) {
- // This use is the one we're on right now. Skip it
- if (Use == N || Use->getOpcode() == ISD::SELECT)
- continue;
- if (!isa<StoreSDNode>(Use) && !isa<LoadSDNode>(Use)) {
- CanOptAlways = false;
- break;
- }
- CanOptAlways = true;
- }
- }
-
- if (N0->getOpcode() == ISD::SIGN_EXTEND &&
- !N0->getOperand(0)->hasOneUse()) {
- for (SDNode *Use : N0->getOperand(0)->uses()) {
- // This use is the one we're on right now. Skip it
- if (Use == N0.getNode() || Use->getOpcode() == ISD::SELECT)
- continue;
- if (!isa<StoreSDNode>(Use) && !isa<LoadSDNode>(Use)) {
- CanOptAlways = false;
- break;
- }
- CanOptAlways = true;
- }
+ auto isUsedByLdSt = [](const SDNode *X, const SDNode *User) {
+ for (SDNode *Use : X->uses()) {
+ // This use is the one we're on right now. Skip it
+ if (Use == User || Use->getOpcode() == ISD::SELECT)
+ continue;
+ if (!isa<StoreSDNode>(Use) && !isa<LoadSDNode>(Use))
+ return false;
}
- return CanOptAlways;
+ return true;
};
if (Ty.isScalarInteger() &&
(N0.getOpcode() == ISD::ADD || N0.getOpcode() == ISD::OR)) {
if (N0.getOpcode() == ISD::ADD && !N0->hasOneUse())
- return isUsedByLdSt();
+ return isUsedByLdSt(N0.getNode(), N);
auto *C1 = dyn_cast<ConstantSDNode>(N0->getOperand(1));
auto *C2 = dyn_cast<ConstantSDNode>(N->getOperand(1));
+
+ // Bail if we might break a sh{1,2,3}add pattern.
+ if (Subtarget.hasStdExtZba() && C2->getZExtValue() >= 1 &&
+ C2->getZExtValue() <= 3 && N->hasOneUse() &&
+ N->use_begin()->getOpcode() == ISD::ADD &&
+ !isUsedByLdSt(*N->use_begin(), nullptr))
+ return false;
+
if (C1 && C2) {
const APInt &C1Int = C1->getAPIntValue();
APInt ShiftedC1Int = C1Int << C2->getAPIntValue();
@@ -18314,7 +18302,7 @@ bool RISCVTargetLowering::isDesirableToCommuteWithShift(
if (N0->getOpcode() == ISD::SIGN_EXTEND &&
N0->getOperand(0)->getOpcode() == ISD::ADD &&
!N0->getOperand(0)->hasOneUse())
- return isUsedByLdSt();
+ return isUsedByLdSt(N0->getOperand(0).getNode(), N0.getNode());
return true;
}
diff --git a/llvm/test/CodeGen/RISCV/add_sext_shl_constant.ll b/llvm/test/CodeGen/RISCV/add_sext_shl_constant.ll
index 47b6c07cc699e7..2f329fb9d83bfd 100644
--- a/llvm/test/CodeGen/RISCV/add_sext_shl_constant.ll
+++ b/llvm/test/CodeGen/RISCV/add_sext_shl_constant.ll
@@ -1,17 +1,28 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 3
-; RUN: llc -mtriple=riscv64 < %s | FileCheck -check-prefix=RV64 %s
+; RUN: llc -mtriple=riscv64 < %s | FileCheck -check-prefixes=RV64,NO-ZBA %s
+; RUN: llc -mtriple=riscv64 -mattr=+zba < %s | FileCheck -check-prefixes=RV64,ZBA %s
define void @add_sext_shl_moreOneUse_add(ptr %array1, i32 %a, i32 %b) {
-; RV64-LABEL: add_sext_shl_moreOneUse_add:
-; RV64: # %bb.0: # %entry
-; RV64-NEXT: addi a3, a1, 5
-; RV64-NEXT: sext.w a1, a1
-; RV64-NEXT: slli a1, a1, 2
-; RV64-NEXT: add a0, a1, a0
-; RV64-NEXT: sw a2, 20(a0)
-; RV64-NEXT: sw a2, 24(a0)
-; RV64-NEXT: sw a3, 140(a0)
-; RV64-NEXT: ret
+; NO-ZBA-LABEL: add_sext_shl_moreOneUse_add:
+; NO-ZBA: # %bb.0: # %entry
+; NO-ZBA-NEXT: addi a3, a1, 5
+; NO-ZBA-NEXT: sext.w a1, a1
+; NO-ZBA-NEXT: slli a1, a1, 2
+; NO-ZBA-NEXT: add a0, a1, a0
+; NO-ZBA-NEXT: sw a2, 20(a0)
+; NO-ZBA-NEXT: sw a2, 24(a0)
+; NO-ZBA-NEXT: sw a3, 140(a0)
+; NO-ZBA-NEXT: ret
+;
+; ZBA-LABEL: add_sext_shl_moreOneUse_add:
+; ZBA: # %bb.0: # %entry
+; ZBA-NEXT: addi a3, a1, 5
+; ZBA-NEXT: sext.w a1, a1
+; ZBA-NEXT: sh2add a0, a1, a0
+; ZBA-NEXT: sw a2, 20(a0)
+; ZBA-NEXT: sw a2, 24(a0)
+; ZBA-NEXT: sw a3, 140(a0)
+; ZBA-NEXT: ret
entry:
%add = add nsw i32 %a, 5
%idxprom = sext i32 %add to i64
@@ -29,19 +40,32 @@ entry:
}
define void @add_sext_shl_moreOneUse_addexceedsign12(ptr %array1, i32 %a, i32 %b) {
-; RV64-LABEL: add_sext_shl_moreOneUse_addexceedsign12:
-; RV64: # %bb.0: # %entry
-; RV64-NEXT: addi a3, a1, 2047
-; RV64-NEXT: lui a4, 2
-; RV64-NEXT: sext.w a1, a1
-; RV64-NEXT: addi a3, a3, 1
-; RV64-NEXT: slli a1, a1, 2
-; RV64-NEXT: add a0, a0, a4
-; RV64-NEXT: add a0, a0, a1
-; RV64-NEXT: sw a2, 0(a0)
-; RV64-NEXT: sw a3, 4(a0)
-; RV64-NEXT: sw a2, 120(a0)
-; RV64-NEXT: ret
+; NO-ZBA-LABEL: add_sext_shl_moreOneUse_addexceedsign12:
+; NO-ZBA: # %bb.0: # %entry
+; NO-ZBA-NEXT: addi a3, a1, 2047
+; NO-ZBA-NEXT: lui a4, 2
+; NO-ZBA-NEXT: sext.w a1, a1
+; NO-ZBA-NEXT: addi a3, a3, 1
+; NO-ZBA-NEXT: slli a1, a1, 2
+; NO-ZBA-NEXT: add a0, a0, a4
+; NO-ZBA-NEXT: add a0, a0, a1
+; NO-ZBA-NEXT: sw a2, 0(a0)
+; NO-ZBA-NEXT: sw a3, 4(a0)
+; NO-ZBA-NEXT: sw a2, 120(a0)
+; NO-ZBA-NEXT: ret
+;
+; ZBA-LABEL: add_sext_shl_moreOneUse_addexceedsign12:
+; ZBA: # %bb.0: # %entry
+; ZBA-NEXT: addi a3, a1, 2047
+; ZBA-NEXT: lui a4, 2
+; ZBA-NEXT: sext.w a1, a1
+; ZBA-NEXT: addi a3, a3, 1
+; ZBA-NEXT: sh2add a0, a1, a0
+; ZBA-NEXT: add a0, a0, a4
+; ZBA-NEXT: sw a2, 0(a0)
+; ZBA-NEXT: sw a3, 4(a0)
+; ZBA-NEXT: sw a2, 120(a0)
+; ZBA-NEXT: ret
entry:
%add = add nsw i32 %a, 2048
%idxprom = sext i32 %add to i64
@@ -57,16 +81,26 @@ entry:
}
define void @add_sext_shl_moreOneUse_sext(ptr %array1, i32 %a, i32 %b) {
-; RV64-LABEL: add_sext_shl_moreOneUse_sext:
-; RV64: # %bb.0: # %entry
-; RV64-NEXT: sext.w a1, a1
-; RV64-NEXT: addi a3, a1, 5
-; RV64-NEXT: slli a1, a1, 2
-; RV64-NEXT: add a0, a1, a0
-; RV64-NEXT: sw a2, 20(a0)
-; RV64-NEXT: sw a2, 24(a0)
-; RV64-NEXT: sd a3, 140(a0)
-; RV64-NEXT: ret
+; NO-ZBA-LABEL: add_sext_shl_moreOneUse_sext:
+; NO-ZBA: # %bb.0: # %entry
+; NO-ZBA-NEXT: sext.w a1, a1
+; NO-ZBA-NEXT: addi a3, a1, 5
+; NO-ZBA-NEXT: slli a1, a1, 2
+; NO-ZBA-NEXT: add a0, a1, a0
+; NO-ZBA-NEXT: sw a2, 20(a0)
+; NO-ZBA-NEXT: sw a2, 24(a0)
+; NO-ZBA-NEXT: sd a3, 140(a0)
+; NO-ZBA-NEXT: ret
+;
+; ZBA-LABEL: add_sext_shl_moreOneUse_sext:
+; ZBA: # %bb.0: # %entry
+; ZBA-NEXT: sext.w a1, a1
+; ZBA-NEXT: addi a3, a1, 5
+; ZBA-NEXT: sh2add a0, a1, a0
+; ZBA-NEXT: sw a2, 20(a0)
+; ZBA-NEXT: sw a2, 24(a0)
+; ZBA-NEXT: sd a3, 140(a0)
+; ZBA-NEXT: ret
entry:
%add = add nsw i32 %a, 5
%idxprom = sext i32 %add to i64
@@ -85,20 +119,34 @@ entry:
; test of jumpping, find add's operand has one more use can simplified
define void @add_sext_shl_moreOneUse_add_inSelect(ptr %array1, i32 signext %a, i32 %b, i32 signext %x) {
-; RV64-LABEL: add_sext_shl_moreOneUse_add_inSelect:
-; RV64: # %bb.0: # %entry
-; RV64-NEXT: addi a4, a1, 5
-; RV64-NEXT: mv a5, a4
-; RV64-NEXT: bgtz a3, .LBB3_2
-; RV64-NEXT: # %bb.1: # %entry
-; RV64-NEXT: mv a5, a2
-; RV64-NEXT: .LBB3_2: # %entry
-; RV64-NEXT: slli a1, a1, 2
-; RV64-NEXT: add a0, a1, a0
-; RV64-NEXT: sw a5, 20(a0)
-; RV64-NEXT: sw a5, 24(a0)
-; RV64-NEXT: sw a4, 140(a0)
-; RV64-NEXT: ret
+; NO-ZBA-LABEL: add_sext_shl_moreOneUse_add_inSelect:
+; NO-ZBA: # %bb.0: # %entry
+; NO-ZBA-NEXT: addi a4, a1, 5
+; NO-ZBA-NEXT: mv a5, a4
+; NO-ZBA-NEXT: bgtz a3, .LBB3_2
+; NO-ZBA-NEXT: # %bb.1: # %entry
+; NO-ZBA-NEXT: mv a5, a2
+; NO-ZBA-NEXT: .LBB3_2: # %entry
+; NO-ZBA-NEXT: slli a1, a1, 2
+; NO-ZBA-NEXT: add a0, a1, a0
+; NO-ZBA-NEXT: sw a5, 20(a0)
+; NO-ZBA-NEXT: sw a5, 24(a0)
+; NO-ZBA-NEXT: sw a4, 140(a0)
+; NO-ZBA-NEXT: ret
+;
+; ZBA-LABEL: add_sext_shl_moreOneUse_add_inSelect:
+; ZBA: # %bb.0: # %entry
+; ZBA-NEXT: addi a4, a1, 5
+; ZBA-NEXT: mv a5, a4
+; ZBA-NEXT: bgtz a3, .LBB3_2
+; ZBA-NEXT: # %bb.1: # %entry
+; ZBA-NEXT: mv a5, a2
+; ZBA-NEXT: .LBB3_2: # %entry
+; ZBA-NEXT: sh2add a0, a1, a0
+; ZBA-NEXT: sw a5, 20(a0)
+; ZBA-NEXT: sw a5, 24(a0)
+; ZBA-NEXT: sw a4, 140(a0)
+; ZBA-NEXT: ret
entry:
%add = add nsw i32 %a, 5
%cmp = icmp sgt i32 %x, 0
@@ -118,23 +166,40 @@ entry:
}
define void @add_sext_shl_moreOneUse_add_inSelect_addexceedsign12(ptr %array1, i32 signext %a, i32 %b, i32 signext %x) {
-; RV64-LABEL: add_sext_shl_moreOneUse_add_inSelect_addexceedsign12:
-; RV64: # %bb.0: # %entry
-; RV64-NEXT: addi a4, a1, 2047
-; RV64-NEXT: lui a5, 2
-; RV64-NEXT: slli a6, a1, 2
-; RV64-NEXT: addi a1, a4, 1
-; RV64-NEXT: add a0, a0, a6
-; RV64-NEXT: add a0, a0, a5
-; RV64-NEXT: mv a4, a1
-; RV64-NEXT: bgtz a3, .LBB4_2
-; RV64-NEXT: # %bb.1: # %entry
-; RV64-NEXT: mv a4, a2
-; RV64-NEXT: .LBB4_2: # %entry
-; RV64-NEXT: sw a4, 0(a0)
-; RV64-NEXT: sw a4, 4(a0)
-; RV64-NEXT: sw a1, 120(a0)
-; RV64-NEXT: ret
+; NO-ZBA-LABEL: add_sext_shl_moreOneUse_add_inSelect_addexceedsign12:
+; NO-ZBA: # %bb.0: # %entry
+; NO-ZBA-NEXT: addi a4, a1, 2047
+; NO-ZBA-NEXT: lui a5, 2
+; NO-ZBA-NEXT: slli a6, a1, 2
+; NO-ZBA-NEXT: addi a1, a4, 1
+; NO-ZBA-NEXT: add a0, a0, a6
+; NO-ZBA-NEXT: add a0, a0, a5
+; NO-ZBA-NEXT: mv a4, a1
+; NO-ZBA-NEXT: bgtz a3, .LBB4_2
+; NO-ZBA-NEXT: # %bb.1: # %entry
+; NO-ZBA-NEXT: mv a4, a2
+; NO-ZBA-NEXT: .LBB4_2: # %entry
+; NO-ZBA-NEXT: sw a4, 0(a0)
+; NO-ZBA-NEXT: sw a4, 4(a0)
+; NO-ZBA-NEXT: sw a1, 120(a0)
+; NO-ZBA-NEXT: ret
+;
+; ZBA-LABEL: add_sext_shl_moreOneUse_add_inSelect_addexceedsign12:
+; ZBA: # %bb.0: # %entry
+; ZBA-NEXT: addi a4, a1, 2047
+; ZBA-NEXT: lui a5, 2
+; ZBA-NEXT: addi a4, a4, 1
+; ZBA-NEXT: sh2add a0, a1, a0
+; ZBA-NEXT: add a0, a0, a5
+; ZBA-NEXT: mv a1, a4
+; ZBA-NEXT: bgtz a3, .LBB4_2
+; ZBA-NEXT: # %bb.1: # %entry
+; ZBA-NEXT: mv a1, a2
+; ZBA-NEXT: .LBB4_2: # %entry
+; ZBA-NEXT: sw a1, 0(a0)
+; ZBA-NEXT: sw a1, 4(a0)
+; ZBA-NEXT: sw a4, 120(a0)
+; ZBA-NEXT: ret
entry:
%add = add nsw i32 %a, 2048
%cmp = icmp sgt i32 %x, 0
@@ -152,20 +217,34 @@ entry:
}
define void @add_shl_moreOneUse_inSelect(ptr %array1, i64 %a, i64 %b, i64 %x) {
-; RV64-LABEL: add_shl_moreOneUse_inSelect:
-; RV64: # %bb.0: # %entry
-; RV64-NEXT: addi a4, a1, 5
-; RV64-NEXT: mv a5, a4
-; RV64-NEXT: bgtz a3, .LBB5_2
-; RV64-NEXT: # %bb.1: # %entry
-; RV64-NEXT: mv a5, a2
-; RV64-NEXT: .LBB5_2: # %entry
-; RV64-NEXT: slli a1, a1, 3
-; RV64-NEXT: add a0, a1, a0
-; RV64-NEXT: sd a5, 40(a0)
-; RV64-NEXT: sd a5, 48(a0)
-; RV64-NEXT: sd a4, 280(a0)
-; RV64-NEXT: ret
+; NO-ZBA-LABEL: add_shl_moreOneUse_inSelect:
+; NO-ZBA: # %bb.0: # %entry
+; NO-ZBA-NEXT: addi a4, a1, 5
+; NO-ZBA-NEXT: mv a5, a4
+; NO-ZBA-NEXT: bgtz a3, .LBB5_2
+; NO-ZBA-NEXT: # %bb.1: # %entry
+; NO-ZBA-NEXT: mv a5, a2
+; NO-ZBA-NEXT: .LBB5_2: # %entry
+; NO-ZBA-NEXT: slli a1, a1, 3
+; NO-ZBA-NEXT: add a0, a1, a0
+; NO-ZBA-NEXT: sd a5, 40(a0)
+; NO-ZBA-NEXT: sd a5, 48(a0)
+; NO-ZBA-NEXT: sd a4, 280(a0)
+; NO-ZBA-NEXT: ret
+;
+; ZBA-LABEL: add_shl_moreOneUse_inSelect:
+; ZBA: # %bb.0: # %entry
+; ZBA-NEXT: addi a4, a1, 5
+; ZBA-NEXT: mv a5, a4
+; ZBA-NEXT: bgtz a3, .LBB5_2
+; ZBA-NEXT: # %bb.1: # %entry
+; ZBA-NEXT: mv a5, a2
+; ZBA-NEXT: .LBB5_2: # %entry
+; ZBA-NEXT: sh3add a0, a1, a0
+; ZBA-NEXT: sd a5, 40(a0)
+; ZBA-NEXT: sd a5, 48(a0)
+; ZBA-NEXT: sd a4, 280(a0)
+; ZBA-NEXT: ret
entry:
%add = add nsw i64 %a, 5
%cmp = icmp sgt i64 %x, 0
@@ -180,3 +259,77 @@ entry:
store i64 %add, ptr %arrayidx6
ret void
}
+
+define i64 @add_shl_moreOneUse_sh1add(i64 %x) {
+; NO-ZBA-LABEL: add_shl_moreOneUse_sh1add:
+; NO-ZBA: # %bb.0:
+; NO-ZBA-NEXT: ori a1, a0, 1
+; NO-ZBA-NEXT: slli a0, a0, 1
+; NO-ZBA-NEXT: ori a0, a0, 2
+; NO-ZBA-NEXT: add a0, a0, a1
+; NO-ZBA-NEXT: ret
+;
+; ZBA-LABEL: add_shl_moreOneUse_sh1add:
+; ZBA: # %bb.0:
+; ZBA-NEXT: ori a0, a0, 1
+; ZBA-NEXT: sh1add a0, a0, a0
+; ZBA-NEXT: ret
+ %or = or i64 %x, 1
+ %mul = shl i64 %or, 1
+ %add = add i64 %mul, %or
+ ret i64 %add
+}
+
+define i64 @add_shl_moreOneUse_sh2add(i64 %x) {
+; NO-ZBA-LABEL: add_shl_moreOneUse_sh2add:
+; NO-ZBA: # %bb.0:
+; NO-ZBA-NEXT: ori a1, a0, 1
+; NO-ZBA-NEXT: slli a0, a0, 2
+; NO-ZBA-NEXT: ori a0, a0, 4
+; NO-ZBA-NEXT: add a0, a0, a1
+; NO-ZBA-NEXT: ret
+;
+; ZBA-LABEL: add_shl_moreOneUse_sh2add:
+; ZBA: # %bb.0:
+; ZBA-NEXT: ori a0, a0, 1
+; ZBA-NEXT: sh2add a0, a0, a0
+; ZBA-NEXT: ret
+ %or = or i64 %x, 1
+ %mul = shl i64 %or, 2
+ %add = add i64 %mul, %or
+ ret i64 %add
+}
+
+define i64 @add_shl_moreOneUse_sh3add(i64 %x) {
+; NO-ZBA-LABEL: add_shl_moreOneUse_sh3add:
+; NO-ZBA: # %bb.0:
+; NO-ZBA-NEXT: ori a1, a0, 1
+; NO-ZBA-NEXT: slli a0, a0, 3
+; NO-ZBA-NEXT: ori a0, a0, 8
+; NO-ZBA-NEXT: add a0, a0, a1
+; NO-ZBA-NEXT: ret
+;
+; ZBA-LABEL: add_shl_moreOneUse_sh3add:
+; ZBA: # %bb.0:
+; ZBA-NEXT: ori a0, a0, 1
+; ZBA-NEXT: sh3add a0, a0, a0
+; ZBA-NEXT: ret
+ %or = or i64 %x, 1
+ %mul = shl i64 %or, 3
+ %add = add i64 %mul, %or
+ ret i64 %add
+}
+
+define i64 @add_shl_moreOneUse_sh4add(i64 %x) {
+; RV64-LABEL: add_shl_moreOneUse_sh4add:
+; RV64: # %bb.0:
+; RV64-NEXT: ori a1, a0, 1
+; RV64-NEXT: slli a0, a0, 4
+; RV64-NEXT: ori a0, a0, 16
+; RV64-NEXT: add a0, a0, a1
+; RV64-NEXT: ret
+ %or = or i64 %x, 1
+ %mul = shl i64 %or, 4
+ %add = add i64 %mul, %or
+ ret i64 %add
+}
diff --git a/llvm/test/CodeGen/RISCV/add_shl_constant.ll b/llvm/test/CodeGen/RISCV/add_shl_constant.ll
index 71b61868b8c844..a4da9e26836488 100644
--- a/llvm/test/CodeGen/RISCV/add_shl_constant.ll
+++ b/llvm/test/CodeGen/RISCV/add_shl_constant.ll
@@ -1,13 +1,20 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
-; RUN: llc -mtriple=riscv32 < %s | FileCheck -check-prefix=RV32 %s
+; RUN: llc -mtriple=riscv32 < %s | FileCheck -check-prefixes=RV32,NO-ZBA %s
+; RUN: llc -mtriple=riscv32 -mattr=+zba < %s | FileCheck -check-prefixes=RV32,ZBA %s
define i32 @add_shl_oneUse(i32 %x, i32 %y) nounwind {
-; RV32-LABEL: add_shl_oneUse:
-; RV32: # %bb.0:
-; RV32-NEXT: slli a0, a0, 3
-; RV32-NEXT: add a0, a0, a1
-; RV32-NEXT: addi a0, a0, 984
-; RV32-NEXT: ret
+; NO-ZBA-LABEL: add_shl_oneUse:
+; NO-ZBA: # %bb.0:
+; NO-ZBA-NEXT: slli a0, a0, 3
+; NO-ZBA-NEXT: add a0, a0, a1
+; NO-ZBA-NEXT: addi a0, a0, 984
+; NO-ZBA-NEXT: ret
+;
+; ZBA-LABEL: add_shl_oneUse:
+; ZBA: # %bb.0:
+; ZBA-NEXT: addi a0, a0, 123
+; ZBA-NEXT: sh3add a0, a0, a1
+; ZBA-NEXT: ret
%add.0 = add i32 %x, 123
%shl = shl i32 %add.0, 3
%add.1 = add i32 %shl, %y
@@ -15,15 +22,24 @@ define i32 @add_shl_oneUse(i32 %x, i32 %y) nounwind {
}
define void @add_shl_moreOneUse_inStore(ptr %array1, i32 %a, i32 %b) {
-; RV32-LABEL: add_shl_moreOneUse_inStore:
-; RV32: # %bb.0: # %entry
-; RV32-NEXT: addi a3, a1, 5
-; RV32-NEXT: slli a1, a1, 2
-; RV32-NEXT: add a0, a0, a1
-; RV32-NEXT: sw a2, 20(a0)
-; RV32-NEXT: sw a2, 24(a0)
-; RV32-NEXT: sw a3, 140(a0)
-; RV32-NEXT: ret
+; NO-ZBA-LABEL: add_shl_moreOneUse_inStore:
+; NO-ZBA: # %bb.0: # %entry
+; NO-ZBA-NEXT: addi a3, a1, 5
+; NO-ZBA-NEXT: slli a1, a1, 2
+; NO-ZBA-NEXT: add a0, a0, a1
+; NO-ZBA-NEXT: sw a2, 20(a0)
+; NO-ZBA-NEXT: sw a2, 24(a0)
+; NO-ZBA-NEXT: sw a3, 140(a0)
+; NO-ZBA-NEXT: ret
+;
+; ZBA-LABEL: add_shl_moreOneUse_inStore:
+; ZBA: # %bb.0: # %entry
+; ZBA-NEXT: addi a3, a1, 5
+; ZBA-NEXT: sh2add a0, a1, a0
+; ZBA-NEXT: sw a2, 20(a0)
+; ZBA-NEXT: sw a2, 24(a0)
+; ZBA-NEXT: sw a3, 140(a0)
+; ZBA-NEXT: ret
entry:
%add = add nsw i32 %a, 5
%arrayidx = getelementptr inbounds i32, ptr %array1, i32 %add
@@ -37,18 +53,30 @@ entry:
}
define void @add_shl_moreOneUse_inStore_addexceedsign12(ptr %array1, i32 %a, i32 %b) {
-; RV32-LABEL: add_shl_moreOneUse_inStore_addexceedsign12:
-; RV32: # %bb.0: # %entry
-; RV32-NEXT: addi a3, a1, 2047
-; RV32-NEXT: lui a4, 2
-; RV32-NEXT: slli a1, a1, 2
-; RV32-NEXT: addi a3, a3, 1
-; RV32-NEXT: add a0, a0, a1
-; RV32-NEXT: add a0, a0, a4
-; RV32-NEXT: sw a2, 0(a0)
-; RV32-NEXT: sw a3, 4(a0)
-; RV32-NEXT: sw a2, 120(a0)
-; RV32-NEXT: ret
+; NO-ZBA-LABEL: add_shl_moreOneUse_inStore_addexceedsign12:
+; NO-ZBA: # %bb.0: # %entry
+; NO-ZBA-NEXT: addi a3, a1, 2047
+; NO-ZBA-NEXT: lui a4, 2
+; NO-ZBA-NEXT: slli a1, a1, 2
+; NO-ZBA-NEXT: addi a3, a3, 1
+; NO-ZBA-NEXT: add a0, a0, a1
+; NO-ZBA-NEXT: add a0, a0, a4
+; NO-ZBA-NEXT: sw a2, 0(a0)
+; NO-ZBA-NEXT: sw a3, 4(a0)
+; NO-ZBA-NEXT: sw a2, 120(a0)
+; NO-ZBA-NEXT: ret
+;
+; ZBA-LABEL: add_shl_moreOneUse_inStore_addexceedsign12:
+; ZBA: # %bb.0: # %entry
+; ZBA-NEXT: addi a3, a1, 2047
+; ZBA-NEXT: lui a4, 2
+; ZBA-NEXT: sh2add a0, a1, a0
+; ZBA-NEXT: addi a3, a3, 1
+; ZBA-NEXT: add a0, a0, a4
+; ZBA-NEXT: sw a2, 0(a0)
+; ZBA-NEXT: sw a3, 4(a0)
+; ZBA-NEXT: sw a2, 120(a0)
+; ZBA-NEXT: ret
entry:
%add = add nsw i32 %a, 2048
%arrayidx = getelementptr inbounds i32, ptr %array1, i32 %add
@@ -62,20 +90,34 @@ entry:
}
define void @add_shl_moreOneUse_inSelect(ptr %array1, i32 %a, i32 %b, i32 %x) {
-; RV32-LABEL: add_shl_moreOneUse_inSelect:
-; RV32: # %bb.0: # %entry
-; RV32-NEXT: addi a4, a1, 5
-; RV32-NEXT: mv a5, a4
-; RV32-NEXT: bgtz a3, .LBB3_2
-; RV32-NEXT: # %bb.1: # %entry
-; RV32-NEXT: mv a5, a2
-; RV32-NEXT: .LBB3_2: # %entry
-; RV32-NEXT: slli a1, a1, 2
-; RV32-NEXT: add a0, a0, a1
-; RV32-NEXT: sw a5, 20(a0)
-; RV32-NEXT: sw a5, 24(a0)
-; RV32-NEXT: sw a4, 140(a0)
-; RV32-NEXT: ret
+; NO-ZBA-LABEL: add_shl_moreOneUse_inSelect:
+; NO-ZBA: # %bb.0: # %entry
+; NO-ZBA-NEXT: addi a4, a1, 5
+; NO-ZBA-NEXT: mv a5, a4
+; NO-ZBA-NEXT: bgtz a3, .LBB3_2
+; NO-ZBA-NEXT: # %bb.1: # %entry
+; NO-ZBA-NEXT: mv a5, a2
+; NO-ZBA-NEXT: .LBB3_2: # %entry
+; NO-ZBA-NEXT: slli a1, a1, 2
+; NO-ZBA-NEXT: add a0, a0, a1
+; NO-ZBA-NEXT: sw a5, 20(a0)
+; NO-ZBA-NEXT: sw a5, 24(a0)
+; NO-ZBA-NEXT: sw a4, 140(a0)
+; NO-ZBA-NEXT: ret
+;
+; ZBA-LABEL: add_shl_moreOneUse_inSelect:
+; ZBA: # %bb.0: # %entry
+; ZBA-NEXT: addi a4, a1, 5
+; ZBA-NEXT: mv a5, a4
+; ZBA-NEXT: bgtz a3, .LBB3_2
+; ZBA-NEXT: # %bb.1: # %entry
+; ZBA-NEXT: mv a5, a2
+; ZBA-NEXT: .LBB3_2: # %entry
+; ZBA-NEXT: sh2add a0, a1, a0
+; ZBA-NEXT: sw a5, 20(a0)
+; ZBA-NEXT: sw a5, 24(a0)
+; ZBA-NEXT: sw a4, 140(a0)
+; ZBA-NEXT: ret
entry:
%add = add nsw i32 %a, 5
%cmp = icmp sgt i32 %x, 0
@@ -91,23 +133,40 @@ entry:
}
define void @add_shl_moreOneUse_inSelect_addexceedsign12(ptr %array1, i32 %a, i32 %b, i32 %x) {
-; RV32-LABEL: add_shl_moreOneUse_inSelect_addexceedsign12:
-; RV32: # %bb.0: # %entry
-; RV32-NEXT: addi a4, a1, 2047
-; RV32-NEXT: addi a4, a4, 1
-; RV32-NEXT: mv a5, a4
-; RV32-NEXT: bgtz a3, .LBB4_2
-; RV32-NEXT: # %bb.1: # %entry
-; RV32-NEXT: mv a5, a2
-; RV32-NEXT: .LBB4_2: # %entry
-; RV32-NEXT: lui a2, 2
-; RV32-NEXT: slli a1, a1, 2
-; RV32-NEXT: add a0, a0, a1
-; RV32-NEXT: add a0, a0, a2
-; RV32-NEXT: sw a5, 0(a0)
-; RV32-NEXT: sw a5, 4(a0)
-; RV32-NEXT: sw a4, 120(a0)
-; RV32-NEXT: ret
+; NO-ZBA-LABEL: add_shl_moreOneUse_inSelect_addexceedsign12:
+; NO-ZBA: # %bb.0: # %entry
+; NO-ZBA-NEXT: addi a4, a1, 2047
+; NO-ZBA-NEXT: addi a4, a4, 1
+; NO-ZBA-NEXT: mv a5, a4
+; NO-ZBA-NEXT: bgtz a3, .LBB4_2
+; NO-ZBA-NEXT: # %bb.1: # %entry
+; NO-ZBA-NEXT: mv a5, a2
+; NO-ZBA-NEXT: .LBB4_2: # %entry
+; NO-ZB...
[truncated]
Instead of duplicating the loop twice, add arguments to the lambda. I plan on reusing this in #119527
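To make the refactor concrete, here is a minimal standalone sketch of the parameterized helper. It does not use LLVM types: `Node`, `Opcode`, the `Users` vector, and `exampleGraph` are hypothetical stand-ins for `SDNode` and its use list, assumed only for illustration. The function takes the node whose users are inspected and the one user to skip, mirroring the two arguments added to the lambda.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for SelectionDAG nodes. These are NOT the LLVM types.
enum class Opcode { Add, Shl, SignExtend, Select, Load, Store, Other };

struct Node {
  Opcode Op;
  std::vector<const Node *> Users; // nodes that consume this node's value
};

// Returns true if every user of X, other than Skip and any select,
// is a load or a store.
inline bool isUsedByLdSt(const Node *X, const Node *Skip) {
  for (const Node *Use : X->Users) {
    // Skip the user we are currently combining, and selects.
    if (Use == Skip || Use->Op == Opcode::Select)
      continue;
    if (Use->Op != Opcode::Load && Use->Op != Opcode::Store)
      return false;
  }
  return true;
}

// Small example graph: an add feeding a shift, a store and a select.
inline void exampleGraph() {
  Node Shl{Opcode::Shl, {}};
  Node St{Opcode::Store, {}};
  Node Sel{Opcode::Select, {}};
  Node Add{Opcode::Add, {&Shl, &St, &Sel}};
  assert(isUsedByLdSt(&Add, &Shl));     // remaining users: store, select
  assert(!isUsedByLdSt(&Add, nullptr)); // the shift is no longer skipped
}
```

In this shape a node whose users are all skipped also yields true, so the result is only meaningful when the caller has already established that there are other users. In the patch, the two original call sites check `hasOneUse()` before calling.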
Force-pushed from 7a2b9a9 to ad62250
Can you pre-commit the following test?
; bin/llc -mtriple=riscv64 test.ll -o -
; bin/llc -mtriple=riscv64 -mattr=+zba test.ll -o -
define i32 @test(i32 %0, i32 %1) nounwind {
entry:
%2 = add i32 %1, 1
%3 = add i32 %2, %0
%4 = shl nuw nsw i32 %3, 3
%5 = add nsw i32 %4, -8
ret i32 %5
}
This regression may be fixed by checking if the RHS of the ADD is a constant.
TBH I think both #101294 and this patch are fragile...
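For readers following the test case above, a short worked check of its arithmetic may help. The sketch below transcribes the IR into wrapping 32-bit C++ and shows that the `+1` inside the shift and the `-8` outside it cancel, which is presumably why keeping the shift commutable matters when the outer add has a constant right-hand side. The function names `asWritten`, `folded`, and `checkCancels` are made up for this illustration, and the sketch says nothing about what instructions the backend actually emits.

```cpp
#include <cassert>
#include <cstdint>

// Literal transcription of the IR: %5 = (((%1 + 1) + %0) << 3) + (-8),
// evaluated with wrapping 32-bit arithmetic.
constexpr std::uint32_t asWritten(std::uint32_t a, std::uint32_t b) {
  return ((b + 1u + a) << 3) - 8u;
}

// The same value once the shift is distributed over the +1:
// ((a + b) << 3) + (1 << 3) - 8 == (a + b) << 3.
constexpr std::uint32_t folded(std::uint32_t a, std::uint32_t b) {
  return (a + b) << 3;
}

// Runtime spot-check that both forms agree for a given pair of inputs.
inline void checkCancels(std::uint32_t a, std::uint32_t b) {
  assert(asWritten(a, b) == folded(a, b));
}
```

The identity holds modulo 2^32 for all inputs, so the check also passes when the intermediate sum wraps.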
LGTM
Thanks for catching this, should be fixed now.
Can you rebase on top of #120531?
Stacked on llvm#119526 This fixes a regression from llvm#101294 by checking if we might be clobbering a sh{1,2,3}add pattern. Only do this if the underlying add isn't going to be folded away into an address offset.
Force-pushed from 3e04d32 to 4eaed56
Should be rebased now, thanks for checking the codegen.
LGTM. Thank you!
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/185/builds/10952 Here is the relevant piece of the build log for the reference
…rn (llvm#119527) This fixes a regression from llvm#101294 by checking if we might be clobbering a sh{1,2,3}add pattern. Only do this if the underlying add isn't going to be folded away into an address offset.
This change caused a regression when building clang_rt:
Let me know if you need more information for debugging.
The fix should be in #121816, hopefully it will land soon. Sorry about that!
We're hitting this too (https://crbug.com/388039781). Is there an ETA for landing the fix? If it will take a while, can we revert the breaking change in the meantime?
In the interest of keeping main green I've gone ahead and cherry-picked the fix ahead of the PR in b0e05a5, let me know if that fixes things.
Confirmed that fixes the case I was hitting locally. Thanks!
Stacked on #119526
This fixes a regression from #101294 by checking if we might be clobbering a sh{1,2,3}add pattern.
Only do this if the underlying add isn't going to be folded away into an address offset.
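As a rough illustration of the trade-off described here, the sketch below models a Zba `shNadd` as `(rs1 << N) + rs2` and compares the two shapes of `((x + c1) << c2) + y`. The helper names (`shNadd`, `uncommuted`, `commuted`, `isShNaddShiftAmount`, `checkAddShlOneUse`) are hypothetical and exist only for this example. It is plain integer arithmetic, not the SelectionDAG logic itself.

```cpp
#include <cassert>
#include <cstdint>

// Model of the Zba shNadd instructions: rd = (rs1 << N) + rs2, N in {1, 2, 3}.
constexpr std::uint64_t shNadd(unsigned N, std::uint64_t rs1,
                               std::uint64_t rs2) {
  return (rs1 << N) + rs2;
}

// Shift amounts that have a matching shNadd, mirroring the 1..3 range
// tested in the patch.
constexpr bool isShNaddShiftAmount(std::uint64_t c2) {
  return c2 >= 1 && c2 <= 3;
}

// Un-commuted shape: one add of the constant, then a single shNadd.
constexpr std::uint64_t uncommuted(std::uint64_t x, std::uint64_t c1,
                                   unsigned c2, std::uint64_t y) {
  return shNadd(c2, x + c1, y);
}

// Commuted shape: shift x, add the pre-shifted constant, then add y.
constexpr std::uint64_t commuted(std::uint64_t x, std::uint64_t c1,
                                 unsigned c2, std::uint64_t y) {
  return ((x << c2) + (c1 << c2)) + y;
}

// Spot-check with the constants from add_shl_oneUse (x + 123, shift by 3).
inline void checkAddShlOneUse(std::uint64_t x, std::uint64_t y) {
  assert(uncommuted(x, 123, 3, y) == commuted(x, 123, 3, y));
}
```

Both shapes compute the same value, so the choice is purely about instruction count. In the `add_shl_oneUse` test from the patch, the commuted form without Zba is `slli`, `add`, `addi 984` (984 is 123 << 3), while the Zba form is `addi 123` followed by `sh3add`. A shift by 4 has no matching instruction, which is consistent with the `sh4add` test sharing a single `RV64` check prefix.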