Skip to content

[RISCV] Don't commute with shift if it would break sh{1,2,3}add pattern #119527

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jan 6, 2025

Conversation

lukel97
Copy link
Contributor

@lukel97 lukel97 commented Dec 11, 2024

Stacked on #119526

This fixes a regression from #101294 by checking if we might be clobbering a sh{1,2,3}add pattern.

Only do this is the underlying add isn't going to be folded away into an address offset.

@llvmbot
Copy link
Member

llvmbot commented Dec 11, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

Stacked on #119526

This fixes a regression from #101294 by checking if we might be clobbering a sh{1,2,3}add pattern.

Only do this is the underlying add isn't going to be folded away into an address offset.


Patch is 22.76 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/119527.diff

3 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+18-30)
  • (modified) llvm/test/CodeGen/RISCV/add_sext_shl_constant.ll (+232-79)
  • (modified) llvm/test/CodeGen/RISCV/add_shl_constant.ll (+192-59)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index c6838573637202..5b94ae087f11ae 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -18237,44 +18237,32 @@ bool RISCVTargetLowering::isDesirableToCommuteWithShift(
   // LD/ST will optimize constant Offset extraction, so when AddNode is used by
   // LD/ST, it can still complete the folding optimization operation performed
   // above.
-  auto isUsedByLdSt = [&]() {
-    bool CanOptAlways = false;
-    if (N0->getOpcode() == ISD::ADD && !N0->hasOneUse()) {
-      for (SDNode *Use : N0->uses()) {
-        // This use is the one we're on right now. Skip it
-        if (Use == N || Use->getOpcode() == ISD::SELECT)
-          continue;
-        if (!isa<StoreSDNode>(Use) && !isa<LoadSDNode>(Use)) {
-          CanOptAlways = false;
-          break;
-        }
-        CanOptAlways = true;
-      }
-    }
-
-    if (N0->getOpcode() == ISD::SIGN_EXTEND &&
-        !N0->getOperand(0)->hasOneUse()) {
-      for (SDNode *Use : N0->getOperand(0)->uses()) {
-        // This use is the one we're on right now. Skip it
-        if (Use == N0.getNode() || Use->getOpcode() == ISD::SELECT)
-          continue;
-        if (!isa<StoreSDNode>(Use) && !isa<LoadSDNode>(Use)) {
-          CanOptAlways = false;
-          break;
-        }
-        CanOptAlways = true;
-      }
+  auto isUsedByLdSt = [](const SDNode *X, const SDNode *User) {
+    for (SDNode *Use : X->uses()) {
+      // This use is the one we're on right now. Skip it
+      if (Use == User || Use->getOpcode() == ISD::SELECT)
+        continue;
+      if (!isa<StoreSDNode>(Use) && !isa<LoadSDNode>(Use))
+        return false;
     }
-    return CanOptAlways;
+    return true;
   };
 
   if (Ty.isScalarInteger() &&
       (N0.getOpcode() == ISD::ADD || N0.getOpcode() == ISD::OR)) {
     if (N0.getOpcode() == ISD::ADD && !N0->hasOneUse())
-      return isUsedByLdSt();
+      return isUsedByLdSt(N0.getNode(), N);
 
     auto *C1 = dyn_cast<ConstantSDNode>(N0->getOperand(1));
     auto *C2 = dyn_cast<ConstantSDNode>(N->getOperand(1));
+
+    // Bail if we might break a sh{1,2,3}add pattern.
+    if (Subtarget.hasStdExtZba() && C2->getZExtValue() >= 1 &&
+        C2->getZExtValue() <= 3 && N->hasOneUse() &&
+        N->use_begin()->getOpcode() == ISD::ADD &&
+        !isUsedByLdSt(*N->use_begin(), nullptr))
+      return false;
+
     if (C1 && C2) {
       const APInt &C1Int = C1->getAPIntValue();
       APInt ShiftedC1Int = C1Int << C2->getAPIntValue();
@@ -18314,7 +18302,7 @@ bool RISCVTargetLowering::isDesirableToCommuteWithShift(
   if (N0->getOpcode() == ISD::SIGN_EXTEND &&
       N0->getOperand(0)->getOpcode() == ISD::ADD &&
       !N0->getOperand(0)->hasOneUse())
-    return isUsedByLdSt();
+    return isUsedByLdSt(N0->getOperand(0).getNode(), N0.getNode());
 
   return true;
 }
diff --git a/llvm/test/CodeGen/RISCV/add_sext_shl_constant.ll b/llvm/test/CodeGen/RISCV/add_sext_shl_constant.ll
index 47b6c07cc699e7..2f329fb9d83bfd 100644
--- a/llvm/test/CodeGen/RISCV/add_sext_shl_constant.ll
+++ b/llvm/test/CodeGen/RISCV/add_sext_shl_constant.ll
@@ -1,17 +1,28 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 3
-; RUN: llc -mtriple=riscv64 < %s | FileCheck -check-prefix=RV64 %s
+; RUN: llc -mtriple=riscv64 < %s | FileCheck -check-prefixes=RV64,NO-ZBA %s
+; RUN: llc -mtriple=riscv64 -mattr=+zba < %s | FileCheck -check-prefixes=RV64,ZBA %s
 
 define void @add_sext_shl_moreOneUse_add(ptr %array1, i32 %a, i32 %b) {
-; RV64-LABEL: add_sext_shl_moreOneUse_add:
-; RV64:       # %bb.0: # %entry
-; RV64-NEXT:    addi a3, a1, 5
-; RV64-NEXT:    sext.w a1, a1
-; RV64-NEXT:    slli a1, a1, 2
-; RV64-NEXT:    add a0, a1, a0
-; RV64-NEXT:    sw a2, 20(a0)
-; RV64-NEXT:    sw a2, 24(a0)
-; RV64-NEXT:    sw a3, 140(a0)
-; RV64-NEXT:    ret
+; NO-ZBA-LABEL: add_sext_shl_moreOneUse_add:
+; NO-ZBA:       # %bb.0: # %entry
+; NO-ZBA-NEXT:    addi a3, a1, 5
+; NO-ZBA-NEXT:    sext.w a1, a1
+; NO-ZBA-NEXT:    slli a1, a1, 2
+; NO-ZBA-NEXT:    add a0, a1, a0
+; NO-ZBA-NEXT:    sw a2, 20(a0)
+; NO-ZBA-NEXT:    sw a2, 24(a0)
+; NO-ZBA-NEXT:    sw a3, 140(a0)
+; NO-ZBA-NEXT:    ret
+;
+; ZBA-LABEL: add_sext_shl_moreOneUse_add:
+; ZBA:       # %bb.0: # %entry
+; ZBA-NEXT:    addi a3, a1, 5
+; ZBA-NEXT:    sext.w a1, a1
+; ZBA-NEXT:    sh2add a0, a1, a0
+; ZBA-NEXT:    sw a2, 20(a0)
+; ZBA-NEXT:    sw a2, 24(a0)
+; ZBA-NEXT:    sw a3, 140(a0)
+; ZBA-NEXT:    ret
 entry:
   %add = add nsw i32 %a, 5
   %idxprom = sext i32 %add to i64
@@ -29,19 +40,32 @@ entry:
 }
 
 define void @add_sext_shl_moreOneUse_addexceedsign12(ptr %array1, i32 %a, i32 %b) {
-; RV64-LABEL: add_sext_shl_moreOneUse_addexceedsign12:
-; RV64:       # %bb.0: # %entry
-; RV64-NEXT:    addi a3, a1, 2047
-; RV64-NEXT:    lui a4, 2
-; RV64-NEXT:    sext.w a1, a1
-; RV64-NEXT:    addi a3, a3, 1
-; RV64-NEXT:    slli a1, a1, 2
-; RV64-NEXT:    add a0, a0, a4
-; RV64-NEXT:    add a0, a0, a1
-; RV64-NEXT:    sw a2, 0(a0)
-; RV64-NEXT:    sw a3, 4(a0)
-; RV64-NEXT:    sw a2, 120(a0)
-; RV64-NEXT:    ret
+; NO-ZBA-LABEL: add_sext_shl_moreOneUse_addexceedsign12:
+; NO-ZBA:       # %bb.0: # %entry
+; NO-ZBA-NEXT:    addi a3, a1, 2047
+; NO-ZBA-NEXT:    lui a4, 2
+; NO-ZBA-NEXT:    sext.w a1, a1
+; NO-ZBA-NEXT:    addi a3, a3, 1
+; NO-ZBA-NEXT:    slli a1, a1, 2
+; NO-ZBA-NEXT:    add a0, a0, a4
+; NO-ZBA-NEXT:    add a0, a0, a1
+; NO-ZBA-NEXT:    sw a2, 0(a0)
+; NO-ZBA-NEXT:    sw a3, 4(a0)
+; NO-ZBA-NEXT:    sw a2, 120(a0)
+; NO-ZBA-NEXT:    ret
+;
+; ZBA-LABEL: add_sext_shl_moreOneUse_addexceedsign12:
+; ZBA:       # %bb.0: # %entry
+; ZBA-NEXT:    addi a3, a1, 2047
+; ZBA-NEXT:    lui a4, 2
+; ZBA-NEXT:    sext.w a1, a1
+; ZBA-NEXT:    addi a3, a3, 1
+; ZBA-NEXT:    sh2add a0, a1, a0
+; ZBA-NEXT:    add a0, a0, a4
+; ZBA-NEXT:    sw a2, 0(a0)
+; ZBA-NEXT:    sw a3, 4(a0)
+; ZBA-NEXT:    sw a2, 120(a0)
+; ZBA-NEXT:    ret
 entry:
   %add = add nsw i32 %a, 2048
   %idxprom = sext i32 %add to i64
@@ -57,16 +81,26 @@ entry:
 }
 
 define void @add_sext_shl_moreOneUse_sext(ptr %array1, i32 %a, i32 %b) {
-; RV64-LABEL: add_sext_shl_moreOneUse_sext:
-; RV64:       # %bb.0: # %entry
-; RV64-NEXT:    sext.w a1, a1
-; RV64-NEXT:    addi a3, a1, 5
-; RV64-NEXT:    slli a1, a1, 2
-; RV64-NEXT:    add a0, a1, a0
-; RV64-NEXT:    sw a2, 20(a0)
-; RV64-NEXT:    sw a2, 24(a0)
-; RV64-NEXT:    sd a3, 140(a0)
-; RV64-NEXT:    ret
+; NO-ZBA-LABEL: add_sext_shl_moreOneUse_sext:
+; NO-ZBA:       # %bb.0: # %entry
+; NO-ZBA-NEXT:    sext.w a1, a1
+; NO-ZBA-NEXT:    addi a3, a1, 5
+; NO-ZBA-NEXT:    slli a1, a1, 2
+; NO-ZBA-NEXT:    add a0, a1, a0
+; NO-ZBA-NEXT:    sw a2, 20(a0)
+; NO-ZBA-NEXT:    sw a2, 24(a0)
+; NO-ZBA-NEXT:    sd a3, 140(a0)
+; NO-ZBA-NEXT:    ret
+;
+; ZBA-LABEL: add_sext_shl_moreOneUse_sext:
+; ZBA:       # %bb.0: # %entry
+; ZBA-NEXT:    sext.w a1, a1
+; ZBA-NEXT:    addi a3, a1, 5
+; ZBA-NEXT:    sh2add a0, a1, a0
+; ZBA-NEXT:    sw a2, 20(a0)
+; ZBA-NEXT:    sw a2, 24(a0)
+; ZBA-NEXT:    sd a3, 140(a0)
+; ZBA-NEXT:    ret
 entry:
   %add = add nsw i32 %a, 5
   %idxprom = sext i32 %add to i64
@@ -85,20 +119,34 @@ entry:
 
 ; test of jumpping, find add's operand has one more use can simplified
 define void @add_sext_shl_moreOneUse_add_inSelect(ptr %array1, i32 signext  %a, i32 %b, i32 signext %x) {
-; RV64-LABEL: add_sext_shl_moreOneUse_add_inSelect:
-; RV64:       # %bb.0: # %entry
-; RV64-NEXT:    addi a4, a1, 5
-; RV64-NEXT:    mv a5, a4
-; RV64-NEXT:    bgtz a3, .LBB3_2
-; RV64-NEXT:  # %bb.1: # %entry
-; RV64-NEXT:    mv a5, a2
-; RV64-NEXT:  .LBB3_2: # %entry
-; RV64-NEXT:    slli a1, a1, 2
-; RV64-NEXT:    add a0, a1, a0
-; RV64-NEXT:    sw a5, 20(a0)
-; RV64-NEXT:    sw a5, 24(a0)
-; RV64-NEXT:    sw a4, 140(a0)
-; RV64-NEXT:    ret
+; NO-ZBA-LABEL: add_sext_shl_moreOneUse_add_inSelect:
+; NO-ZBA:       # %bb.0: # %entry
+; NO-ZBA-NEXT:    addi a4, a1, 5
+; NO-ZBA-NEXT:    mv a5, a4
+; NO-ZBA-NEXT:    bgtz a3, .LBB3_2
+; NO-ZBA-NEXT:  # %bb.1: # %entry
+; NO-ZBA-NEXT:    mv a5, a2
+; NO-ZBA-NEXT:  .LBB3_2: # %entry
+; NO-ZBA-NEXT:    slli a1, a1, 2
+; NO-ZBA-NEXT:    add a0, a1, a0
+; NO-ZBA-NEXT:    sw a5, 20(a0)
+; NO-ZBA-NEXT:    sw a5, 24(a0)
+; NO-ZBA-NEXT:    sw a4, 140(a0)
+; NO-ZBA-NEXT:    ret
+;
+; ZBA-LABEL: add_sext_shl_moreOneUse_add_inSelect:
+; ZBA:       # %bb.0: # %entry
+; ZBA-NEXT:    addi a4, a1, 5
+; ZBA-NEXT:    mv a5, a4
+; ZBA-NEXT:    bgtz a3, .LBB3_2
+; ZBA-NEXT:  # %bb.1: # %entry
+; ZBA-NEXT:    mv a5, a2
+; ZBA-NEXT:  .LBB3_2: # %entry
+; ZBA-NEXT:    sh2add a0, a1, a0
+; ZBA-NEXT:    sw a5, 20(a0)
+; ZBA-NEXT:    sw a5, 24(a0)
+; ZBA-NEXT:    sw a4, 140(a0)
+; ZBA-NEXT:    ret
 entry:
   %add = add nsw i32 %a, 5
   %cmp = icmp sgt i32 %x, 0
@@ -118,23 +166,40 @@ entry:
 }
 
 define void @add_sext_shl_moreOneUse_add_inSelect_addexceedsign12(ptr %array1, i32 signext  %a, i32 %b, i32 signext %x) {
-; RV64-LABEL: add_sext_shl_moreOneUse_add_inSelect_addexceedsign12:
-; RV64:       # %bb.0: # %entry
-; RV64-NEXT:    addi a4, a1, 2047
-; RV64-NEXT:    lui a5, 2
-; RV64-NEXT:    slli a6, a1, 2
-; RV64-NEXT:    addi a1, a4, 1
-; RV64-NEXT:    add a0, a0, a6
-; RV64-NEXT:    add a0, a0, a5
-; RV64-NEXT:    mv a4, a1
-; RV64-NEXT:    bgtz a3, .LBB4_2
-; RV64-NEXT:  # %bb.1: # %entry
-; RV64-NEXT:    mv a4, a2
-; RV64-NEXT:  .LBB4_2: # %entry
-; RV64-NEXT:    sw a4, 0(a0)
-; RV64-NEXT:    sw a4, 4(a0)
-; RV64-NEXT:    sw a1, 120(a0)
-; RV64-NEXT:    ret
+; NO-ZBA-LABEL: add_sext_shl_moreOneUse_add_inSelect_addexceedsign12:
+; NO-ZBA:       # %bb.0: # %entry
+; NO-ZBA-NEXT:    addi a4, a1, 2047
+; NO-ZBA-NEXT:    lui a5, 2
+; NO-ZBA-NEXT:    slli a6, a1, 2
+; NO-ZBA-NEXT:    addi a1, a4, 1
+; NO-ZBA-NEXT:    add a0, a0, a6
+; NO-ZBA-NEXT:    add a0, a0, a5
+; NO-ZBA-NEXT:    mv a4, a1
+; NO-ZBA-NEXT:    bgtz a3, .LBB4_2
+; NO-ZBA-NEXT:  # %bb.1: # %entry
+; NO-ZBA-NEXT:    mv a4, a2
+; NO-ZBA-NEXT:  .LBB4_2: # %entry
+; NO-ZBA-NEXT:    sw a4, 0(a0)
+; NO-ZBA-NEXT:    sw a4, 4(a0)
+; NO-ZBA-NEXT:    sw a1, 120(a0)
+; NO-ZBA-NEXT:    ret
+;
+; ZBA-LABEL: add_sext_shl_moreOneUse_add_inSelect_addexceedsign12:
+; ZBA:       # %bb.0: # %entry
+; ZBA-NEXT:    addi a4, a1, 2047
+; ZBA-NEXT:    lui a5, 2
+; ZBA-NEXT:    addi a4, a4, 1
+; ZBA-NEXT:    sh2add a0, a1, a0
+; ZBA-NEXT:    add a0, a0, a5
+; ZBA-NEXT:    mv a1, a4
+; ZBA-NEXT:    bgtz a3, .LBB4_2
+; ZBA-NEXT:  # %bb.1: # %entry
+; ZBA-NEXT:    mv a1, a2
+; ZBA-NEXT:  .LBB4_2: # %entry
+; ZBA-NEXT:    sw a1, 0(a0)
+; ZBA-NEXT:    sw a1, 4(a0)
+; ZBA-NEXT:    sw a4, 120(a0)
+; ZBA-NEXT:    ret
 entry:
   %add = add nsw i32 %a, 2048
   %cmp = icmp sgt i32 %x, 0
@@ -152,20 +217,34 @@ entry:
 }
 
 define void @add_shl_moreOneUse_inSelect(ptr %array1, i64 %a, i64 %b, i64 %x) {
-; RV64-LABEL: add_shl_moreOneUse_inSelect:
-; RV64:       # %bb.0: # %entry
-; RV64-NEXT:    addi a4, a1, 5
-; RV64-NEXT:    mv a5, a4
-; RV64-NEXT:    bgtz a3, .LBB5_2
-; RV64-NEXT:  # %bb.1: # %entry
-; RV64-NEXT:    mv a5, a2
-; RV64-NEXT:  .LBB5_2: # %entry
-; RV64-NEXT:    slli a1, a1, 3
-; RV64-NEXT:    add a0, a1, a0
-; RV64-NEXT:    sd a5, 40(a0)
-; RV64-NEXT:    sd a5, 48(a0)
-; RV64-NEXT:    sd a4, 280(a0)
-; RV64-NEXT:    ret
+; NO-ZBA-LABEL: add_shl_moreOneUse_inSelect:
+; NO-ZBA:       # %bb.0: # %entry
+; NO-ZBA-NEXT:    addi a4, a1, 5
+; NO-ZBA-NEXT:    mv a5, a4
+; NO-ZBA-NEXT:    bgtz a3, .LBB5_2
+; NO-ZBA-NEXT:  # %bb.1: # %entry
+; NO-ZBA-NEXT:    mv a5, a2
+; NO-ZBA-NEXT:  .LBB5_2: # %entry
+; NO-ZBA-NEXT:    slli a1, a1, 3
+; NO-ZBA-NEXT:    add a0, a1, a0
+; NO-ZBA-NEXT:    sd a5, 40(a0)
+; NO-ZBA-NEXT:    sd a5, 48(a0)
+; NO-ZBA-NEXT:    sd a4, 280(a0)
+; NO-ZBA-NEXT:    ret
+;
+; ZBA-LABEL: add_shl_moreOneUse_inSelect:
+; ZBA:       # %bb.0: # %entry
+; ZBA-NEXT:    addi a4, a1, 5
+; ZBA-NEXT:    mv a5, a4
+; ZBA-NEXT:    bgtz a3, .LBB5_2
+; ZBA-NEXT:  # %bb.1: # %entry
+; ZBA-NEXT:    mv a5, a2
+; ZBA-NEXT:  .LBB5_2: # %entry
+; ZBA-NEXT:    sh3add a0, a1, a0
+; ZBA-NEXT:    sd a5, 40(a0)
+; ZBA-NEXT:    sd a5, 48(a0)
+; ZBA-NEXT:    sd a4, 280(a0)
+; ZBA-NEXT:    ret
 entry:
   %add = add nsw i64 %a, 5
   %cmp = icmp sgt i64 %x, 0
@@ -180,3 +259,77 @@ entry:
   store i64 %add, ptr %arrayidx6
   ret void
 }
+
+define i64 @add_shl_moreOneUse_sh1add(i64 %x) {
+; NO-ZBA-LABEL: add_shl_moreOneUse_sh1add:
+; NO-ZBA:       # %bb.0:
+; NO-ZBA-NEXT:    ori a1, a0, 1
+; NO-ZBA-NEXT:    slli a0, a0, 1
+; NO-ZBA-NEXT:    ori a0, a0, 2
+; NO-ZBA-NEXT:    add a0, a0, a1
+; NO-ZBA-NEXT:    ret
+;
+; ZBA-LABEL: add_shl_moreOneUse_sh1add:
+; ZBA:       # %bb.0:
+; ZBA-NEXT:    ori a0, a0, 1
+; ZBA-NEXT:    sh1add a0, a0, a0
+; ZBA-NEXT:    ret
+  %or = or i64 %x, 1
+  %mul = shl i64 %or, 1
+  %add = add i64 %mul, %or
+  ret i64 %add
+}
+
+define i64 @add_shl_moreOneUse_sh2add(i64 %x) {
+; NO-ZBA-LABEL: add_shl_moreOneUse_sh2add:
+; NO-ZBA:       # %bb.0:
+; NO-ZBA-NEXT:    ori a1, a0, 1
+; NO-ZBA-NEXT:    slli a0, a0, 2
+; NO-ZBA-NEXT:    ori a0, a0, 4
+; NO-ZBA-NEXT:    add a0, a0, a1
+; NO-ZBA-NEXT:    ret
+;
+; ZBA-LABEL: add_shl_moreOneUse_sh2add:
+; ZBA:       # %bb.0:
+; ZBA-NEXT:    ori a0, a0, 1
+; ZBA-NEXT:    sh2add a0, a0, a0
+; ZBA-NEXT:    ret
+  %or = or i64 %x, 1
+  %mul = shl i64 %or, 2
+  %add = add i64 %mul, %or
+  ret i64 %add
+}
+
+define i64 @add_shl_moreOneUse_sh3add(i64 %x) {
+; NO-ZBA-LABEL: add_shl_moreOneUse_sh3add:
+; NO-ZBA:       # %bb.0:
+; NO-ZBA-NEXT:    ori a1, a0, 1
+; NO-ZBA-NEXT:    slli a0, a0, 3
+; NO-ZBA-NEXT:    ori a0, a0, 8
+; NO-ZBA-NEXT:    add a0, a0, a1
+; NO-ZBA-NEXT:    ret
+;
+; ZBA-LABEL: add_shl_moreOneUse_sh3add:
+; ZBA:       # %bb.0:
+; ZBA-NEXT:    ori a0, a0, 1
+; ZBA-NEXT:    sh3add a0, a0, a0
+; ZBA-NEXT:    ret
+  %or = or i64 %x, 1
+  %mul = shl i64 %or, 3
+  %add = add i64 %mul, %or
+  ret i64 %add
+}
+
+define i64 @add_shl_moreOneUse_sh4add(i64 %x) {
+; RV64-LABEL: add_shl_moreOneUse_sh4add:
+; RV64:       # %bb.0:
+; RV64-NEXT:    ori a1, a0, 1
+; RV64-NEXT:    slli a0, a0, 4
+; RV64-NEXT:    ori a0, a0, 16
+; RV64-NEXT:    add a0, a0, a1
+; RV64-NEXT:    ret
+  %or = or i64 %x, 1
+  %mul = shl i64 %or, 4
+  %add = add i64 %mul, %or
+  ret i64 %add
+}
diff --git a/llvm/test/CodeGen/RISCV/add_shl_constant.ll b/llvm/test/CodeGen/RISCV/add_shl_constant.ll
index 71b61868b8c844..a4da9e26836488 100644
--- a/llvm/test/CodeGen/RISCV/add_shl_constant.ll
+++ b/llvm/test/CodeGen/RISCV/add_shl_constant.ll
@@ -1,13 +1,20 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
-; RUN: llc -mtriple=riscv32  < %s | FileCheck -check-prefix=RV32 %s
+; RUN: llc -mtriple=riscv32  < %s | FileCheck -check-prefixes=RV32,NO-ZBA %s
+; RUN: llc -mtriple=riscv32 -mattr=+zba  < %s | FileCheck -check-prefixes=RV32,ZBA %s
 
 define i32 @add_shl_oneUse(i32 %x, i32 %y) nounwind {
-; RV32-LABEL: add_shl_oneUse:
-; RV32:       # %bb.0:
-; RV32-NEXT:    slli a0, a0, 3
-; RV32-NEXT:    add a0, a0, a1
-; RV32-NEXT:    addi a0, a0, 984
-; RV32-NEXT:    ret
+; NO-ZBA-LABEL: add_shl_oneUse:
+; NO-ZBA:       # %bb.0:
+; NO-ZBA-NEXT:    slli a0, a0, 3
+; NO-ZBA-NEXT:    add a0, a0, a1
+; NO-ZBA-NEXT:    addi a0, a0, 984
+; NO-ZBA-NEXT:    ret
+;
+; ZBA-LABEL: add_shl_oneUse:
+; ZBA:       # %bb.0:
+; ZBA-NEXT:    addi a0, a0, 123
+; ZBA-NEXT:    sh3add a0, a0, a1
+; ZBA-NEXT:    ret
   %add.0 = add i32 %x, 123
   %shl = shl i32 %add.0, 3
   %add.1 = add i32 %shl, %y
@@ -15,15 +22,24 @@ define i32 @add_shl_oneUse(i32 %x, i32 %y) nounwind {
 }
 
 define void @add_shl_moreOneUse_inStore(ptr %array1, i32 %a, i32 %b)  {
-; RV32-LABEL: add_shl_moreOneUse_inStore:
-; RV32:       # %bb.0: # %entry
-; RV32-NEXT:    addi a3, a1, 5
-; RV32-NEXT:    slli a1, a1, 2
-; RV32-NEXT:    add a0, a0, a1
-; RV32-NEXT:    sw a2, 20(a0)
-; RV32-NEXT:    sw a2, 24(a0)
-; RV32-NEXT:    sw a3, 140(a0)
-; RV32-NEXT:    ret
+; NO-ZBA-LABEL: add_shl_moreOneUse_inStore:
+; NO-ZBA:       # %bb.0: # %entry
+; NO-ZBA-NEXT:    addi a3, a1, 5
+; NO-ZBA-NEXT:    slli a1, a1, 2
+; NO-ZBA-NEXT:    add a0, a0, a1
+; NO-ZBA-NEXT:    sw a2, 20(a0)
+; NO-ZBA-NEXT:    sw a2, 24(a0)
+; NO-ZBA-NEXT:    sw a3, 140(a0)
+; NO-ZBA-NEXT:    ret
+;
+; ZBA-LABEL: add_shl_moreOneUse_inStore:
+; ZBA:       # %bb.0: # %entry
+; ZBA-NEXT:    addi a3, a1, 5
+; ZBA-NEXT:    sh2add a0, a1, a0
+; ZBA-NEXT:    sw a2, 20(a0)
+; ZBA-NEXT:    sw a2, 24(a0)
+; ZBA-NEXT:    sw a3, 140(a0)
+; ZBA-NEXT:    ret
 entry:
   %add = add nsw i32 %a, 5
   %arrayidx = getelementptr inbounds i32, ptr %array1, i32 %add
@@ -37,18 +53,30 @@ entry:
 }
 
 define void @add_shl_moreOneUse_inStore_addexceedsign12(ptr %array1, i32 %a, i32 %b)  {
-; RV32-LABEL: add_shl_moreOneUse_inStore_addexceedsign12:
-; RV32:       # %bb.0: # %entry
-; RV32-NEXT:    addi a3, a1, 2047
-; RV32-NEXT:    lui a4, 2
-; RV32-NEXT:    slli a1, a1, 2
-; RV32-NEXT:    addi a3, a3, 1
-; RV32-NEXT:    add a0, a0, a1
-; RV32-NEXT:    add a0, a0, a4
-; RV32-NEXT:    sw a2, 0(a0)
-; RV32-NEXT:    sw a3, 4(a0)
-; RV32-NEXT:    sw a2, 120(a0)
-; RV32-NEXT:    ret
+; NO-ZBA-LABEL: add_shl_moreOneUse_inStore_addexceedsign12:
+; NO-ZBA:       # %bb.0: # %entry
+; NO-ZBA-NEXT:    addi a3, a1, 2047
+; NO-ZBA-NEXT:    lui a4, 2
+; NO-ZBA-NEXT:    slli a1, a1, 2
+; NO-ZBA-NEXT:    addi a3, a3, 1
+; NO-ZBA-NEXT:    add a0, a0, a1
+; NO-ZBA-NEXT:    add a0, a0, a4
+; NO-ZBA-NEXT:    sw a2, 0(a0)
+; NO-ZBA-NEXT:    sw a3, 4(a0)
+; NO-ZBA-NEXT:    sw a2, 120(a0)
+; NO-ZBA-NEXT:    ret
+;
+; ZBA-LABEL: add_shl_moreOneUse_inStore_addexceedsign12:
+; ZBA:       # %bb.0: # %entry
+; ZBA-NEXT:    addi a3, a1, 2047
+; ZBA-NEXT:    lui a4, 2
+; ZBA-NEXT:    sh2add a0, a1, a0
+; ZBA-NEXT:    addi a3, a3, 1
+; ZBA-NEXT:    add a0, a0, a4
+; ZBA-NEXT:    sw a2, 0(a0)
+; ZBA-NEXT:    sw a3, 4(a0)
+; ZBA-NEXT:    sw a2, 120(a0)
+; ZBA-NEXT:    ret
 entry:
   %add = add nsw i32 %a, 2048
   %arrayidx = getelementptr inbounds i32, ptr %array1, i32 %add
@@ -62,20 +90,34 @@ entry:
 }
 
 define void @add_shl_moreOneUse_inSelect(ptr %array1, i32 %a, i32 %b, i32 %x) {
-; RV32-LABEL: add_shl_moreOneUse_inSelect:
-; RV32:       # %bb.0: # %entry
-; RV32-NEXT:    addi a4, a1, 5
-; RV32-NEXT:    mv a5, a4
-; RV32-NEXT:    bgtz a3, .LBB3_2
-; RV32-NEXT:  # %bb.1: # %entry
-; RV32-NEXT:    mv a5, a2
-; RV32-NEXT:  .LBB3_2: # %entry
-; RV32-NEXT:    slli a1, a1, 2
-; RV32-NEXT:    add a0, a0, a1
-; RV32-NEXT:    sw a5, 20(a0)
-; RV32-NEXT:    sw a5, 24(a0)
-; RV32-NEXT:    sw a4, 140(a0)
-; RV32-NEXT:    ret
+; NO-ZBA-LABEL: add_shl_moreOneUse_inSelect:
+; NO-ZBA:       # %bb.0: # %entry
+; NO-ZBA-NEXT:    addi a4, a1, 5
+; NO-ZBA-NEXT:    mv a5, a4
+; NO-ZBA-NEXT:    bgtz a3, .LBB3_2
+; NO-ZBA-NEXT:  # %bb.1: # %entry
+; NO-ZBA-NEXT:    mv a5, a2
+; NO-ZBA-NEXT:  .LBB3_2: # %entry
+; NO-ZBA-NEXT:    slli a1, a1, 2
+; NO-ZBA-NEXT:    add a0, a0, a1
+; NO-ZBA-NEXT:    sw a5, 20(a0)
+; NO-ZBA-NEXT:    sw a5, 24(a0)
+; NO-ZBA-NEXT:    sw a4, 140(a0)
+; NO-ZBA-NEXT:    ret
+;
+; ZBA-LABEL: add_shl_moreOneUse_inSelect:
+; ZBA:       # %bb.0: # %entry
+; ZBA-NEXT:    addi a4, a1, 5
+; ZBA-NEXT:    mv a5, a4
+; ZBA-NEXT:    bgtz a3, .LBB3_2
+; ZBA-NEXT:  # %bb.1: # %entry
+; ZBA-NEXT:    mv a5, a2
+; ZBA-NEXT:  .LBB3_2: # %entry
+; ZBA-NEXT:    sh2add a0, a1, a0
+; ZBA-NEXT:    sw a5, 20(a0)
+; ZBA-NEXT:    sw a5, 24(a0)
+; ZBA-NEXT:    sw a4, 140(a0)
+; ZBA-NEXT:    ret
 entry:
   %add = add nsw i32 %a, 5
   %cmp = icmp sgt i32 %x, 0
@@ -91,23 +133,40 @@ entry:
 }
 
 define void @add_shl_moreOneUse_inSelect_addexceedsign12(ptr %array1, i32 %a, i32 %b, i32 %x) {
-; RV32-LABEL: add_shl_moreOneUse_inSelect_addexceedsign12:
-; RV32:       # %bb.0: # %entry
-; RV32-NEXT:    addi a4, a1, 2047
-; RV32-NEXT:    addi a4, a4, 1
-; RV32-NEXT:    mv a5, a4
-; RV32-NEXT:    bgtz a3, .LBB4_2
-; RV32-NEXT:  # %bb.1: # %entry
-; RV32-NEXT:    mv a5, a2
-; RV32-NEXT:  .LBB4_2: # %entry
-; RV32-NEXT:    lui a2, 2
-; RV32-NEXT:    slli a1, a1, 2
-; RV32-NEXT:    add a0, a0, a1
-; RV32-NEXT:    add a0, a0, a2
-; RV32-NEXT:    sw a5, 0(a0)
-; RV32-NEXT:    sw a5, 4(a0)
-; RV32-NEXT:    sw a4, 120(a0)
-; RV32-NEXT:    ret
+; NO-ZBA-LABEL: add_shl_moreOneUse_inSelect_addexceedsign12:
+; NO-ZBA:       # %bb.0: # %entry
+; NO-ZBA-NEXT:    addi a4, a1, 2047
+; NO-ZBA-NEXT:    addi a4, a4, 1
+; NO-ZBA-NEXT:    mv a5, a4
+; NO-ZBA-NEXT:    bgtz a3, .LBB4_2
+; NO-ZBA-NEXT:  # %bb.1: # %entry
+; NO-ZBA-NEXT:    mv a5, a2
+; NO-ZBA-NEXT:  .LBB4_2: # %entry
+; NO-ZB...
[truncated]

dtcxzyw added a commit to dtcxzyw/llvm-codegen-benchmark that referenced this pull request Dec 12, 2024
lukel97 added a commit that referenced this pull request Dec 12, 2024
Instead of duplicating the loop twice, add arguments to the lambda.
I plan on reusing this in #119527
@lukel97 lukel97 force-pushed the isDesirableToCommuteWithShift-zba branch from 7a2b9a9 to ad62250 Compare December 12, 2024 05:26
@lukel97 lukel97 changed the title [RISCV] Don't commute with shift it would break sh{1,2,3}add pattern [RISCV] Don't commute with shift if it would break sh{1,2,3}add pattern Dec 12, 2024
dtcxzyw added a commit to dtcxzyw/llvm-codegen-benchmark that referenced this pull request Dec 12, 2024
Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you pre-commit the following test?

; bin/llc -mtriple=riscv64 test.ll -o -
; bin/llc -mtriple=riscv64 -mattr=+zba test.ll -o -
define i32 @test(i32 %0, i32 %1) nounwind {
entry:
  %2 = add i32 %1, 1
  %3 = add i32 %2, %0
  %4 = shl nuw nsw i32 %3, 3
  %5 = add nsw i32 %4, -8
  ret i32 %5
}

This regression may be fixed by checking if the RHS of the ADD is a constant.

TBH I think both #101294 and this patch are fragile...

Copy link
Collaborator

@preames preames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lukel97
Copy link
Contributor Author

lukel97 commented Jan 6, 2025

Can you pre-commit the following test?

; bin/llc -mtriple=riscv64 test.ll -o -
; bin/llc -mtriple=riscv64 -mattr=+zba test.ll -o -
define i32 @test(i32 %0, i32 %1) nounwind {
entry:
  %2 = add i32 %1, 1
  %3 = add i32 %2, %0
  %4 = shl nuw nsw i32 %3, 3
  %5 = add nsw i32 %4, -8
  ret i32 %5
}

This regression may be fixed by checking if the RHS of the ADD is a constant.

Thanks for catching this, should be fixed now

dtcxzyw added a commit to dtcxzyw/llvm-codegen-benchmark that referenced this pull request Jan 6, 2025
@dtcxzyw
Copy link
Member

dtcxzyw commented Jan 6, 2025

Can you rebase on the top of #120531?

FAILED: lib/Target/RISCV/CMakeFiles/LLVMRISCVCodeGen.dir/RISCVISelLowering.cpp.o 
ccache /usr/bin/c++ -DGTEST_HAS_RTTI=0 -DLLVM_EXPORTS -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/data/zyw/codegen/actions-runner/_work/llvm-codegen-benchmark/llvm-codegen-benchmark/llvm/llvm-build/lib/Target/RISCV -I/data/zyw/codegen/actions-runner/_work/llvm-codegen-benchmark/llvm-codegen-benchmark/llvm/llvm-project/llvm/lib/Target/RISCV -I/data/zyw/codegen/actions-runner/_work/llvm-codegen-benchmark/llvm-codegen-benchmark/llvm/llvm-build/include -I/data/zyw/codegen/actions-runner/_work/llvm-codegen-benchmark/llvm-codegen-benchmark/llvm/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -fno-lifetime-dse -w -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fPIC  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/Target/RISCV/CMakeFiles/LLVMRISCVCodeGen.dir/RISCVISelLowering.cpp.o -MF lib/Target/RISCV/CMakeFiles/LLVMRISCVCodeGen.dir/RISCVISelLowering.cpp.o.d -o lib/Target/RISCV/CMakeFiles/LLVMRISCVCodeGen.dir/RISCVISelLowering.cpp.o -c /data/zyw/codegen/actions-runner/_work/llvm-codegen-benchmark/llvm-codegen-benchmark/llvm/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
/data/zyw/codegen/actions-runner/_work/llvm-codegen-benchmark/llvm-codegen-benchmark/llvm/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp: In member function ‘virtual bool llvm::RISCVTargetLowering::isDesirableToCommuteWithShift(const llvm::SDNode*, llvm::CombineLevel) const’:
/data/zyw/codegen/actions-runner/_work/llvm-codegen-benchmark/llvm-codegen-benchmark/llvm/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:18393: error: ‘class llvm::SDUse’ has no member named ‘getOpcode’; did you mean ‘getNode’?
18393 |         N->use_begin()->getOpcode() == ISD::ADD &&
      | 
/data/zyw/codegen/actions-runner/_work/llvm-codegen-benchmark/llvm-codegen-benchmark/llvm/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:18394: error: no match for call to ‘(llvm::RISCVTargetLowering::isDesirableToCommuteWithShift(const llvm::SDNode*, llvm::CombineLevel) const::<lambda(const llvm::SDNode*, const llvm::SDNode*)>) (llvm::SDUse&, std::nullptr_t)’
18394 |         !isUsedByLdSt(*N->use_begin(), nullptr) &&
      | 
/data/zyw/codegen/actions-runner/_work/llvm-codegen-benchmark/llvm-codegen-benchmark/llvm/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:18394: note: candidate: ‘bool (*)(const llvm::SDNode*, const llvm::SDNode*)’ (conversion)
/data/zyw/codegen/actions-runner/_work/llvm-codegen-benchmark/llvm-codegen-benchmark/llvm/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:18394: note:   candidate expects 3 arguments, 3 provided
/data/zyw/codegen/actions-runner/_work/llvm-codegen-benchmark/llvm-codegen-benchmark/llvm/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:18371: note: candidate: ‘llvm::RISCVTargetLowering::isDesirableToCommuteWithShift(const llvm::SDNode*, llvm::CombineLevel) const::<lambda(const llvm::SDNode*, const llvm::SDNode*)>’
18371 |   auto isUsedByLdSt = [](const SDNode *X, const SDNode *User) {
      | 
/data/zyw/codegen/actions-runner/_work/llvm-codegen-benchmark/llvm-codegen-benchmark/llvm/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:18371: note:   no known conversion for argument 1 from ‘llvm::SDUse’ to ‘const llvm::SDNode*’
/data/zyw/codegen/actions-runner/_work/llvm-codegen-benchmark/llvm-codegen-benchmark/llvm/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:18395: error: ‘class llvm::SDUse’ has no member named ‘getOperand’; did you mean ‘getOperandNo’?
18395 |         !isa<ConstantSDNode>(N->use_begin()->getOperand(1)))
      | 
ninja: build stopped: subcommand failed.

Stacked on llvm#119526

This fixes a regression from llvm#101294 by checking if we might be clobbering a sh{1,2,3}add pattern.

Only do this is the underlying add isn't going to be folded away into an address offset.
@lukel97 lukel97 force-pushed the isDesirableToCommuteWithShift-zba branch from 3e04d32 to 4eaed56 Compare January 6, 2025 10:32
@lukel97
Copy link
Contributor Author

lukel97 commented Jan 6, 2025

Can you rebase on the top of #120531?

Should be rebased now, thanks for checking the codegen.

Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you!

@lukel97 lukel97 merged commit b359c84 into llvm:main Jan 6, 2025
6 of 8 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 6, 2025

LLVM Buildbot has detected a new failure on builder ml-opt-rel-x86-64 running on ml-opt-rel-x86-64-b2 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/185/builds/10952

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: ExecutionEngine/JITLink/x86-64/ELF_R_X86_64_16.s' FAILED ********************
Exit Code: 134

Command Output (stderr):
--
RUN: at line 1: /b/ml-opt-rel-x86-64-b1/build/bin/llvm-mc -triple=x86_64-unknown-linux -position-independent      -filetype=obj -o /b/ml-opt-rel-x86-64-b1/build/test/ExecutionEngine/JITLink/x86-64/Output/ELF_R_X86_64_16.s.tmp.o /b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/ELF_R_X86_64_16.s
+ /b/ml-opt-rel-x86-64-b1/build/bin/llvm-mc -triple=x86_64-unknown-linux -position-independent -filetype=obj -o /b/ml-opt-rel-x86-64-b1/build/test/ExecutionEngine/JITLink/x86-64/Output/ELF_R_X86_64_16.s.tmp.o /b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/ELF_R_X86_64_16.s
RUN: at line 3: /b/ml-opt-rel-x86-64-b1/build/bin/llvm-jitlink -noexec -abs X=0x1234 -check=/b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/ELF_R_X86_64_16.s /b/ml-opt-rel-x86-64-b1/build/test/ExecutionEngine/JITLink/x86-64/Output/ELF_R_X86_64_16.s.tmp.o
+ /b/ml-opt-rel-x86-64-b1/build/bin/llvm-jitlink -noexec -abs X=0x1234 -check=/b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/ELF_R_X86_64_16.s /b/ml-opt-rel-x86-64-b1/build/test/ExecutionEngine/JITLink/x86-64/Output/ELF_R_X86_64_16.s.tmp.o
llvm-jitlink: /b/ml-opt-rel-x86-64-b1/llvm-project/llvm/include/llvm/ExecutionEngine/Orc/SymbolStringPool.h:285: llvm::orc::SymbolStringPool::~SymbolStringPool(): Assertion `Pool.empty() && "Dangling references at pool destruction time"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /b/ml-opt-rel-x86-64-b1/build/bin/llvm-jitlink -noexec -abs X=0x1234 -check=/b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/ELF_R_X86_64_16.s /b/ml-opt-rel-x86-64-b1/build/test/ExecutionEngine/JITLink/x86-64/Output/ELF_R_X86_64_16.s.tmp.o
 #0 0x000056179828e4d8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/b/ml-opt-rel-x86-64-b1/build/bin/llvm-jitlink+0x11434d8)
 #1 0x000056179828b8bc SignalHandler(int) Signals.cpp:0:0
 #2 0x00007fc480b00140 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x13140)
 #3 0x00007fc4805f7d51 raise (/lib/x86_64-linux-gnu/libc.so.6+0x38d51)
 #4 0x00007fc4805e1537 abort (/lib/x86_64-linux-gnu/libc.so.6+0x22537)
 #5 0x00007fc4805e140f (/lib/x86_64-linux-gnu/libc.so.6+0x2240f)
 #6 0x00007fc4805f06d2 (/lib/x86_64-linux-gnu/libc.so.6+0x316d2)
 #7 0x000056179788ddff (/b/ml-opt-rel-x86-64-b1/build/bin/llvm-jitlink+0x742dff)
 #8 0x000056179813a63a llvm::orc::ExecutorProcessControl::~ExecutorProcessControl() (/b/ml-opt-rel-x86-64-b1/build/bin/llvm-jitlink+0xfef63a)
 #9 0x000056179813a863 llvm::orc::SelfExecutorProcessControl::~SelfExecutorProcessControl() crtstuff.c:0:0
#10 0x0000561798058f66 llvm::orc::ExecutionSession::~ExecutionSession() (/b/ml-opt-rel-x86-64-b1/build/bin/llvm-jitlink+0xf0df66)
#11 0x000056179783f34e main (/b/ml-opt-rel-x86-64-b1/build/bin/llvm-jitlink+0x6f434e)
#12 0x00007fc4805e2d7a __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x23d7a)
#13 0x000056179787d47a _start (/b/ml-opt-rel-x86-64-b1/build/bin/llvm-jitlink+0x73247a)
/b/ml-opt-rel-x86-64-b1/build/test/ExecutionEngine/JITLink/x86-64/Output/ELF_R_X86_64_16.s.script: line 3: 4007321 Aborted                 /b/ml-opt-rel-x86-64-b1/build/bin/llvm-jitlink -noexec -abs X=0x1234 -check=/b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/ELF_R_X86_64_16.s /b/ml-opt-rel-x86-64-b1/build/test/ExecutionEngine/JITLink/x86-64/Output/ELF_R_X86_64_16.s.tmp.o

--

********************


paulhuggett pushed a commit to paulhuggett/llvm-project that referenced this pull request Jan 7, 2025
…rn (llvm#119527)

This fixes a regression from llvm#101294 by checking if we might be
clobbering a sh{1,2,3}add pattern.

Only do this is the underlying add isn't going to be folded away into an
address offset.
@kongy
Copy link
Collaborator

kongy commented Jan 7, 2025

This change caused a regression when building clang_rt:

Stack dump:
0.	Program arguments: /tmpfs/src/git/out/stage2-install/bin/clang --target=riscv64-linux-android35 --sysroot=/tmpfs/src/git/out/sysroots/ndk/riscv64 -DVISIBILITY_HIDDEN -ffile-prefix-map=/tmpfs/src/git/= --target=riscv64-linux-android35 -ffunction-sections -fdata-sections -isystem /tmpfs/src/git/out/sysroots/ndk/riscv64/usr/include/riscv64-linux-android -D_LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT=1 --sysroot=/tmpfs/src/git/out/sysroots/ndk/riscv64 -O3 -DNDEBUG -fPIC -fno-lto -std=c11 -fPIC -fno-builtin -fvisibility=hidden -fomit-frame-pointer -DCOMPILER_RT_HAS_FLOAT16 -MD -MT CMakeFiles/clang_rt.builtins-riscv64.dir/truncsfhf2.c.o -MF CMakeFiles/clang_rt.builtins-riscv64.dir/truncsfhf2.c.o.d -o CMakeFiles/clang_rt.builtins-riscv64.dir/truncsfhf2.c.o -c /tmpfs/src/git/out/llvm-project/compiler-rt/lib/builtins/truncsfhf2.c
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module '/tmpfs/src/git/out/llvm-project/compiler-rt/lib/builtins/truncsfhf2.c'.
4.	Running pass 'RISC-V DAG->DAG Pattern Instruction Selection' on function '@__truncsfhf2'
 #0 0x000059e73a5b34e8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/tmpfs/src/git/out/stage2-install/bin/clang+0x393d4e8)
 #1 0x000059e73a5b122e llvm::sys::RunSignalHandlers() (/tmpfs/src/git/out/stage2-install/bin/clang+0x393b22e)
 #2 0x000059e73a5b298f llvm::sys::CleanupOnSignal(unsigned long) (/tmpfs/src/git/out/stage2-install/bin/clang+0x393c98f)
 #3 0x000059e73a5408f9 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #4 0x000071314ff40520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #5 0x000059e7396d0828 llvm::RISCVTargetLowering::isDesirableToCommuteWithShift(llvm::SDNode const*, llvm::CombineLevel) const RISCVISelLowering.cpp:0:0
 #6 0x000059e73b2e2f3c (anonymous namespace)::DAGCombiner::visitSHL(llvm::SDNode*) DAGCombiner.cpp:0:0
 #7 0x000059e73b2ce5f0 (anonymous namespace)::DAGCombiner::combine(llvm::SDNode*) DAGCombiner.cpp:0:0
 #8 0x000059e73b2cd41d llvm::SelectionDAG::Combine(llvm::CombineLevel, llvm::AAResults*, llvm::CodeGenOptLevel) (/tmpfs/src/git/out/stage2-install/bin/clang+0x465741d)
 #9 0x000059e73b4402ff llvm::SelectionDAGISel::CodeGenAndEmitDAG() (/tmpfs/src/git/out/stage2-install/bin/clang+0x47ca2ff)
#10 0x000059e73b43ff31 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/tmpfs/src/git/out/stage2-install/bin/clang+0x47c9f31)
#11 0x000059e73b43dca2 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/tmpfs/src/git/out/stage2-install/bin/clang+0x47c7ca2)
#12 0x000059e73b43c5ca llvm::SelectionDAGISelLegacy::runOnMachineFunction(llvm::MachineFunction&) (/tmpfs/src/git/out/stage2-install/bin/clang+0x47c65ca)
#13 0x000059e739e21f9d llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/tmpfs/src/git/out/stage2-install/bin/clang+0x31abf9d)

Let me know if you need more information for debugging

@lukel97
Copy link
Contributor Author

lukel97 commented Jan 7, 2025

This change caused a regression when building clang_rt:

The fix should be in #121816, hopefully it will land soon. Sorry about that!

@zmodem
Copy link
Collaborator

zmodem commented Jan 7, 2025

We're hitting this too (https://crbug.com/388039781). Is there an ETA for landing the fix? If it will take a while, can we revert the breaking change in the meantime?

@lukel97
Copy link
Contributor Author

lukel97 commented Jan 7, 2025

We're hitting this too (https://crbug.com/388039781). Is there an ETA for landing the fix? If it will take a while, can we revert the breaking change in the meantime?

In the interest of keeping main green I've gone ahead and cherry-picked the fix ahead of the PR in b0e05a5, let me know if that fixes things.

@zmodem
Copy link
Collaborator

zmodem commented Jan 7, 2025

We're hitting this too (https://crbug.com/388039781). Is there an ETA for landing the fix? If it will take a while, can we revert the breaking change in the meantime?

In the interest of keeping main green I've gone ahead and cherry-picked the fix ahead of the PR in b0e05a5, let me know if that fixes things.

Confirmed that fixes the case I was hitting locally. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants