
Conversation


@preames preames commented Aug 27, 2025

This avoids the need to materialize the difference explicitly, and thus reduces register pressure when the condition value is otherwise unused.

@reviewers

  1. This is deliberately restricted to a simple subset. In followups, I'm going to handle the swapped case, but that's a bit more complex and might best be done by refactoring some of the existing combines.

  2. We should arguably be doing this for all configurations, not just zicond.
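
The transform described above can be sketched in plain C++ (the helper name is mine, not from the patch): a select between two constants whose difference is a power of two becomes an add of a shifted condition bit, so the delta never has to live in a register.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative sketch (not the LLVM code itself) of the rewrite:
// select(cond, TrueVal, FalseVal) with a power-of-two difference
// becomes FalseVal + (cond << log2(TrueVal - FalseVal)).
int64_t selectViaShiftAdd(bool Cond, int64_t TrueVal, int64_t FalseVal) {
  uint64_t Diff = static_cast<uint64_t>(TrueVal - FalseVal);
  assert(Diff != 0 && (Diff & (Diff - 1)) == 0 &&
         "combine only fires on power-of-two deltas");
  unsigned Log2 = 0;
  while ((Diff >> Log2) != 1)
    ++Log2;
  // Cond is 0 or 1, so the shift yields either 0 or the full delta.
  return FalseVal + (static_cast<int64_t>(Cond) << Log2);
}
```

On RISC-V this lowers to an SLLI/ADDI pair, matching the test updates below, instead of an LI (or LUI) plus a conditional-zero instruction.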


llvmbot commented Aug 27, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Philip Reames (preames)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/155644.diff

3 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+8)
  • (modified) llvm/test/CodeGen/RISCV/select-const.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/select.ll (+45-37)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index d1e413b378542..0837f6ba85d7c 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -9287,6 +9287,14 @@ SDValue RISCVTargetLowering::lowerSELECT(SDValue Op, SelectionDAG &DAG) const {
         }
       }
 
+      // Use SHL/ADDI to avoid having to materialize a constant in register
+      // TODO: Handle the inverse case when the condition can be cheaply flipped
+      if ((TrueVal - FalseVal).isPowerOf2() && FalseVal.isSignedIntN(12)) {
+        SDValue Log2 = DAG.getConstant((TrueVal - FalseVal).logBase2(), DL, VT);
+        SDValue BitDiff = DAG.getNode(ISD::SHL, DL, VT, CondV, Log2);
+        return DAG.getNode(ISD::ADD, DL, VT, FalseV, BitDiff);
+      }
+
       auto getCost = [&](const APInt &Delta, const APInt &Addend) {
         const int DeltaCost = RISCVMatInt::getIntMatCost(
             Delta, Subtarget.getXLen(), Subtarget, /*CompressionCost=*/true);
diff --git a/llvm/test/CodeGen/RISCV/select-const.ll b/llvm/test/CodeGen/RISCV/select-const.ll
index b734af5002403..e838710878d68 100644
--- a/llvm/test/CodeGen/RISCV/select-const.ll
+++ b/llvm/test/CodeGen/RISCV/select-const.ll
@@ -701,8 +701,8 @@ define i32 @diff_shl_addi2(i32 signext %x) {
 ; RV32ZICOND-LABEL: diff_shl_addi2:
 ; RV32ZICOND:       # %bb.0:
 ; RV32ZICOND-NEXT:    srli a0, a0, 31
-; RV32ZICOND-NEXT:    lui a1, 4
-; RV32ZICOND-NEXT:    czero.nez a0, a1, a0
+; RV32ZICOND-NEXT:    xori a0, a0, 1
+; RV32ZICOND-NEXT:    slli a0, a0, 14
 ; RV32ZICOND-NEXT:    addi a0, a0, 25
 ; RV32ZICOND-NEXT:    ret
 ;
@@ -731,8 +731,8 @@ define i32 @diff_shl_addi2(i32 signext %x) {
 ; RV64ZICOND-LABEL: diff_shl_addi2:
 ; RV64ZICOND:       # %bb.0:
 ; RV64ZICOND-NEXT:    srli a0, a0, 63
-; RV64ZICOND-NEXT:    lui a1, 4
-; RV64ZICOND-NEXT:    czero.nez a0, a1, a0
+; RV64ZICOND-NEXT:    xori a0, a0, 1
+; RV64ZICOND-NEXT:    slli a0, a0, 14
 ; RV64ZICOND-NEXT:    addiw a0, a0, 25
 ; RV64ZICOND-NEXT:    ret
   %cmp = icmp sgt i32 %x, -1
diff --git a/llvm/test/CodeGen/RISCV/select.ll b/llvm/test/CodeGen/RISCV/select.ll
index 2e1784d369680..c897dd9368248 100644
--- a/llvm/test/CodeGen/RISCV/select.ll
+++ b/llvm/test/CodeGen/RISCV/select.ll
@@ -1790,15 +1790,13 @@ define i32 @select_cst5_invert(i1 zeroext %cond) {
 ;
 ; RV64IMXVTCONDOPS-LABEL: select_cst5_invert:
 ; RV64IMXVTCONDOPS:       # %bb.0:
-; RV64IMXVTCONDOPS-NEXT:    li a1, 2
-; RV64IMXVTCONDOPS-NEXT:    vt.maskc a0, a1, a0
+; RV64IMXVTCONDOPS-NEXT:    slli a0, a0, 1
 ; RV64IMXVTCONDOPS-NEXT:    addi a0, a0, 2047
 ; RV64IMXVTCONDOPS-NEXT:    ret
 ;
 ; CHECKZICOND-LABEL: select_cst5_invert:
 ; CHECKZICOND:       # %bb.0:
-; CHECKZICOND-NEXT:    li a1, 2
-; CHECKZICOND-NEXT:    czero.eqz a0, a1, a0
+; CHECKZICOND-NEXT:    slli a0, a0, 1
 ; CHECKZICOND-NEXT:    addi a0, a0, 2047
 ; CHECKZICOND-NEXT:    ret
   %ret = select i1 %cond, i32 2049, i32 2047
@@ -1873,17 +1871,21 @@ define i32 @select_cst_diff2_invert(i1 zeroext %cond) {
 ;
 ; RV64IMXVTCONDOPS-LABEL: select_cst_diff2_invert:
 ; RV64IMXVTCONDOPS:       # %bb.0:
-; RV64IMXVTCONDOPS-NEXT:    li a1, -2
-; RV64IMXVTCONDOPS-NEXT:    vt.maskcn a0, a1, a0
-; RV64IMXVTCONDOPS-NEXT:    addi a0, a0, 122
+; RV64IMXVTCONDOPS-NEXT:    slli a0, a0, 1
+; RV64IMXVTCONDOPS-NEXT:    addiw a0, a0, 120
 ; RV64IMXVTCONDOPS-NEXT:    ret
 ;
-; CHECKZICOND-LABEL: select_cst_diff2_invert:
-; CHECKZICOND:       # %bb.0:
-; CHECKZICOND-NEXT:    li a1, -2
-; CHECKZICOND-NEXT:    czero.nez a0, a1, a0
-; CHECKZICOND-NEXT:    addi a0, a0, 122
-; CHECKZICOND-NEXT:    ret
+; RV32IMZICOND-LABEL: select_cst_diff2_invert:
+; RV32IMZICOND:       # %bb.0:
+; RV32IMZICOND-NEXT:    slli a0, a0, 1
+; RV32IMZICOND-NEXT:    addi a0, a0, 120
+; RV32IMZICOND-NEXT:    ret
+;
+; RV64IMZICOND-LABEL: select_cst_diff2_invert:
+; RV64IMZICOND:       # %bb.0:
+; RV64IMZICOND-NEXT:    slli a0, a0, 1
+; RV64IMZICOND-NEXT:    addiw a0, a0, 120
+; RV64IMZICOND-NEXT:    ret
   %ret = select i1 %cond, i32 122, i32 120
   ret i32 %ret
 }
@@ -1911,16 +1913,14 @@ define i32 @select_cst_diff4(i1 zeroext %cond) {
 ;
 ; RV64IMXVTCONDOPS-LABEL: select_cst_diff4:
 ; RV64IMXVTCONDOPS:       # %bb.0:
-; RV64IMXVTCONDOPS-NEXT:    li a1, -4
-; RV64IMXVTCONDOPS-NEXT:    vt.maskcn a0, a1, a0
-; RV64IMXVTCONDOPS-NEXT:    addi a0, a0, 10
+; RV64IMXVTCONDOPS-NEXT:    slli a0, a0, 2
+; RV64IMXVTCONDOPS-NEXT:    addi a0, a0, 6
 ; RV64IMXVTCONDOPS-NEXT:    ret
 ;
 ; CHECKZICOND-LABEL: select_cst_diff4:
 ; CHECKZICOND:       # %bb.0:
-; CHECKZICOND-NEXT:    li a1, -4
-; CHECKZICOND-NEXT:    czero.nez a0, a1, a0
-; CHECKZICOND-NEXT:    addi a0, a0, 10
+; CHECKZICOND-NEXT:    slli a0, a0, 2
+; CHECKZICOND-NEXT:    addi a0, a0, 6
 ; CHECKZICOND-NEXT:    ret
   %ret = select i1 %cond, i32 10, i32 6
   ret i32 %ret
@@ -1987,17 +1987,21 @@ define i32 @select_cst_diff8(i1 zeroext %cond) {
 ;
 ; RV64IMXVTCONDOPS-LABEL: select_cst_diff8:
 ; RV64IMXVTCONDOPS:       # %bb.0:
-; RV64IMXVTCONDOPS-NEXT:    li a1, -8
-; RV64IMXVTCONDOPS-NEXT:    vt.maskcn a0, a1, a0
-; RV64IMXVTCONDOPS-NEXT:    addi a0, a0, 14
+; RV64IMXVTCONDOPS-NEXT:    slli a0, a0, 3
+; RV64IMXVTCONDOPS-NEXT:    addiw a0, a0, 6
 ; RV64IMXVTCONDOPS-NEXT:    ret
 ;
-; CHECKZICOND-LABEL: select_cst_diff8:
-; CHECKZICOND:       # %bb.0:
-; CHECKZICOND-NEXT:    li a1, -8
-; CHECKZICOND-NEXT:    czero.nez a0, a1, a0
-; CHECKZICOND-NEXT:    addi a0, a0, 14
-; CHECKZICOND-NEXT:    ret
+; RV32IMZICOND-LABEL: select_cst_diff8:
+; RV32IMZICOND:       # %bb.0:
+; RV32IMZICOND-NEXT:    slli a0, a0, 3
+; RV32IMZICOND-NEXT:    addi a0, a0, 6
+; RV32IMZICOND-NEXT:    ret
+;
+; RV64IMZICOND-LABEL: select_cst_diff8:
+; RV64IMZICOND:       # %bb.0:
+; RV64IMZICOND-NEXT:    slli a0, a0, 3
+; RV64IMZICOND-NEXT:    addiw a0, a0, 6
+; RV64IMZICOND-NEXT:    ret
   %ret = select i1 %cond, i32 14, i32 6
   ret i32 %ret
 }
@@ -2071,17 +2075,21 @@ define i32 @select_cst_diff1024(i1 zeroext %cond) {
 ;
 ; RV64IMXVTCONDOPS-LABEL: select_cst_diff1024:
 ; RV64IMXVTCONDOPS:       # %bb.0:
-; RV64IMXVTCONDOPS-NEXT:    li a1, -1024
-; RV64IMXVTCONDOPS-NEXT:    vt.maskcn a0, a1, a0
-; RV64IMXVTCONDOPS-NEXT:    addi a0, a0, 1030
+; RV64IMXVTCONDOPS-NEXT:    slli a0, a0, 10
+; RV64IMXVTCONDOPS-NEXT:    addiw a0, a0, 6
 ; RV64IMXVTCONDOPS-NEXT:    ret
 ;
-; CHECKZICOND-LABEL: select_cst_diff1024:
-; CHECKZICOND:       # %bb.0:
-; CHECKZICOND-NEXT:    li a1, -1024
-; CHECKZICOND-NEXT:    czero.nez a0, a1, a0
-; CHECKZICOND-NEXT:    addi a0, a0, 1030
-; CHECKZICOND-NEXT:    ret
+; RV32IMZICOND-LABEL: select_cst_diff1024:
+; RV32IMZICOND:       # %bb.0:
+; RV32IMZICOND-NEXT:    slli a0, a0, 10
+; RV32IMZICOND-NEXT:    addi a0, a0, 6
+; RV32IMZICOND-NEXT:    ret
+;
+; RV64IMZICOND-LABEL: select_cst_diff1024:
+; RV64IMZICOND:       # %bb.0:
+; RV64IMZICOND-NEXT:    slli a0, a0, 10
+; RV64IMZICOND-NEXT:    addiw a0, a0, 6
+; RV64IMZICOND-NEXT:    ret
   %ret = select i1 %cond, i32 1030, i32 6
   ret i32 %ret
 }

@dtcxzyw dtcxzyw requested review from wangpc-pp and removed request for pcwang-thead August 27, 2025 15:50

@topperc topperc left a comment


LGTM

@preames preames merged commit 811880a into llvm:main Aug 27, 2025
11 checks passed
@preames preames deleted the pr-riscv-select-of-constant-shift branch August 27, 2025 17:31
preames added a commit to preames/llvm-project that referenced this pull request Aug 28, 2025
This case is the inverse of the one introduced in llvm#155644.  The
complexity with the inversion is that we need to also invert the
condition before shifting it.  I had originally planned to do so
only when the condition was "cheaply" invertible (i.e. didn't
require the XORI), but when looking more closely at the diffs I
noticed that while the XORI prevents this from being an
instruction-count improvement and slightly lengthens the critical
path, it does still reduce the number of registers needed.
preames added a commit that referenced this pull request Aug 28, 2025
…155845)

This case is the inverse of the one introduced in #155644. The
complexity with the inversion is that we need to also invert the
condition before shifting it. I had originally planned to do so only
when the condition was "cheaply" invertible (i.e. didn't require the
XORI), but when looking more closely at the diffs I noticed that while
the XORI prevents this from being an instruction-count improvement and
slightly lengthens the critical path, it does still reduce the number
of registers needed.
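
The inverted form discussed in this commit message can be sketched the same way (again a hypothetical helper, not the committed code): when the power-of-two delta is FalseVal - TrueVal, the condition bit is flipped first, which is where the XORI comes from.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of the inverted case: flip the condition (the
// XORI in the generated code), then shift and add as before.
int64_t selectInvertedViaShiftAdd(bool Cond, int64_t TrueVal,
                                  int64_t FalseVal) {
  uint64_t Diff = static_cast<uint64_t>(FalseVal - TrueVal);
  assert(Diff != 0 && (Diff & (Diff - 1)) == 0 &&
         "power-of-two delta expected");
  unsigned Log2 = 0;
  while ((Diff >> Log2) != 1)
    ++Log2;
  // (Cond ^ 1) is 1 exactly when the original condition is false,
  // selecting FalseVal = TrueVal + Diff in that case.
  return TrueVal + (static_cast<int64_t>(Cond ^ 1) << Log2);
}
```

The extra XORI is the one-instruction cost the commit message weighs against the register-pressure win.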