[RISCV] Select and/or/xor with certain constants to Zbb ANDN/ORN/XNOR #120221
Conversation
I'd like to implement an optimization so that

long orlow(long x) {
    return x | ((1L << 24) - 1);
}

which is now compiled to a lui + addi + or sequence, gets compiled to a lui + orn pair instead, and similarly for AND and XOR. I'm trying to implement this with TableGen patterns, but I'm new to that technology and got stuck on an error. Please let me know how I can fix it, or whether this approach is wrong altogether.
Here's a rough sketch using a ComplexPattern instead of PatLeaf/SDNodeXForm. This lets us reuse
Here's another case that might be useful to optimize: https://godbolt.org/z/MEzP15sas it already generates a
Force-pushed because of a precommit test.
@@ -21,8 +21,7 @@ define i1 @pr84653(i32 %x) {
; CHECK-ZBB: # %bb.0:
; CHECK-ZBB-NEXT: sext.w a1, a0
; CHECK-ZBB-NEXT: lui a2, 524288
The lui is unchanged here because this is an X ^ ((1 << 31) - 1) -> X ^ ~(1 << 31) case.
@@ -273,37 +240,21 @@ define i32 @compl(i32 %x) {
}
define i32 @orlow12(i32 %x) {
With Zbs this is:

ori a0, a0, 2047
bseti a0, a0, 11

and is not affected by this change. Shall I add +zbs, or test both with and without?
I would add RUN lines with +zbs, rather than adding +zbs to the existing RUN lines.
Done
This works great! I didn't even have to special-case
@llvm/pr-subscribers-backend-risc-v

Author: Piotr Fusik (pfusik)

Changes

Saves an addi.

Full diff: https://github.com/llvm/llvm-project/pull/120221.diff

5 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
index ccf34b8a6b2b02..d77e2b1421b136 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
@@ -3236,6 +3236,18 @@ bool RISCVDAGToDAGISel::selectSHXADD_UWOp(SDValue N, unsigned ShAmt,
return false;
}
+bool RISCVDAGToDAGISel::selectSImm32fff(SDValue N, SDValue &Val) {
+ if (!isa<ConstantSDNode>(N))
+ return false;
+
+ int64_t Imm = cast<ConstantSDNode>(N)->getSExtValue();
+ if (!(isInt<32>(Imm) && (Imm & 0xfff) == 0xfff && Imm != -1))
+ return false;
+
+ Val = selectImm(CurDAG, SDLoc(N), N->getSimpleValueType(0), ~Imm, *Subtarget);
+ return true;
+}
+
static bool vectorPseudoHasAllNBitUsers(SDNode *User, unsigned UserOpNo,
unsigned Bits,
const TargetInstrInfo *TII) {
diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.h b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.h
index 2e738d8d25a6dc..cfe07277fd9ddf 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.h
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.h
@@ -119,6 +119,8 @@ class RISCVDAGToDAGISel : public SelectionDAGISel {
return selectSHXADD_UWOp(N, ShAmt, Val);
}
+ bool selectSImm32fff(SDValue N, SDValue &Val);
+
bool hasAllNBitUsers(SDNode *Node, unsigned Bits,
const unsigned Depth = 0) const;
bool hasAllBUsers(SDNode *Node) const { return hasAllNBitUsers(Node, 8); }
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td b/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td
index a78091cd02a35f..e2050d100e4f88 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td
@@ -475,10 +475,16 @@ def : InstAlias<"zext.h $rd, $rs", (PACKW GPR:$rd, GPR:$rs, X0)>;
// Codegen patterns
//===----------------------------------------------------------------------===//
+def simm32fff : ComplexPattern<XLenVT, 1, "selectSImm32fff", [], [], 0>;
+
let Predicates = [HasStdExtZbbOrZbkb] in {
def : Pat<(XLenVT (and GPR:$rs1, (not GPR:$rs2))), (ANDN GPR:$rs1, GPR:$rs2)>;
def : Pat<(XLenVT (or GPR:$rs1, (not GPR:$rs2))), (ORN GPR:$rs1, GPR:$rs2)>;
def : Pat<(XLenVT (xor GPR:$rs1, (not GPR:$rs2))), (XNOR GPR:$rs1, GPR:$rs2)>;
+
+def : Pat<(XLenVT (and GPR:$rs1, simm32fff:$rs2)), (ANDN GPR:$rs1, simm32fff:$rs2)>;
+def : Pat<(XLenVT (or GPR:$rs1, simm32fff:$rs2)), (ORN GPR:$rs1, simm32fff:$rs2)>;
+def : Pat<(XLenVT (xor GPR:$rs1, simm32fff:$rs2)), (XNOR GPR:$rs1, simm32fff:$rs2)>;
} // Predicates = [HasStdExtZbbOrZbkb]
let Predicates = [HasStdExtZbbOrZbkb] in {
diff --git a/llvm/test/CodeGen/RISCV/pr84653_pr85190.ll b/llvm/test/CodeGen/RISCV/pr84653_pr85190.ll
index b1bba5fdc92116..30a93557347727 100644
--- a/llvm/test/CodeGen/RISCV/pr84653_pr85190.ll
+++ b/llvm/test/CodeGen/RISCV/pr84653_pr85190.ll
@@ -21,8 +21,7 @@ define i1 @pr84653(i32 %x) {
; CHECK-ZBB: # %bb.0:
; CHECK-ZBB-NEXT: sext.w a1, a0
; CHECK-ZBB-NEXT: lui a2, 524288
-; CHECK-ZBB-NEXT: addi a2, a2, -1
-; CHECK-ZBB-NEXT: xor a0, a0, a2
+; CHECK-ZBB-NEXT: xnor a0, a0, a2
; CHECK-ZBB-NEXT: sext.w a0, a0
; CHECK-ZBB-NEXT: max a0, a0, zero
; CHECK-ZBB-NEXT: slt a0, a0, a1
@@ -82,8 +81,7 @@ define i1 @select_to_or(i32 %x) {
; CHECK-ZBB: # %bb.0:
; CHECK-ZBB-NEXT: sext.w a1, a0
; CHECK-ZBB-NEXT: lui a2, 524288
-; CHECK-ZBB-NEXT: addi a2, a2, -1
-; CHECK-ZBB-NEXT: xor a0, a0, a2
+; CHECK-ZBB-NEXT: xnor a0, a0, a2
; CHECK-ZBB-NEXT: sext.w a0, a0
; CHECK-ZBB-NEXT: min a0, a0, zero
; CHECK-ZBB-NEXT: slt a0, a0, a1
diff --git a/llvm/test/CodeGen/RISCV/zbb-logic-neg-imm.ll b/llvm/test/CodeGen/RISCV/zbb-logic-neg-imm.ll
new file mode 100644
index 00000000000000..86c9676bec1280
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/zbb-logic-neg-imm.ll
@@ -0,0 +1,260 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=riscv32 -mattr=+zbb -verify-machineinstrs < %s \
+; RUN: | FileCheck %s --check-prefixes=CHECK,RV32
+; RUN: llc -mtriple=riscv64 -mattr=+zbb -verify-machineinstrs < %s \
+; RUN: | FileCheck %s --check-prefixes=CHECK,RV64
+
+define i32 @and0xabcdefff(i32 %x) {
+; CHECK-LABEL: and0xabcdefff:
+; CHECK: # %bb.0:
+; CHECK-NEXT: lui a1, 344865
+; CHECK-NEXT: andn a0, a0, a1
+; CHECK-NEXT: ret
+ %and = and i32 %x, -1412567041
+ ret i32 %and
+}
+
+define i32 @orlow13(i32 %x) {
+; CHECK-LABEL: orlow13:
+; CHECK: # %bb.0:
+; CHECK-NEXT: lui a1, 1048574
+; CHECK-NEXT: orn a0, a0, a1
+; CHECK-NEXT: ret
+ %or = or i32 %x, 8191
+ ret i32 %or
+}
+
+define i64 @orlow24(i64 %x) {
+; RV32-LABEL: orlow24:
+; RV32: # %bb.0:
+; RV32-NEXT: lui a2, 1044480
+; RV32-NEXT: orn a0, a0, a2
+; RV32-NEXT: ret
+;
+; RV64-LABEL: orlow24:
+; RV64: # %bb.0:
+; RV64-NEXT: lui a1, 1044480
+; RV64-NEXT: orn a0, a0, a1
+; RV64-NEXT: ret
+ %or = or i64 %x, 16777215
+ ret i64 %or
+}
+
+define i32 @xorlow16(i32 %x) {
+; CHECK-LABEL: xorlow16:
+; CHECK: # %bb.0:
+; CHECK-NEXT: lui a1, 1048560
+; CHECK-NEXT: xnor a0, a0, a1
+; CHECK-NEXT: ret
+ %xor = xor i32 %x, 65535
+ ret i32 %xor
+}
+
+define i32 @xorlow31(i32 %x) {
+; CHECK-LABEL: xorlow31:
+; CHECK: # %bb.0:
+; CHECK-NEXT: lui a1, 524288
+; CHECK-NEXT: xnor a0, a0, a1
+; CHECK-NEXT: ret
+ %xor = xor i32 %x, 2147483647
+ ret i32 %xor
+}
+
+define i32 @oraddlow16(i32 %x) {
+; RV32-LABEL: oraddlow16:
+; RV32: # %bb.0:
+; RV32-NEXT: lui a1, 1048560
+; RV32-NEXT: orn a0, a0, a1
+; RV32-NEXT: lui a1, 16
+; RV32-NEXT: addi a1, a1, -1
+; RV32-NEXT: add a0, a0, a1
+; RV32-NEXT: ret
+;
+; RV64-LABEL: oraddlow16:
+; RV64: # %bb.0:
+; RV64-NEXT: lui a1, 1048560
+; RV64-NEXT: orn a0, a0, a1
+; RV64-NEXT: lui a1, 16
+; RV64-NEXT: addi a1, a1, -1
+; RV64-NEXT: addw a0, a0, a1
+; RV64-NEXT: ret
+ %or = or i32 %x, 65535
+ %add = add nsw i32 %or, 65535
+ ret i32 %add
+}
+
+define i32 @addorlow16(i32 %x) {
+; RV32-LABEL: addorlow16:
+; RV32: # %bb.0:
+; RV32-NEXT: lui a1, 16
+; RV32-NEXT: addi a1, a1, -1
+; RV32-NEXT: add a0, a0, a1
+; RV32-NEXT: lui a1, 1048560
+; RV32-NEXT: orn a0, a0, a1
+; RV32-NEXT: ret
+;
+; RV64-LABEL: addorlow16:
+; RV64: # %bb.0:
+; RV64-NEXT: lui a1, 16
+; RV64-NEXT: addi a1, a1, -1
+; RV64-NEXT: addw a0, a0, a1
+; RV64-NEXT: lui a1, 1048560
+; RV64-NEXT: orn a0, a0, a1
+; RV64-NEXT: ret
+ %add = add nsw i32 %x, 65535
+ %or = or i32 %add, 65535
+ ret i32 %or
+}
+
+define i32 @andxorlow16(i32 %x) {
+; RV32-LABEL: andxorlow16:
+; RV32: # %bb.0:
+; RV32-NEXT: lui a1, 16
+; RV32-NEXT: addi a1, a1, -1
+; RV32-NEXT: andn a0, a1, a0
+; RV32-NEXT: ret
+;
+; RV64-LABEL: andxorlow16:
+; RV64: # %bb.0:
+; RV64-NEXT: lui a1, 16
+; RV64-NEXT: addiw a1, a1, -1
+; RV64-NEXT: andn a0, a1, a0
+; RV64-NEXT: ret
+ %and = and i32 %x, 65535
+ %xor = xor i32 %and, 65535
+ ret i32 %xor
+}
+
+define void @orarray100(ptr %a) {
+; RV32-LABEL: orarray100:
+; RV32: # %bb.0: # %entry
+; RV32-NEXT: li a1, 0
+; RV32-NEXT: li a2, 0
+; RV32-NEXT: lui a3, 1048560
+; RV32-NEXT: .LBB8_1: # %for.body
+; RV32-NEXT: # =>This Inner Loop Header: Depth=1
+; RV32-NEXT: slli a4, a1, 2
+; RV32-NEXT: addi a1, a1, 1
+; RV32-NEXT: add a4, a0, a4
+; RV32-NEXT: lw a5, 0(a4)
+; RV32-NEXT: seqz a6, a1
+; RV32-NEXT: add a2, a2, a6
+; RV32-NEXT: xori a6, a1, 100
+; RV32-NEXT: orn a5, a5, a3
+; RV32-NEXT: or a6, a6, a2
+; RV32-NEXT: sw a5, 0(a4)
+; RV32-NEXT: bnez a6, .LBB8_1
+; RV32-NEXT: # %bb.2: # %for.cond.cleanup
+; RV32-NEXT: ret
+;
+; RV64-LABEL: orarray100:
+; RV64: # %bb.0: # %entry
+; RV64-NEXT: addi a1, a0, 400
+; RV64-NEXT: lui a2, 1048560
+; RV64-NEXT: .LBB8_1: # %for.body
+; RV64-NEXT: # =>This Inner Loop Header: Depth=1
+; RV64-NEXT: lw a3, 0(a0)
+; RV64-NEXT: orn a3, a3, a2
+; RV64-NEXT: sw a3, 0(a0)
+; RV64-NEXT: addi a0, a0, 4
+; RV64-NEXT: bne a0, a1, .LBB8_1
+; RV64-NEXT: # %bb.2: # %for.cond.cleanup
+; RV64-NEXT: ret
+entry:
+ br label %for.body
+
+for.cond.cleanup:
+ ret void
+
+for.body:
+ %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
+ %arrayidx = getelementptr inbounds nuw i32, ptr %a, i64 %indvars.iv
+ %1 = load i32, ptr %arrayidx, align 4
+ %or = or i32 %1, 65535
+ store i32 %or, ptr %arrayidx, align 4
+ %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+ %exitcond.not = icmp eq i64 %indvars.iv.next, 100
+ br i1 %exitcond.not, label %for.cond.cleanup, label %for.body
+}
+
+define void @orarray3(ptr %a) {
+; CHECK-LABEL: orarray3:
+; CHECK: # %bb.0:
+; CHECK-NEXT: lw a1, 0(a0)
+; CHECK-NEXT: lw a2, 4(a0)
+; CHECK-NEXT: lw a3, 8(a0)
+; CHECK-NEXT: lui a4, 1048560
+; CHECK-NEXT: orn a1, a1, a4
+; CHECK-NEXT: orn a2, a2, a4
+; CHECK-NEXT: orn a3, a3, a4
+; CHECK-NEXT: sw a1, 0(a0)
+; CHECK-NEXT: sw a2, 4(a0)
+; CHECK-NEXT: sw a3, 8(a0)
+; CHECK-NEXT: ret
+ %1 = load i32, ptr %a, align 4
+ %or = or i32 %1, 65535
+ store i32 %or, ptr %a, align 4
+ %arrayidx.1 = getelementptr inbounds nuw i8, ptr %a, i64 4
+ %2 = load i32, ptr %arrayidx.1, align 4
+ %or.1 = or i32 %2, 65535
+ store i32 %or.1, ptr %arrayidx.1, align 4
+ %arrayidx.2 = getelementptr inbounds nuw i8, ptr %a, i64 8
+ %3 = load i32, ptr %arrayidx.2, align 4
+ %or.2 = or i32 %3, 65535
+ store i32 %or.2, ptr %arrayidx.2, align 4
+ ret void
+}
+
+define i32 @andlow16(i32 %x) {
+; CHECK-LABEL: andlow16:
+; CHECK: # %bb.0:
+; CHECK-NEXT: zext.h a0, a0
+; CHECK-NEXT: ret
+ %and = and i32 %x, 65535
+ ret i32 %and
+}
+
+define i32 @andlow24(i32 %x) {
+; RV32-LABEL: andlow24:
+; RV32: # %bb.0:
+; RV32-NEXT: slli a0, a0, 8
+; RV32-NEXT: srli a0, a0, 8
+; RV32-NEXT: ret
+;
+; RV64-LABEL: andlow24:
+; RV64: # %bb.0:
+; RV64-NEXT: slli a0, a0, 40
+; RV64-NEXT: srli a0, a0, 40
+; RV64-NEXT: ret
+ %and = and i32 %x, 16777215
+ ret i32 %and
+}
+
+define i32 @compl(i32 %x) {
+; CHECK-LABEL: compl:
+; CHECK: # %bb.0:
+; CHECK-NEXT: not a0, a0
+; CHECK-NEXT: ret
+ %not = xor i32 %x, -1
+ ret i32 %not
+}
+
+define i32 @orlow12(i32 %x) {
+; CHECK-LABEL: orlow12:
+; CHECK: # %bb.0:
+; CHECK-NEXT: lui a1, 1048575
+; CHECK-NEXT: orn a0, a0, a1
+; CHECK-NEXT: ret
+ %or = or i32 %x, 4095
+ ret i32 %or
+}
+
+define i32 @xorlow12(i32 %x) {
+; CHECK-LABEL: xorlow12:
+; CHECK: # %bb.0:
+; CHECK-NEXT: lui a1, 1048575
+; CHECK-NEXT: xnor a0, a0, a1
+; CHECK-NEXT: ret
+ %xor = xor i32 %x, 4095
+ ret i32 %xor
+}
I can work on it.
✅ With the latest revision this PR passed the C/C++ code formatter.
return false;

int64_t Imm = cast<ConstantSDNode>(N)->getSExtValue();
if (!(isInt<32>(Imm) && (Imm & 0xfff) == 0xfff && Imm != -1))
Push the ! into the individual conditions and use ||.
Done
Added Zbs testing. Added an andimm64 test. Squashed the change so that the effect on tests is clear.
; RV64-NEXT: lui a1, 65281
; RV64-NEXT: slli a1, a1, 4
; RV64-NEXT: addi a1, a1, -1
; RV64-NEXT: and a0, a0, a1
Here it's possible to optimize a non-int32 constant. Getting the condition right in general seems tricky. Can I call RISCVMatInt for Imm and ~Imm to see which one is shorter?
This wouldn't be the only place that compares two getIntMatCost results, so it should be fine, as long as you check the simpler exclusionary conditions first, maybe even including your isBitwiseLogicOp loop.
Done. Covered by the
I'm happy with this.
By default, LLVM will squash-merge, so can you pre-commit the new test file before landing this?
Do you mean create a new PR for the pre-commit test?
LGTM
if ((Imm & 0xfff) != 0xfff || Imm == -1)
  return false;

for (const SDNode *U : N->uses()) {
This needs to change to N->users() now. See e6b2495
If you have commit access you can directly push it without going through GitHub.
(and X, (C<<12|0xfff)) -> (ANDN X, ~C<<12)
(or X, (C<<12|0xfff)) -> (ORN X, ~C<<12)
(xor X, (C<<12|0xfff)) -> (XNOR X, ~C<<12)

Emits better code, typically by avoiding an `ADDI HI, -1` instruction.

Co-authored-by: Craig Topper <[email protected]>
Nice work!
Thank you.
This extends PR llvm#120221 to 64-bit constants that don't match the 12-low-bits-set pattern.
This extends PR llvm#120221 to vector instructions.
RV64 only. For 32-bit constants, a negated constant is never cheaper. This change is similar to how llvm#120221 selects inverted bitwise instructions.