[PowerPC] Exploit xxeval instruction for ternary patterns - ternary(A, X, and(B,C)) #141733


Merged: 2 commits merged into llvm:main on Jul 29, 2025

Conversation

tonykuttai
Contributor

@tonykuttai tonykuttai commented May 28, 2025

Description

Adds xxeval support for ternary operations of the form ternary(A, X, and(B,C)), where X is one of xor(B,C), nor(B,C), eqv(B,C), not(B), or not(C).

The ternary operations added, with the xxeval imm value each one requires:

Ternary Operator                  Imm Value
ternary(A, xor(B,C), and(B,C))    22
ternary(A, nor(B,C), and(B,C))    24
ternary(A, eqv(B,C), and(B,C))    25
ternary(A, not(C), and(B,C))      26
ternary(A, not(B), and(B,C))      28

e.g. xxeval XT,XA,XB,XC,22

  • performs XA ? xor(XB,XC) : and(XB,XC) and places the result in XT.
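
The imm values listed above can be cross-checked with a short truth-table sketch. It assumes the usual reading of the xxeval immediate, in which the three input bits (A, B, C) form an index from 0 to 7 and index 0 selects the most significant bit of the 8-bit imm. The helper names `xxeval_imm` and `ternary` are made up for illustration and are not part of LLVM.

```python
from itertools import product


def xxeval_imm(fn):
    """Build the 8-bit xxeval immediate for a 1-bit function fn(a, b, c).

    Assumption: truth-table index (a << 2) | (b << 1) | c selects imm bit
    7 - index, i.e. index 0 maps to the most significant bit.
    """
    imm = 0
    for a, b, c in product((0, 1), repeat=3):
        index = (a << 2) | (b << 1) | c
        imm |= (fn(a, b, c) & 1) << (7 - index)
    return imm


def ternary(x, y):
    """ternary(A, X, Y): take X(b, c) where a is set, otherwise Y(b, c)."""
    return lambda a, b, c: x(b, c) if a else y(b, c)


# 1-bit versions of the operations named in the table.
AND = lambda b, c: b & c
XOR = lambda b, c: b ^ c
NOR = lambda b, c: 1 - (b | c)
EQV = lambda b, c: 1 - (b ^ c)
NOT_B = lambda b, c: 1 - b
NOT_C = lambda b, c: 1 - c

imms = {
    name: xxeval_imm(ternary(x, AND))
    for name, x in [("xor", XOR), ("nor", NOR), ("eqv", EQV),
                    ("not_c", NOT_C), ("not_b", NOT_B)]
}
print(imms)
```

Under that assumption the five computed values agree with the table (22, 24, 25, 26, 28).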

@tonykuttai tonykuttai marked this pull request as ready for review May 28, 2025 10:46
@llvmbot
Member

llvmbot commented May 28, 2025

@llvm/pr-subscribers-backend-powerpc

Author: Tony Varghese (tonykuttai)

Changes

Description

The xxeval instruction can be used to implement ternary patterns.

This change supports the following patterns:

  • ternary(A, X, and(B,C))
  • ternary(A, X, B)
  • ternary(A, X, C)
  • ternary(A, X, xor(B,C))

This change groups the ternary operations by pattern and corresponding imm value, so that each group maps onto a TableGen multiclass. The following patterns are handled:
Ternary Operators                     Imm Values
ternary(A,  xor(B,C),   and(B,C))	22
ternary(A,  nor(B,C),   and(B,C))	24
ternary(A,  eqv(B,C),   and(B,C))	25
ternary(A,  not(C),     and(B,C))	26
ternary(A,  not(B),     and(B,C))	28

ternary(A,  and(B,C),   B)	        49
ternary(A,  nor(B,C),   B)	        56
ternary(A,  eqv(B,C),   B)	        57
ternary(A,  not(C),     B)	        58
ternary(A,  nand(B,C),  B)	        62

ternary(A,  and(B,C),   C)	        81
ternary(A,  nor(B,C),   C)	        88
ternary(A,  eqv(B,C),   C)	        89
ternary(A,  nand(B,C),  C)	        94

ternary(A,  and(B,C),   xor(B,C))	97
ternary(A,  B,          xor(B,C))	99
ternary(A,  C,          xor(B,C))	101
ternary(A,  or(B,C),    xor(B,C))	103
ternary(A,  nor(B,C),   xor(B,C))	104

ternary(A, not(B), C) (imm 92) is omitted because it is symmetrical to ternary(A, not(C), B) (imm 58).
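
One way to read the table: each imm splits into two 4-bit truth tables over (B, C). The high nibble encodes the value chosen when A is 0 and the low nibble the value chosen when A is 1. The sketch below rebuilds every listed value that way and shows that exchanging B and C turns 58 into 92. It assumes the MSB-first truth-table convention for the xxeval immediate; `NIBBLE`, `ternary_imm` and `swap_bc` are illustrative names, not anything from the patch.

```python
# 4-bit truth tables over (B, C), MSB first: entries for BC = 00, 01, 10, 11.
NIBBLE = {
    "and": 0b0001, "xor": 0b0110, "or": 0b0111, "nor": 0b1000,
    "eqv": 0b1001, "nand": 0b1110, "B": 0b0011, "C": 0b0101,
    "not(B)": 0b1100, "not(C)": 0b1010,
}


def ternary_imm(x, y):
    """imm for ternary(A, x, y): y fills the A=0 half, x the A=1 half."""
    return (NIBBLE[y] << 4) | NIBBLE[x]


def swap_bc(imm):
    """imm of the same function with operands B and C exchanged."""
    out = 0
    for index in range(8):
        a, b, c = index >> 2, (index >> 1) & 1, index & 1
        swapped = (a << 2) | (c << 1) | b
        out |= ((imm >> (7 - index)) & 1) << (7 - swapped)
    return out


# (X, Y) -> imm, copied from the table above.
TABLE = {
    ("xor", "and"): 22, ("nor", "and"): 24, ("eqv", "and"): 25,
    ("not(C)", "and"): 26, ("not(B)", "and"): 28,
    ("and", "B"): 49, ("nor", "B"): 56, ("eqv", "B"): 57,
    ("not(C)", "B"): 58, ("nand", "B"): 62,
    ("and", "C"): 81, ("nor", "C"): 88, ("eqv", "C"): 89, ("nand", "C"): 94,
    ("and", "xor"): 97, ("B", "xor"): 99, ("C", "xor"): 101,
    ("or", "xor"): 103, ("nor", "xor"): 104,
}

mismatches = [key for key, imm in TABLE.items() if ternary_imm(*key) != imm]
print("mismatches:", mismatches)
print("swap_bc(58) =", swap_bc(ternary_imm("not(C)", "B")))
```

With this reading, every row of the table is reproduced, and the omitted ternary(A, not(B), C) case is the B/C-swapped form of imm 58.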


Patch is 33.13 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/141733.diff

5 Files Affected:

  • (modified) llvm/lib/Target/PowerPC/PPCInstrP10.td (+170-32)
  • (added) llvm/test/CodeGen/PowerPC/xxeval-vselect-x-and.ll (+134)
  • (added) llvm/test/CodeGen/PowerPC/xxeval-vselect-x-b.ll (+126)
  • (added) llvm/test/CodeGen/PowerPC/xxeval-vselect-x-c.ll (+104)
  • (added) llvm/test/CodeGen/PowerPC/xxeval-vselect-x-xor.ll (+128)
diff --git a/llvm/lib/Target/PowerPC/PPCInstrP10.td b/llvm/lib/Target/PowerPC/PPCInstrP10.td
index 39a1ab0d388a7..99f77e18e43a0 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrP10.td
+++ b/llvm/lib/Target/PowerPC/PPCInstrP10.td
@@ -2159,8 +2159,133 @@ let AddedComplexity = 400, Predicates = [IsISA3_1, HasVSX] in {
                                (COPY_TO_REGCLASS $VRB, VSRC), 2)))>;
 }
 
-class xxevalPattern <dag pattern, bits<8> imm> :
-  Pat<(v4i32 pattern), (XXEVAL $vA, $vB, $vC, imm)> {}
+class xxevalPattern <ValueType vt, dag pattern, bits<8> imm> :
+  Pat<(vt pattern), (XXEVAL $vA, $vB, $vC, imm)> {}
+
+class DagUnaryVNot<ValueType vt, string opstr>{
+  // Defines a class that returns the UnaryVNot dag for an operand string based on a value type.
+  dag res = !cond(
+          !eq(vt, v4i32) : !dag(vnot, [v4i32], [opstr]),
+          !eq(vt, v2i64) : (v2i64 (bitconvert (vnot (v4i32 !dag(bitconvert, [v2i64], [opstr])))))
+          );
+}
+
+class DagCondVNot<dag d, bit negate> {
+  // Defines a class that generates a vnot around the dag.
+  dag res = !if(!ne(negate, 0),
+               (vnot d),
+               d);
+}
+
+class XXEvalUnaryNot<ValueType vt> {
+  // Defines a wrapper class for unary NOT operations for v4i32 and v2i64 vector types.
+  // Unary NOT on operand B or C based on value type.
+  dag opB = DagUnaryVNot<vt, "vB">.res;
+  dag opC = DagUnaryVNot<vt, "vC">.res;
+}
+
+class XXEvalBinaryPattern<ValueType vt, SDPatternOperator op, bit notResult = 0> {
+  // Defines a wrapper class for binary patterns with optional NOT on result.
+  // Generate op pattern with optional NOT wrapping for result depending on "notResult".
+      dag opPat = !cond(
+                !eq(vt, v4i32) : DagCondVNot<(op v4i32:$vB, v4i32:$vC), notResult>.res,
+                !eq(vt, v2i64) : (v2i64 (bitconvert DagCondVNot<(op
+                                      (v4i32 (bitconvert v2i64:$vB)),
+                                      (v4i32 (bitconvert v2i64:$vC))), notResult>.res))
+                );
+}
+
+multiclass XXEvalVSelectWithXAnd<ValueType vt, bits<8> baseImm> {
+  // Multiclass for Ternary(A, X, and(B, C)) style patterns.
+  // Ternary(A, xor(B,C), and(B,C)) => imm: baseImm
+  def : xxevalPattern<vt, 
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, xor>.opPat, XXEvalBinaryPattern<vt, and>.opPat), 
+        baseImm>;
+  // Ternary(A, nor(B,C), and(B,C)) => imm: baseImm + 2
+  def : xxevalPattern<vt, 
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, or, 1>.opPat, XXEvalBinaryPattern<vt, and>.opPat), 
+        !add(baseImm, 2)>;
+  // Ternary(A, eqv(B,C), and(B,C)) => imm: baseImm + 3
+  def : xxevalPattern<vt, 
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, xor, 1>.opPat, XXEvalBinaryPattern<vt, and>.opPat), 
+        !add(baseImm, 3)>;
+  // Ternary(A, not(C), and(B,C)) => imm: baseImm + 4
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalUnaryNot<vt>.opC, XXEvalBinaryPattern<vt, and>.opPat), 
+        !add(baseImm, 4)>;
+  // Ternary(A, not(B), and(B,C)) => imm: baseImm + 6
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalUnaryNot<vt>.opB, XXEvalBinaryPattern<vt, and>.opPat), 
+        !add(baseImm, 6)>;
+}
+
+multiclass XXEvalVSelectWithXB<ValueType vt, bits<8> baseImm>{
+  // Multiclass for Ternary(A, X, B) style patterns
+  // Ternary(A, and(B,C), B) => imm: baseImm
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, and>.opPat, vt:$vB), 
+        baseImm>;
+  // Ternary(A, nor(B,C), B) => imm: baseImm + 7
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, or, 1>.opPat, vt:$vB), 
+        !add(baseImm, 7)>;
+  // Ternary(A, eqv(B,C), B) => imm: baseImm + 8
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, xor, 1>.opPat, vt:$vB), 
+        !add(baseImm, 8)>;
+  // Ternary(A, not(C), B) => imm: baseImm + 9
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalUnaryNot<vt>.opC, vt:$vB), 
+        !add(baseImm, 9)>;
+  // Ternary(A, nand(B,C), B) => imm: baseImm + 13
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, and, 1>.opPat, vt:$vB), 
+        !add(baseImm, 13)>;
+}
+
+multiclass XXEvalVSelectWithXC<ValueType vt, bits<8> baseImm>{
+  // Multiclass for Ternary(A, X, C) style patterns
+  // Ternary(A, and(B,C), C) => imm: baseImm
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, and>.opPat, vt:$vC), 
+        baseImm>;
+  // Ternary(A, nor(B,C), C) => imm: baseImm + 7
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, or, 1>.opPat, vt:$vC), 
+        !add(baseImm, 7)>;
+  // Ternary(A, eqv(B,C), C) => imm: baseImm + 8
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, xor, 1>.opPat, vt:$vC), 
+        !add(baseImm, 8)>;
+  // Ternary(A, nand(B,C), C) => imm: baseImm + 13
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, and, 1>.opPat, vt:$vC), 
+        !add(baseImm, 13)>;
+}
+
+multiclass XXEvalVSelectWithXXor<ValueType vt, bits<8> baseImm>{
+  // Multiclass for Ternary(A, X, xor(B,C)) style patterns
+  // Ternary(A, and(B,C), xor(B,C)) => imm: baseImm
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, and>.opPat, XXEvalBinaryPattern<vt, xor>.opPat), 
+        baseImm>;
+  // Ternary(A, B, xor(B,C)) => imm: baseImm + 2
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, vt:$vB, XXEvalBinaryPattern<vt, xor>.opPat), 
+        !add(baseImm, 2)>;
+  // Ternary(A, C, xor(B,C)) => imm: baseImm + 4
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, vt:$vC, XXEvalBinaryPattern<vt, xor>.opPat), 
+        !add(baseImm, 4)>;
+  // Ternary(A, or(B,C), xor(B,C)) => imm: baseImm + 6
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, or>.opPat, XXEvalBinaryPattern<vt, xor>.opPat), 
+        !add(baseImm, 6)>;
+  // Ternary(A, nor(B,C), xor(B,C)) => imm: baseImm + 7
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, or, 1>.opPat, XXEvalBinaryPattern<vt, xor>.opPat), 
+        !add(baseImm, 7)>; 
+}
 
 let Predicates = [PrefixInstrs, HasP10Vector] in {
   let AddedComplexity = 400 in {
@@ -2192,83 +2317,96 @@ let Predicates = [PrefixInstrs, HasP10Vector] in {
     // Anonymous patterns for XXEVAL
     // AND
     // and(A, B, C)
-    def : xxevalPattern<(and v4i32:$vA, (and v4i32:$vB, v4i32:$vC)), 1>;
+    def : xxevalPattern<v4i32, (and v4i32:$vA, (and v4i32:$vB, v4i32:$vC)), 1>;
     // and(A, xor(B, C))
-    def : xxevalPattern<(and v4i32:$vA, (xor v4i32:$vB, v4i32:$vC)), 6>;
+    def : xxevalPattern<v4i32, (and v4i32:$vA, (xor v4i32:$vB, v4i32:$vC)), 6>;
     // and(A, or(B, C))
-    def : xxevalPattern<(and v4i32:$vA, (or v4i32:$vB, v4i32:$vC)), 7>;
+    def : xxevalPattern<v4i32, (and v4i32:$vA, (or v4i32:$vB, v4i32:$vC)), 7>;
     // and(A, nor(B, C))
-    def : xxevalPattern<(and v4i32:$vA, (vnot (or v4i32:$vB, v4i32:$vC))), 8>;
+    def : xxevalPattern<v4i32, (and v4i32:$vA, (vnot (or v4i32:$vB, v4i32:$vC))), 8>;
     // and(A, eqv(B, C))
-    def : xxevalPattern<(and v4i32:$vA, (vnot (xor v4i32:$vB, v4i32:$vC))), 9>;
+    def : xxevalPattern<v4i32, (and v4i32:$vA, (vnot (xor v4i32:$vB, v4i32:$vC))), 9>;
     // and(A, nand(B, C))
-    def : xxevalPattern<(and v4i32:$vA, (vnot (and v4i32:$vB, v4i32:$vC))), 14>;
+    def : xxevalPattern<v4i32, (and v4i32:$vA, (vnot (and v4i32:$vB, v4i32:$vC))), 14>;
 
     // NAND
     // nand(A, B, C)
-    def : xxevalPattern<(vnot (and v4i32:$vA, (and v4i32:$vB, v4i32:$vC))),
+    def : xxevalPattern<v4i32, (vnot (and v4i32:$vA, (and v4i32:$vB, v4i32:$vC))),
                          !sub(255, 1)>;
     // nand(A, xor(B, C))
-    def : xxevalPattern<(vnot (and v4i32:$vA, (xor v4i32:$vB, v4i32:$vC))),
+    def : xxevalPattern<v4i32, (vnot (and v4i32:$vA, (xor v4i32:$vB, v4i32:$vC))),
                          !sub(255, 6)>;
     // nand(A, or(B, C))
-    def : xxevalPattern<(vnot (and v4i32:$vA, (or v4i32:$vB, v4i32:$vC))),
+    def : xxevalPattern<v4i32, (vnot (and v4i32:$vA, (or v4i32:$vB, v4i32:$vC))),
                          !sub(255, 7)>;
     // nand(A, nor(B, C))
-    def : xxevalPattern<(or (vnot v4i32:$vA), (or v4i32:$vB, v4i32:$vC)),
+    def : xxevalPattern<v4i32, (or (vnot v4i32:$vA), (or v4i32:$vB, v4i32:$vC)),
                          !sub(255, 8)>;
     // nand(A, eqv(B, C))
-    def : xxevalPattern<(or (vnot v4i32:$vA), (xor v4i32:$vB, v4i32:$vC)),
+    def : xxevalPattern<v4i32, (or (vnot v4i32:$vA), (xor v4i32:$vB, v4i32:$vC)),
                          !sub(255, 9)>;
     // nand(A, nand(B, C))
-    def : xxevalPattern<(or (vnot v4i32:$vA), (and v4i32:$vB, v4i32:$vC)),
+    def : xxevalPattern<v4i32, (or (vnot v4i32:$vA), (and v4i32:$vB, v4i32:$vC)),
                          !sub(255, 14)>;
 
     // EQV
     // (eqv A, B, C)
-    def : xxevalPattern<(or (and v4i32:$vA, (and v4i32:$vB, v4i32:$vC)),
+    def : xxevalPattern<v4i32, (or (and v4i32:$vA, (and v4i32:$vB, v4i32:$vC)),
                             (vnot (or v4i32:$vA, (or v4i32:$vB, v4i32:$vC)))),
                          150>;
     // (eqv A, (and B, C))
-    def : xxevalPattern<(vnot (xor v4i32:$vA, (and v4i32:$vB, v4i32:$vC))), 225>;
+    def : xxevalPattern<v4i32, (vnot (xor v4i32:$vA, (and v4i32:$vB, v4i32:$vC))), 225>;
     // (eqv A, (or B, C))
-    def : xxevalPattern<(vnot (xor v4i32:$vA, (or v4i32:$vB, v4i32:$vC))), 135>;
+    def : xxevalPattern<v4i32, (vnot (xor v4i32:$vA, (or v4i32:$vB, v4i32:$vC))), 135>;
 
     // NOR
     // (nor A, B, C)
-    def : xxevalPattern<(vnot (or v4i32:$vA, (or v4i32:$vB, v4i32:$vC))), 128>;
+    def : xxevalPattern<v4i32, (vnot (or v4i32:$vA, (or v4i32:$vB, v4i32:$vC))), 128>;
     // (nor A, (and B, C))
-    def : xxevalPattern<(vnot (or v4i32:$vA, (and v4i32:$vB, v4i32:$vC))), 224>;
+    def : xxevalPattern<v4i32, (vnot (or v4i32:$vA, (and v4i32:$vB, v4i32:$vC))), 224>;
     // (nor A, (eqv B, C))
-    def : xxevalPattern<(and (vnot v4i32:$vA), (xor v4i32:$vB, v4i32:$vC)), 96>;
+    def : xxevalPattern<v4i32, (and (vnot v4i32:$vA), (xor v4i32:$vB, v4i32:$vC)), 96>;
     // (nor A, (nand B, C))
-    def : xxevalPattern<(and (vnot v4i32:$vA), (and v4i32:$vB, v4i32:$vC)), 16>;
+    def : xxevalPattern<v4i32, (and (vnot v4i32:$vA), (and v4i32:$vB, v4i32:$vC)), 16>;
     // (nor A, (nor B, C))
-    def : xxevalPattern<(and (vnot v4i32:$vA), (or v4i32:$vB, v4i32:$vC)), 112>;
+    def : xxevalPattern<v4i32, (and (vnot v4i32:$vA), (or v4i32:$vB, v4i32:$vC)), 112>;
     // (nor A, (xor B, C))
-    def : xxevalPattern<(vnot (or v4i32:$vA, (xor v4i32:$vB, v4i32:$vC))), 144>;
+    def : xxevalPattern<v4i32, (vnot (or v4i32:$vA, (xor v4i32:$vB, v4i32:$vC))), 144>;
 
     // OR
     // (or A, B, C)
-    def : xxevalPattern<(or v4i32:$vA, (or v4i32:$vB, v4i32:$vC)), 127>;
+    def : xxevalPattern<v4i32, (or v4i32:$vA, (or v4i32:$vB, v4i32:$vC)), 127>;
     // (or A, (and B, C))
-    def : xxevalPattern<(or v4i32:$vA, (and v4i32:$vB, v4i32:$vC)), 31>;
+    def : xxevalPattern<v4i32, (or v4i32:$vA, (and v4i32:$vB, v4i32:$vC)), 31>;
     // (or A, (eqv B, C))
-    def : xxevalPattern<(or v4i32:$vA, (vnot (xor v4i32:$vB, v4i32:$vC))), 159>;
+    def : xxevalPattern<v4i32, (or v4i32:$vA, (vnot (xor v4i32:$vB, v4i32:$vC))), 159>;
     // (or A, (nand B, C))
-    def : xxevalPattern<(or v4i32:$vA, (vnot (and v4i32:$vB, v4i32:$vC))), 239>;
+    def : xxevalPattern<v4i32, (or v4i32:$vA, (vnot (and v4i32:$vB, v4i32:$vC))), 239>;
     // (or A, (nor B, C))
-    def : xxevalPattern<(or v4i32:$vA, (vnot (or v4i32:$vB, v4i32:$vC))), 143>;
+    def : xxevalPattern<v4i32, (or v4i32:$vA, (vnot (or v4i32:$vB, v4i32:$vC))), 143>;
     // (or A, (xor B, C))
-    def : xxevalPattern<(or v4i32:$vA, (xor v4i32:$vB, v4i32:$vC)), 111>;
+    def : xxevalPattern<v4i32, (or v4i32:$vA, (xor v4i32:$vB, v4i32:$vC)), 111>;
 
     // XOR
     // (xor A, B, C)
-    def : xxevalPattern<(xor v4i32:$vA, (xor v4i32:$vB, v4i32:$vC)), 105>;
+    def : xxevalPattern<v4i32, (xor v4i32:$vA, (xor v4i32:$vB, v4i32:$vC)), 105>;
     // (xor A, (and B, C))
-    def : xxevalPattern<(xor v4i32:$vA, (and v4i32:$vB, v4i32:$vC)), 30>;
+    def : xxevalPattern<v4i32, (xor v4i32:$vA, (and v4i32:$vB, v4i32:$vC)), 30>;
     // (xor A, (or B, C))
-    def : xxevalPattern<(xor v4i32:$vA, (or v4i32:$vB, v4i32:$vC)), 120>;
+    def : xxevalPattern<v4i32, (xor v4i32:$vA, (or v4i32:$vB, v4i32:$vC)), 120>;
+
+    // Ternary operation support with the xxeval instruction.
+    defm : XXEvalVSelectWithXAnd<v4i32, 22>;
+    defm : XXEvalVSelectWithXAnd<v2i64, 22>;
+
+    defm : XXEvalVSelectWithXB<v4i32, 49>;
+    defm : XXEvalVSelectWithXB<v2i64, 49>;
+
+    defm : XXEvalVSelectWithXC<v4i32, 81>;
+    defm : XXEvalVSelectWithXC<v2i64, 81>;
+
+    defm : XXEvalVSelectWithXXor<v4i32, 97>;
+    defm : XXEvalVSelectWithXXor<v2i64, 97>;
 
     // Anonymous patterns to select prefixed VSX loads and stores.
     // Load / Store f128
diff --git a/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-and.ll b/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-and.ll
new file mode 100644
index 0000000000000..b30617469c901
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-and.ll
@@ -0,0 +1,134 @@
+; Test file to verify the emission of xxeval instructions when ternary operators are used.
+
+; RUN: llc -verify-machineinstrs -mcpu=pwr10 -mtriple=powerpc64le-unknown-unknown \
+; RUN:   -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr < %s | FileCheck %s
+
+; RUN: llc -verify-machineinstrs -mcpu=pwr10 -mtriple=powerpc-ibm-aix-xcoff \
+; RUN:   -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr < %s | FileCheck %s
+
+; RUN: llc -verify-machineinstrs -mcpu=pwr10 -mtriple=powerpc64-ibm-aix-xcoff \
+; RUN:   -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr < %s | FileCheck %s
+
+; Function to test ternary(A, xor(B, C), and(B, C)) for <4 x i32>
+; CHECK-LABEL: ternary_A_xor_BC_and_BC_4x32
+; CHECK: xxeval v2, v2, v3, v4, 22
+; CHECK-NEXT: blr
+define dso_local <4 x i32> @ternary_A_xor_BC_and_BC_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %C) local_unnamed_addr #0 {
+entry:
+  %xor = xor <4 x i32> %B, %C
+  %and = and <4 x i32> %B, %C
+  %res = select <4 x i1> %A, <4 x i32> %xor, <4 x i32> %and
+  ret <4 x i32> %res
+}
+
+; Function to test ternary(A, xor(B, C), and(B, C)) for <2 x i64>
+; CHECK-LABEL: ternary_A_xor_BC_and_BC_2x64
+; CHECK: xxeval v2, v2, v3, v4, 22
+; CHECK-NEXT: blr
+define dso_local <2 x i64> @ternary_A_xor_BC_and_BC_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %C) local_unnamed_addr #0 {
+entry:
+  %xor = xor <2 x i64> %B, %C
+  %and = and <2 x i64> %B, %C
+  %res = select <2 x i1> %A, <2 x i64> %xor, <2 x i64> %and
+  ret <2 x i64> %res
+}
+
+; Function to test ternary(A, nor(B, C), and(B, C)) for <4 x i32>
+; CHECK-LABEL: ternary_A_nor_BC_and_BC_4x32
+; CHECK: xxeval v2, v2, v3, v4, 24
+; CHECK-NEXT: blr
+define dso_local <4 x i32> @ternary_A_nor_BC_and_BC_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %C) local_unnamed_addr #0 {
+entry:
+  %or = or <4 x i32> %B, %C
+  %nor = xor <4 x i32> %or, <i32 -1, i32 -1, i32 -1, i32 -1>  ; Vector NOR operation
+  %and = and <4 x i32> %B, %C
+  %res = select <4 x i1> %A, <4 x i32> %nor, <4 x i32> %and
+  ret <4 x i32> %res
+}
+
+; Function to test ternary(A, nor(B, C), and(B, C)) for <2 x i64>
+; CHECK-LABEL: ternary_A_nor_BC_and_BC_2x64
+; CHECK: xxeval v2, v2, v3, v4, 24
+; CHECK-NEXT: blr
+define dso_local <2 x i64> @ternary_A_nor_BC_and_BC_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %C) local_unnamed_addr #0 {
+entry:
+  %or = or <2 x i64> %B, %C
+  %nor = xor <2 x i64> %or, <i64 -1, i64 -1>  ; Vector NOR operation
+  %and = and <2 x i64> %B, %C
+  %res = select <2 x i1> %A, <2 x i64> %nor, <2 x i64> %and
+  ret <2 x i64> %res
+}
+
+; Function to test ternary(A, eqv(B, C), and(B, C)) for <4 x i32>
+; CHECK-LABEL: ternary_A_eqv_BC_and_BC_4x32
+; CHECK: xxeval v2, v2, v3, v4, 25
+; CHECK-NEXT: blr
+define dso_local <4 x i32> @ternary_A_eqv_BC_and_BC_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %C) local_unnamed_addr #0 {
+entry:
+  %xor = xor <4 x i32> %B, %C
+  %eqv = xor <4 x i32> %xor, <i32 -1, i32 -1, i32 -1, i32 -1>  ; Vector eqv operation
+  %and = and <4 x i32> %B, %C
+  %res = select <4 x i1> %A, <4 x i32> %eqv, <4 x i32> %and
+  ret <4 x i32> %res
+}
+
+; Function to test ternary(A, eqv(B, C), and(B, C)) for <2 x i64>
+; CHECK-LABEL: ternary_A_eqv_BC_and_BC_2x64
+; CHECK: xxeval v2, v2, v3, v4, 25
+; CHECK-NEXT: blr
+define dso_local <2 x i64> @ternary_A_eqv_BC_and_BC_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %C) local_unnamed_addr #0 {
+entry:
+  %xor = xor <2 x i64> %B, %C
+  %eqv = xor <2 x i64> %xor, <i64 -1, i64 -1>  ; Vector eqv operation
+  %and = and <2 x i64> %B, %C
+  %res = select <2 x i1> %A, <2 x i64> %eqv, <2 x i64> %and
+  ret <2 x i64> %res
+}
+
+; Function to test ternary(A, not(C), and(B, C)) for <4 x i32>
+; CHECK-LABEL: ternary_A_not_C_and_BC_4x32
+; CHECK: xxeval v2, v2, v3, v4, 26
+; CHECK-NEXT: blr
+define dso_local <4 x i32> @ternary_A_not_C_and_BC_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %C) local_unnamed_addr #0 {
+entry:
+  %not = xor <4 x i32> %C, <i32 -1, i32 -1, i32 -1, i32 -1>  ; Vector not operation
+  %and = and <4 x i32> %B, %C
+  %res = select <4 x i1> %A, <4 x i32> %not, <4 x i32> %and
+  ret <4 x i32> %res
+}
+
+; Function to test ternary(A, not(C), and(B, C)) for <2 x i64>
+; CHECK-LABEL: ternary_A_not_C_and_BC_2x64
+; CHECK: xxeval v2, v2, v3, v4, 26
+; CHECK-NEXT: blr
+define dso_local <2 x i64> @ternary_A_not_C_and_BC_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %C) local_unnamed_addr #0 {
+entry:
+  %not = xor <2 x i64> %C, <i64 -1, i64 -1>  ; Vector not operation
+  %and = and <2 x i64> %B, %C
+  %res = select <2 x i1> %A, <2 x i64> %not, <2 x i64> %and
+  ret <2 x i64> %res
+}
+
+; Function to test ternary(A, not(B), and(B, C)) for <4 x i32>
+; CHECK-LABEL: ternary_A_not_B_and_BC_4x32
+; CHECK: xxeval v2, v2, v3, v4, 28
+; CHECK-NEXT: blr
+define dso_local <4 x i32> @ternary_A_not_B_and_BC_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %C) local_unnamed_addr #0 {
+entry:
+  %not = xor <4 x i32> %B, <i32 -1, i32 -1, i32 -1, i32 -1>  ; Vector not operation
+  %and = and <4 x i32> %B, %C
+  %res = select <4 x i1> %A, <4 x i32> %not, <4 x i32> %and
+  ret <4 x i32> %res
+}
+
+; Function to test ternary(A, not(B), and(B, C)) for <2 x i64>
+; CHECK-LABEL: ternary_A_not_B_and_BC_2x64
+; CHECK: xxeval v2, v2, v3, v4, 28
+; CHECK-NEXT: blr
+define dso_local <2 x i64> @ternary_A_not_B_and_BC_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %C) local_unnamed_addr #0 {
+entry:
+  %not = xor <2 x i64> %B, <i64 -1, i64 -1>  ; Vector not operation
+  %and = and <2 x i64> %B, %C
+  %res = select <2 x i1> %A, <2 x i64> %not, <2 x i64> %and
+  ret <2 x i64> %res
+}
\ No newline at end of file
diff --git a/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-b.ll b/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-b.ll
new file mode 100644
index 0000000000000..7ab5fa7d62688
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-b.ll
@@ -0,0 +1,126 @@
+; Test file to verify the emission of xxeval instructions when ternary operators are used.
+
+; RUN: llc -verify-machineinstrs -mcpu=pwr10 -mtriple=powerpc64le-unknown-unknown \
+; RUN:   -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr < %s | FileCheck %s
+
+; RUN: llc -verify-machineinstrs -mcpu=pwr10 -mtriple=powerpc-ibm-aix-xcoff \
+; RUN:   -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr < %s | FileCheck %s
+
+; RUN: llc -verify-machineinstrs -mcpu=pwr10 -mtriple=powerpc64-ibm-aix-xcoff \
+; RUN:   -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr < %s | FileCheck %s
+
+; Function to test ternary(A, and(B, C), B) for <4 x i32>
+; CHECK-LABEL: ternary_A_and_BC_B_4x32
+; CHECK: xxeval v2, v2, v3, v4, 49
+; CHECK-NEXT: blr
+define dso_local <4 x i32> @ternary_A_and_BC_B_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %C) local_unnamed_addr #0 {
+entry:
+  %and = and <4 x i32> %B, %C
+  %res = select <4 x i1> %A, <4 x i32> %and, <4 x i32> %B
+  ret <4 x i32> %res
+}
+
+; Function to test ternary(A, and(B, C), B) for <2 x i64>
+; CHECK-LABEL: ternary_A_and_BC_B_2x64
+; CHECK: xxeval v2, v2, v3, v4, 49
+; CHECK-NEXT: blr
+define dso_local <2 x i64> @ternary_A_and_BC_B_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %C) local_unnamed_addr #0 {
+entry:
+  %and = and <2 x i64> %B, %C
+  %res = select <2 x i1> %A, <2 x i64> %and, <2 x i64> %B
+  ret <2 x i64> %res
+}
+
+; Function to test ternary(A, nor(B, C), B) for <4 x i32>
+; CHECK-LABEL: ternary_A_...
[truncated]
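
The baseImm offsets used by the four multiclasses in PPCInstrP10.td can be sanity-checked against the imm table from the description. The sketch below is a plain Python model of that arithmetic, not TableGen. The offsets are copied from the `!add(baseImm, N)` expressions in the patch, while the dictionary and function names are hypothetical. It also assumes that the low nibble of the imm is the (B, C) truth table of the operand selected when A is set.

```python
# Offsets copied from the !add(baseImm, N) expressions in each multiclass.
MULTICLASS_OFFSETS = {
    "XXEvalVSelectWithXAnd": {"xor": 0, "nor": 2, "eqv": 3, "not(C)": 4, "not(B)": 6},
    "XXEvalVSelectWithXB": {"and": 0, "nor": 7, "eqv": 8, "not(C)": 9, "nand": 13},
    "XXEvalVSelectWithXC": {"and": 0, "nor": 7, "eqv": 8, "nand": 13},
    "XXEvalVSelectWithXXor": {"and": 0, "B": 2, "C": 4, "or": 6, "nor": 7},
}

# Base immediates passed at the defm sites.
BASE_IMM = {
    "XXEvalVSelectWithXAnd": 22,
    "XXEvalVSelectWithXB": 49,
    "XXEvalVSelectWithXC": 81,
    "XXEvalVSelectWithXXor": 97,
}

# Assumed 4-bit truth tables over (B, C) for the X operand, MSB first.
X_NIBBLE = {
    "and": 0b0001, "xor": 0b0110, "or": 0b0111, "nor": 0b1000,
    "eqv": 0b1001, "nand": 0b1110, "B": 0b0011, "C": 0b0101,
    "not(B)": 0b1100, "not(C)": 0b1010,
}


def expand(name):
    """Immediates one multiclass instantiation would produce, keyed by X."""
    base = BASE_IMM[name]
    return {x: base + off for x, off in MULTICLASS_OFFSETS[name].items()}


def offsets_consistent(name):
    """True if each offset only changes the low nibble to X's truth table."""
    base = BASE_IMM[name]
    return all(
        imm >> 4 == base >> 4 and imm & 0xF == X_NIBBLE[x]
        for x, imm in expand(name).items()
    )


for name in MULTICLASS_OFFSETS:
    print(name, expand(name), offsets_consistent(name))
```

Seen this way, each offset is the difference between the truth table of X and that of the first X in the group, which is why the same offsets (7, 8, 13) recur in the XB and XC multiclasses.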

@tonykuttai
Contributor Author

@tonykuttai tonykuttai changed the title from "[P10][XXEVAL] Exploit xxeval instruction for cases of the ternary(A,X, and(B,C)), ternary(A,X,B), ternary(A,X,C), ternary(A,X,xor(B,C)) forms." to "[PowerPC10][XXEVAL] Exploit xxeval instruction for cases of the ternary(A,X, and(B,C)), ternary(A,X,B), ternary(A,X,C), ternary(A,X,xor(B,C)) forms." on May 28, 2025
@tonykuttai tonykuttai force-pushed the tvarghese/xxeval branch 5 times, most recently from fcebb02 to 604d89b Compare June 3, 2025 16:23
@tonykuttai tonykuttai requested a review from lei137 June 3, 2025 16:25
@tonykuttai
Contributor Author

tonykuttai commented Jun 3, 2025

Contributor

@lei137 lei137 left a comment

Unsure about all the class def that just contain a dag def.
Maybe we can have 1 XXEvalUnaryOp class with the different dag defined with descriptive names and have all the xxevalpattern classes derive from that and use the appropriate dag from there as needed?

@tonykuttai tonykuttai requested a review from lei137 June 11, 2025 08:56
@tonykuttai tonykuttai changed the title from "[PowerPC10][XXEVAL] Exploit xxeval instruction for cases of the ternary(A,X, and(B,C)), ternary(A,X,B), ternary(A,X,C), ternary(A,X,xor(B,C)) forms." to "[PowerPC][XXEVAL] Exploit xxeval instruction for cases of the ternary(A,X, and(B,C)), ternary(A,X,B), ternary(A,X,C), ternary(A,X,xor(B,C)) forms." on Jun 11, 2025
@tonykuttai
Contributor Author

Unsure about all the class def that just contain a dag def. Maybe we can have 1 XXEvalUnaryOp class with the different dag defined with descriptive names and have all the xxevalpattern classes derive from that and use the appropriate dag from there as needed?

Two classes are now defined, one for the unary patterns and one for the binary patterns:

  • XXEvalUnaryPattern
  • XXEvalBinaryPattern

A simple dag such as vt:$vB or vt:$vC is easier to write without a wrapper class, so I have excluded those from XXEvalUnaryPattern.

@tonykuttai tonykuttai changed the title from "[PowerPC][XXEVAL] Exploit xxeval instruction for cases of the ternary(A,X, and(B,C)), ternary(A,X,B), ternary(A,X,C), ternary(A,X,xor(B,C)) forms." to "[PowerPC] Exploit xxeval instruction for ternary patterns - part 1" on Jun 12, 2025
@tonykuttai tonykuttai force-pushed the tvarghese/xxeval branch 2 times, most recently from 9921b8c to 3ebc442 Compare June 12, 2025 05:29
@tonykuttai
Contributor Author

Rebased the PR onto the latest main branch (the NFC pre-commit patches are already merged).

@tonykuttai tonykuttai force-pushed the tvarghese/xxeval branch 2 times, most recently from 51c2fda to 4c35331 Compare July 21, 2025 16:44
@tonykuttai tonykuttai requested review from amy-kwan, lei137 and kamaub July 21, 2025 16:58
@tonykuttai tonykuttai force-pushed the tvarghese/xxeval branch 5 times, most recently from 9653653 to 3e7895b Compare July 21, 2025 18:46
@tonykuttai tonykuttai requested a review from lei137 July 22, 2025 18:31
@tonykuttai
Contributor Author

Do we need to add a clang test for this?

Input file t.cpp

#include <altivec.h>

// Function to test ternary(A, xor(B, C), and(B, C)) for <16 x i8>
vector unsigned char ternary_A_xor_BC_and_BC_16x8(vector bool char a, vector unsigned char b, vector unsigned char c) {
    // Use Clang's ternary operator on vectors - this should generate vselect
    vector unsigned char xor_bc = vec_xor(b, c);
    vector unsigned char and_bc = vec_and(b, c);
    
    // Convert bool vector to mask and use ternary operator
    return a ? and_bc : xor_bc;
}

$LLVM_BUILD/bin/clang++ -mcpu=pwr10 -maltivec t.cpp -O3 -S -o t.s

This generates the desired asm code:

	.abiversion 2
	.file	"xor-and.cpp"
	.text
	.globl	_Z28ternary_A_xor_BC_and_BC_16x8Dv16_bDv16_hS0_ # -- Begin function _Z28ternary_A_xor_BC_and_BC_16x8Dv16_bDv16_hS0_
	.p2align	4
	.type	_Z28ternary_A_xor_BC_and_BC_16x8Dv16_bDv16_hS0_,@function
_Z28ternary_A_xor_BC_and_BC_16x8Dv16_bDv16_hS0_: # @_Z28ternary_A_xor_BC_and_BC_16x8Dv16_bDv16_hS0_
.Lfunc_begin0:
	.cfi_startproc
# %bb.0:                                # %entry
	xxlxor 37, 37, 37
	vcmpequb 2, 2, 5
	xxeval 34, 34, 36, 35, 22   # <== xxeval matching the ternary operation
	blr
	.long	0
	.quad	0
.Lfunc_end0:
	.size	_Z28ternary_A_xor_BC_and_BC_16x8Dv16_bDv16_hS0_, .Lfunc_end0-.Lfunc_begin0
	.cfi_endproc
                                        # -- End function
	.ident	"clang version 21.0.0git"
	.section	".note.GNU-stack","",@progbits
	.addrsig

Note:

  • The source uses return a ? and_bc : xor_bc;, which is inverted relative to the a ? xor_bc : and_bc form that this change supports.
  • This is because the IR generated for return a ? xor_bc : and_bc; has the select inverted, as shown:
define dso_local noundef <16 x i8> @_Z28ternary_A_xor_BC_and_BC_16x8Dv16_bDv16_hS0_(<16 x i8> noundef %a, <16 x i8> noundef %b, <16 x i8> noundef %c) local_unnamed_addr #0 {
entry:
  %xor.i = xor <16 x i8> %c, %b
  %and.i = and <16 x i8> %c, %b
  %vector_cond.not = icmp eq <16 x i8> %a, zeroinitializer
  %vector_select = select <16 x i1> %vector_cond.not, <16 x i8> %and.i, <16 x i8> %xor.i
  ret <16 x i8> %vector_select
}
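
To see why the inverted source form still lands on imm 22, the asm sequence and the selects can be modelled bit by bit. This is a rough Python model only. It assumes the MSB-first truth-table reading of the xxeval immediate and element-wise `a != 0` semantics for the vector ternary, as the icmp in the IR suggests. The helpers `xxeval`, `asm_model`, `source_model` and `quoted_ir_model` are illustrative, and the byte values are arbitrary test data rather than anything from the PR.

```python
def xxeval(a, b, c, imm, width=8):
    """Bitwise model of xxeval on width-bit integers (assumed semantics)."""
    result = 0
    for bit in range(width):
        index = ((((a >> bit) & 1) << 2)
                 | (((b >> bit) & 1) << 1)
                 | ((c >> bit) & 1))
        result |= ((imm >> (7 - index)) & 1) << bit
    return result


def zero_mask(x):
    """Model of vcmpequb against zero: all ones where the byte is zero."""
    return 0xFF if x == 0 else 0x00


def asm_model(a, b, c):
    """Model of the asm above: compare with zero, then xxeval with imm 22."""
    # Operands follow the asm order: mask, c, b.
    return [xxeval(zero_mask(x), z, y, 22) for x, y, z in zip(a, b, c)]


def source_model(a, b, c):
    """Element-wise reading of `a ? and_bc : xor_bc` from t.cpp."""
    return [(y & z) if x != 0 else (y ^ z) for x, y, z in zip(a, b, c)]


def quoted_ir_model(a, b, c):
    """Element-wise reading of the quoted IR: select(a == 0, and, xor)."""
    return [(y & z) if x == 0 else (y ^ z) for x, y, z in zip(a, b, c)]


a = [0x00, 0xFF, 0x00, 0xFF]
b = [0x0F, 0x0F, 0xA5, 0xA5]
c = [0x33, 0x33, 0x5A, 0x5A]

print(asm_model(a, b, c) == source_model(a, b, c))

# With the same zero mask, the quoted IR has the shape ternary(mask, and, xor),
# which is listed under imm 97 in the table from the original description.
with_97 = [xxeval(zero_mask(x), y, z, 97) for x, y, z in zip(a, b, c)]
print(with_97 == quoted_ir_model(a, b, c))
```

In this model the compare against zero inverts the condition, so `a ? and_bc : xor_bc` in the source becomes a select of the ternary(mask, xor, and) shape, which is the imm 22 pattern.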

@lei137
Contributor

lei137 commented Jul 23, 2025

No need for clang test as we are not adding any clang code.
Please update your description to be more concise. Remember your PR description will be part of git log and we are only looking for summary of the patch. Details of implementation should either be in the associated issue or if there is none as comments to this PR instead. Thx!

@tonykuttai tonykuttai changed the title from "[PowerPC] Exploit xxeval instruction for ternary patterns - part 1" to "[PowerPC] Exploit xxeval instruction for ternary patterns - ternary(A, X, and(B,C))" on Jul 24, 2025
@tonykuttai
Contributor Author

No need for clang test as we are not adding any clang code. Please update your description to be more concise. Remember your PR description will be part of git log and we are only looking for summary of the patch. Details of implementation should either be in the associated issue or if there is none as comments to this PR instead. Thx!

Noted. Updated the description. Thanx.

@lei137
Contributor

lei137 commented Jul 24, 2025

I took the liberty of adding a few more tweaks to your description. Hope that is okay.

@tonykuttai tonykuttai force-pushed the tvarghese/xxeval branch 5 times, most recently from 8055f9b to 53a9874 Compare July 25, 2025 17:17
Contributor

@lei137 lei137 left a comment

LGTM
Thank-you for refactoring!

@tonykuttai tonykuttai merged commit 59c3fe6 into llvm:main Jul 29, 2025
9 checks passed
7 participants