[InstSimplify] Generalize `simplifyAndOrOfFCmps` to handle fabs #116590

dtcxzyw · 2024-11-18T09:11:47Z

This patch generalizes #81027 to handle pattern and/or (fcmp ord/uno X, 0), (fcmp pred fabs(X), Y).
Alive2: https://alive2.llvm.org/ce/z/tsgUrz
The correctness is straightforward because fcmp ord/uno X, 0.0 is equivalent to fcmp ord/uno fabs(X), 0.0. We may generalize it to handle fneg as well.

Address comment #116065 (review)

llvmbot · 2024-11-18T09:12:23Z

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-analysis

Author: Yingwei Zheng (dtcxzyw)

Changes

This patch generalizes #81027 to handle pattern and/or (fcmp ord/uno X, 0), (fcmp pred fabs(X), Y).
Alive2: https://alive2.llvm.org/ce/z/tsgUrz
The correctness is straightforward because fcmp ord/uno X, 0.0 is equivalent to fcmp ord/uno fabs(X), 0.0. We may generalize it to handle fneg as well.

Full diff: https://github.com/llvm/llvm-project/pull/116590.diff

3 Files Affected:

(modified) llvm/lib/Analysis/InstructionSimplify.cpp (+14-10)
(modified) llvm/test/Transforms/InstCombine/create-class-from-logic-fcmp.ll (+4-2)
(modified) llvm/test/Transforms/InstSimplify/logic-of-fcmps.ll (+65)

diff --git a/llvm/lib/Analysis/InstructionSimplify.cpp b/llvm/lib/Analysis/InstructionSimplify.cpp
index 93b601b22c3a39..01b0a089aab718 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -1857,27 +1857,31 @@ static Value *simplifyAndOrOfFCmps(const SimplifyQuery &Q, FCmpInst *LHS,
     return nullptr;
 
   FCmpInst::Predicate PredL = LHS->getPredicate(), PredR = RHS->getPredicate();
+  auto AbsOrSelfLHS0 = m_CombineOr(m_Specific(LHS0), m_FAbs(m_Specific(LHS0)));
   if ((PredL == FCmpInst::FCMP_ORD || PredL == FCmpInst::FCMP_UNO) &&
       ((FCmpInst::isOrdered(PredR) && IsAnd) ||
        (FCmpInst::isUnordered(PredR) && !IsAnd))) {
-    // (fcmp ord X, 0) & (fcmp o** X, Y) --> fcmp o** X, Y
-    // (fcmp uno X, 0) & (fcmp o** X, Y) --> false
-    // (fcmp uno X, 0) | (fcmp u** X, Y) --> fcmp u** X, Y
-    // (fcmp ord X, 0) | (fcmp u** X, Y) --> true
-    if ((LHS0 == RHS0 || LHS0 == RHS1) && match(LHS1, m_PosZeroFP()))
+    // (fcmp ord X, 0) & (fcmp o** X/abs(X), Y) --> fcmp o** X/abs(X), Y
+    // (fcmp uno X, 0) & (fcmp o** X/abs(X), Y) --> false
+    // (fcmp uno X, 0) | (fcmp u** X/abs(X), Y) --> fcmp u** X/abs(X), Y
+    // (fcmp ord X, 0) | (fcmp u** X/abs(X), Y) --> true
+    if ((match(RHS0, AbsOrSelfLHS0) || match(RHS1, AbsOrSelfLHS0)) &&
+        match(LHS1, m_PosZeroFP()))
       return FCmpInst::isOrdered(PredL) == FCmpInst::isOrdered(PredR)
                  ? static_cast<Value *>(RHS)
                  : ConstantInt::getBool(LHS->getType(), !IsAnd);
   }
 
+  auto AbsOrSelfRHS0 = m_CombineOr(m_Specific(RHS0), m_FAbs(m_Specific(RHS0)));
   if ((PredR == FCmpInst::FCMP_ORD || PredR == FCmpInst::FCMP_UNO) &&
       ((FCmpInst::isOrdered(PredL) && IsAnd) ||
        (FCmpInst::isUnordered(PredL) && !IsAnd))) {
-    // (fcmp o** X, Y) & (fcmp ord X, 0) --> fcmp o** X, Y
-    // (fcmp o** X, Y) & (fcmp uno X, 0) --> false
-    // (fcmp u** X, Y) | (fcmp uno X, 0) --> fcmp u** X, Y
-    // (fcmp u** X, Y) | (fcmp ord X, 0) --> true
-    if ((RHS0 == LHS0 || RHS0 == LHS1) && match(RHS1, m_PosZeroFP()))
+    // (fcmp o** X/abs(X), Y) & (fcmp ord X, 0) --> fcmp o** X/abs(X), Y
+    // (fcmp o** X/abs(X), Y) & (fcmp uno X, 0) --> false
+    // (fcmp u** X/abs(X), Y) | (fcmp uno X, 0) --> fcmp u** X/abs(X), Y
+    // (fcmp u** X/abs(X), Y) | (fcmp ord X, 0) --> true
+    if ((match(LHS0, AbsOrSelfRHS0) || match(LHS1, AbsOrSelfRHS0)) &&
+        match(RHS1, m_PosZeroFP()))
       return FCmpInst::isOrdered(PredL) == FCmpInst::isOrdered(PredR)
                  ? static_cast<Value *>(LHS)
                  : ConstantInt::getBool(LHS->getType(), !IsAnd);
diff --git a/llvm/test/Transforms/InstCombine/create-class-from-logic-fcmp.ll b/llvm/test/Transforms/InstCombine/create-class-from-logic-fcmp.ll
index 74a1e318d77ede..9a723e8bc89ff5 100644
--- a/llvm/test/Transforms/InstCombine/create-class-from-logic-fcmp.ll
+++ b/llvm/test/Transforms/InstCombine/create-class-from-logic-fcmp.ll
@@ -990,7 +990,8 @@ define i1 @not_isnormalinf_or_inf(half %x) #0 {
 ; -> subnormal | zero | nan
 define i1 @not_isnormalinf_or_uno(half %x) #0 {
 ; CHECK-LABEL: @not_isnormalinf_or_uno(
-; CHECK-NEXT:    [[OR:%.*]] = call i1 @llvm.is.fpclass.f16(half [[X:%.*]], i32 243)
+; CHECK-NEXT:    [[FABS:%.*]] = call half @llvm.fabs.f16(half [[X:%.*]])
+; CHECK-NEXT:    [[OR:%.*]] = fcmp ult half [[FABS]], 0xH0400
 ; CHECK-NEXT:    ret i1 [[OR]]
 ;
   %fabs = call half @llvm.fabs.f16(half %x)
@@ -1003,7 +1004,8 @@ define i1 @not_isnormalinf_or_uno(half %x) #0 {
 ; -> subnormal | zero | nan
 define i1 @not_isnormalinf_or_uno_nofabs(half %x) #0 {
 ; CHECK-LABEL: @not_isnormalinf_or_uno_nofabs(
-; CHECK-NEXT:    [[OR:%.*]] = call i1 @llvm.is.fpclass.f16(half [[X:%.*]], i32 243)
+; CHECK-NEXT:    [[FABS:%.*]] = call half @llvm.fabs.f16(half [[X:%.*]])
+; CHECK-NEXT:    [[OR:%.*]] = fcmp ult half [[FABS]], 0xH0400
 ; CHECK-NEXT:    ret i1 [[OR]]
 ;
   %fabs = call half @llvm.fabs.f16(half %x)
diff --git a/llvm/test/Transforms/InstSimplify/logic-of-fcmps.ll b/llvm/test/Transforms/InstSimplify/logic-of-fcmps.ll
index 3a8bf53b32cab0..6aa0adb2f0e67b 100644
--- a/llvm/test/Transforms/InstSimplify/logic-of-fcmps.ll
+++ b/llvm/test/Transforms/InstSimplify/logic-of-fcmps.ll
@@ -426,3 +426,68 @@ define i1 @olt_implies_olt_fail(float %x, float %y) {
   %ret = and i1 %olt, %olt2
   ret i1 %ret
 }
+
+define i1 @and_ord_olt_abs(float %x, float %y) {
+; CHECK-LABEL: @and_ord_olt_abs(
+; CHECK-NEXT:    [[ABSX:%.*]] = call float @llvm.fabs.f32(float [[X:%.*]])
+; CHECK-NEXT:    [[CMP2:%.*]] = fcmp olt float [[ABSX]], [[Y:%.*]]
+; CHECK-NEXT:    ret i1 [[CMP2]]
+;
+  %cmp1 = fcmp ord float %x, 0.000000e+00
+  %absx = call float @llvm.fabs.f32(float %x)
+  %cmp2 = fcmp olt float %absx, %y
+  %and = and i1 %cmp1, %cmp2
+  ret i1 %and
+}
+
+define i1 @and_ord_olt_abs_commuted1(float %x, float %y) {
+; CHECK-LABEL: @and_ord_olt_abs_commuted1(
+; CHECK-NEXT:    [[ABSX:%.*]] = call float @llvm.fabs.f32(float [[X:%.*]])
+; CHECK-NEXT:    [[CMP2:%.*]] = fcmp olt float [[Y:%.*]], [[ABSX]]
+; CHECK-NEXT:    ret i1 [[CMP2]]
+;
+  %cmp1 = fcmp ord float %x, 0.000000e+00
+  %absx = call float @llvm.fabs.f32(float %x)
+  %cmp2 = fcmp olt float %y, %absx
+  %and = and i1 %cmp1, %cmp2
+  ret i1 %and
+}
+
+define i1 @and_ord_olt_abs_commuted2(float %x, float %y) {
+; CHECK-LABEL: @and_ord_olt_abs_commuted2(
+; CHECK-NEXT:    [[ABSX:%.*]] = call float @llvm.fabs.f32(float [[X:%.*]])
+; CHECK-NEXT:    [[CMP2:%.*]] = fcmp olt float [[ABSX]], [[Y:%.*]]
+; CHECK-NEXT:    ret i1 [[CMP2]]
+;
+  %cmp1 = fcmp ord float %x, 0.000000e+00
+  %absx = call float @llvm.fabs.f32(float %x)
+  %cmp2 = fcmp olt float %absx, %y
+  %and = and i1 %cmp2, %cmp1
+  ret i1 %and
+}
+
+define i1 @or_ord_ult_abs(float %x, float %y) {
+; CHECK-LABEL: @or_ord_ult_abs(
+; CHECK-NEXT:    ret i1 true
+;
+  %cmp1 = fcmp ord float %x, 0.000000e+00
+  %absx = call float @llvm.fabs.f32(float %x)
+  %cmp2 = fcmp ult float %absx, %y
+  %or = or i1 %cmp1, %cmp2
+  ret i1 %or
+}
+
+define i1 @and_ord_olt_absz(float %x, float %y, float %z) {
+; CHECK-LABEL: @and_ord_olt_absz(
+; CHECK-NEXT:    [[CMP1:%.*]] = fcmp ord float [[X:%.*]], 0.000000e+00
+; CHECK-NEXT:    [[ABSZ:%.*]] = call float @llvm.fabs.f32(float [[Z:%.*]])
+; CHECK-NEXT:    [[CMP2:%.*]] = fcmp olt float [[ABSZ]], [[Y:%.*]]
+; CHECK-NEXT:    [[AND:%.*]] = and i1 [[CMP1]], [[CMP2]]
+; CHECK-NEXT:    ret i1 [[AND]]
+;
+  %cmp1 = fcmp ord float %x, 0.000000e+00
+  %absz = call float @llvm.fabs.f32(float %z)
+  %cmp2 = fcmp olt float %absz, %y
+  %and = and i1 %cmp1, %cmp2
+  ret i1 %and
+}

nikic

CodeGen/AMDGPU/fp-classify.ll fails

nikic

LGTM

dtcxzyw added 2 commits November 18, 2024 16:34

[InstSimplify] Add pre-commit tests. NFC.

c83445c

[InstSimplify] Generalize simplifyAndOrOfFCmps to handle fabs

46b91e5

dtcxzyw requested review from arsenm and goldsteinn November 18, 2024 09:11

dtcxzyw requested a review from nikic as a code owner November 18, 2024 09:11

llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Nov 18, 2024

This was referenced Nov 18, 2024

Task submission dtcxzyw/llvm-opt-benchmark#1312

Open

pre-commit: PR116590 dtcxzyw/llvm-opt-benchmark#1709

Closed

nikic reviewed Nov 18, 2024

View reviewed changes

[AMDGPU] Fix tests. NFC.

4c0062d

llvmbot added the backend:AMDGPU label Nov 18, 2024

nikic approved these changes Nov 18, 2024

View reviewed changes

nikic added the floating-point Floating-point math label Nov 18, 2024

dtcxzyw merged commit 42ed775 into llvm:main Nov 19, 2024
8 of 10 checks passed

dtcxzyw deleted the perf/simplify-andor-fcmp-fabs branch November 19, 2024 12:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[InstSimplify] Generalize `simplifyAndOrOfFCmps` to handle fabs #116590

[InstSimplify] Generalize `simplifyAndOrOfFCmps` to handle fabs #116590

dtcxzyw commented Nov 18, 2024 •

edited

Loading

Uh oh!

llvmbot commented Nov 18, 2024 •

edited

Loading

Uh oh!

nikic left a comment

Uh oh!

nikic left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

[InstSimplify] Generalize simplifyAndOrOfFCmps to handle fabs #116590

[InstSimplify] Generalize simplifyAndOrOfFCmps to handle fabs #116590

Conversation

dtcxzyw commented Nov 18, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Nov 18, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nikic left a comment

Choose a reason for hiding this comment

Uh oh!

nikic left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

[InstSimplify] Generalize `simplifyAndOrOfFCmps` to handle fabs #116590

[InstSimplify] Generalize `simplifyAndOrOfFCmps` to handle fabs #116590

dtcxzyw commented Nov 18, 2024 •

edited

Loading

llvmbot commented Nov 18, 2024 •

edited

Loading