Skip to content

[InstSimplify] Generalize simplifyAndOrOfFCmps to handle fabs #116590

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Nov 19, 2024

Conversation

dtcxzyw
Copy link
Member

@dtcxzyw dtcxzyw commented Nov 18, 2024

This patch generalizes #81027 to handle pattern and/or (fcmp ord/uno X, 0), (fcmp pred fabs(X), Y).
Alive2: https://alive2.llvm.org/ce/z/tsgUrz
The correctness is straightforward because fcmp ord/uno X, 0.0 is equivalent to fcmp ord/uno fabs(X), 0.0. We may generalize it to handle fneg as well.

Address comment #116065 (review)

@llvmbot
Copy link
Member

llvmbot commented Nov 18, 2024

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-analysis

Author: Yingwei Zheng (dtcxzyw)

Changes

This patch generalizes #81027 to handle pattern and/or (fcmp ord/uno X, 0), (fcmp pred fabs(X), Y).
Alive2: https://alive2.llvm.org/ce/z/tsgUrz
The correctness is straightforward because fcmp ord/uno X, 0.0 is equivalent to fcmp ord/uno fabs(X), 0.0. We may generalize it to handle fneg as well.


Full diff: https://github.com/llvm/llvm-project/pull/116590.diff

3 Files Affected:

  • (modified) llvm/lib/Analysis/InstructionSimplify.cpp (+14-10)
  • (modified) llvm/test/Transforms/InstCombine/create-class-from-logic-fcmp.ll (+4-2)
  • (modified) llvm/test/Transforms/InstSimplify/logic-of-fcmps.ll (+65)
diff --git a/llvm/lib/Analysis/InstructionSimplify.cpp b/llvm/lib/Analysis/InstructionSimplify.cpp
index 93b601b22c3a39..01b0a089aab718 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -1857,27 +1857,31 @@ static Value *simplifyAndOrOfFCmps(const SimplifyQuery &Q, FCmpInst *LHS,
     return nullptr;
 
   FCmpInst::Predicate PredL = LHS->getPredicate(), PredR = RHS->getPredicate();
+  auto AbsOrSelfLHS0 = m_CombineOr(m_Specific(LHS0), m_FAbs(m_Specific(LHS0)));
   if ((PredL == FCmpInst::FCMP_ORD || PredL == FCmpInst::FCMP_UNO) &&
       ((FCmpInst::isOrdered(PredR) && IsAnd) ||
        (FCmpInst::isUnordered(PredR) && !IsAnd))) {
-    // (fcmp ord X, 0) & (fcmp o** X, Y) --> fcmp o** X, Y
-    // (fcmp uno X, 0) & (fcmp o** X, Y) --> false
-    // (fcmp uno X, 0) | (fcmp u** X, Y) --> fcmp u** X, Y
-    // (fcmp ord X, 0) | (fcmp u** X, Y) --> true
-    if ((LHS0 == RHS0 || LHS0 == RHS1) && match(LHS1, m_PosZeroFP()))
+    // (fcmp ord X, 0) & (fcmp o** X/abs(X), Y) --> fcmp o** X/abs(X), Y
+    // (fcmp uno X, 0) & (fcmp o** X/abs(X), Y) --> false
+    // (fcmp uno X, 0) | (fcmp u** X/abs(X), Y) --> fcmp u** X/abs(X), Y
+    // (fcmp ord X, 0) | (fcmp u** X/abs(X), Y) --> true
+    if ((match(RHS0, AbsOrSelfLHS0) || match(RHS1, AbsOrSelfLHS0)) &&
+        match(LHS1, m_PosZeroFP()))
       return FCmpInst::isOrdered(PredL) == FCmpInst::isOrdered(PredR)
                  ? static_cast<Value *>(RHS)
                  : ConstantInt::getBool(LHS->getType(), !IsAnd);
   }
 
+  auto AbsOrSelfRHS0 = m_CombineOr(m_Specific(RHS0), m_FAbs(m_Specific(RHS0)));
   if ((PredR == FCmpInst::FCMP_ORD || PredR == FCmpInst::FCMP_UNO) &&
       ((FCmpInst::isOrdered(PredL) && IsAnd) ||
        (FCmpInst::isUnordered(PredL) && !IsAnd))) {
-    // (fcmp o** X, Y) & (fcmp ord X, 0) --> fcmp o** X, Y
-    // (fcmp o** X, Y) & (fcmp uno X, 0) --> false
-    // (fcmp u** X, Y) | (fcmp uno X, 0) --> fcmp u** X, Y
-    // (fcmp u** X, Y) | (fcmp ord X, 0) --> true
-    if ((RHS0 == LHS0 || RHS0 == LHS1) && match(RHS1, m_PosZeroFP()))
+    // (fcmp o** X/abs(X), Y) & (fcmp ord X, 0) --> fcmp o** X/abs(X), Y
+    // (fcmp o** X/abs(X), Y) & (fcmp uno X, 0) --> false
+    // (fcmp u** X/abs(X), Y) | (fcmp uno X, 0) --> fcmp u** X/abs(X), Y
+    // (fcmp u** X/abs(X), Y) | (fcmp ord X, 0) --> true
+    if ((match(LHS0, AbsOrSelfRHS0) || match(LHS1, AbsOrSelfRHS0)) &&
+        match(RHS1, m_PosZeroFP()))
       return FCmpInst::isOrdered(PredL) == FCmpInst::isOrdered(PredR)
                  ? static_cast<Value *>(LHS)
                  : ConstantInt::getBool(LHS->getType(), !IsAnd);
diff --git a/llvm/test/Transforms/InstCombine/create-class-from-logic-fcmp.ll b/llvm/test/Transforms/InstCombine/create-class-from-logic-fcmp.ll
index 74a1e318d77ede..9a723e8bc89ff5 100644
--- a/llvm/test/Transforms/InstCombine/create-class-from-logic-fcmp.ll
+++ b/llvm/test/Transforms/InstCombine/create-class-from-logic-fcmp.ll
@@ -990,7 +990,8 @@ define i1 @not_isnormalinf_or_inf(half %x) #0 {
 ; -> subnormal | zero | nan
 define i1 @not_isnormalinf_or_uno(half %x) #0 {
 ; CHECK-LABEL: @not_isnormalinf_or_uno(
-; CHECK-NEXT:    [[OR:%.*]] = call i1 @llvm.is.fpclass.f16(half [[X:%.*]], i32 243)
+; CHECK-NEXT:    [[FABS:%.*]] = call half @llvm.fabs.f16(half [[X:%.*]])
+; CHECK-NEXT:    [[OR:%.*]] = fcmp ult half [[FABS]], 0xH0400
 ; CHECK-NEXT:    ret i1 [[OR]]
 ;
   %fabs = call half @llvm.fabs.f16(half %x)
@@ -1003,7 +1004,8 @@ define i1 @not_isnormalinf_or_uno(half %x) #0 {
 ; -> subnormal | zero | nan
 define i1 @not_isnormalinf_or_uno_nofabs(half %x) #0 {
 ; CHECK-LABEL: @not_isnormalinf_or_uno_nofabs(
-; CHECK-NEXT:    [[OR:%.*]] = call i1 @llvm.is.fpclass.f16(half [[X:%.*]], i32 243)
+; CHECK-NEXT:    [[FABS:%.*]] = call half @llvm.fabs.f16(half [[X:%.*]])
+; CHECK-NEXT:    [[OR:%.*]] = fcmp ult half [[FABS]], 0xH0400
 ; CHECK-NEXT:    ret i1 [[OR]]
 ;
   %fabs = call half @llvm.fabs.f16(half %x)
diff --git a/llvm/test/Transforms/InstSimplify/logic-of-fcmps.ll b/llvm/test/Transforms/InstSimplify/logic-of-fcmps.ll
index 3a8bf53b32cab0..6aa0adb2f0e67b 100644
--- a/llvm/test/Transforms/InstSimplify/logic-of-fcmps.ll
+++ b/llvm/test/Transforms/InstSimplify/logic-of-fcmps.ll
@@ -426,3 +426,68 @@ define i1 @olt_implies_olt_fail(float %x, float %y) {
   %ret = and i1 %olt, %olt2
   ret i1 %ret
 }
+
+define i1 @and_ord_olt_abs(float %x, float %y) {
+; CHECK-LABEL: @and_ord_olt_abs(
+; CHECK-NEXT:    [[ABSX:%.*]] = call float @llvm.fabs.f32(float [[X:%.*]])
+; CHECK-NEXT:    [[CMP2:%.*]] = fcmp olt float [[ABSX]], [[Y:%.*]]
+; CHECK-NEXT:    ret i1 [[CMP2]]
+;
+  %cmp1 = fcmp ord float %x, 0.000000e+00
+  %absx = call float @llvm.fabs.f32(float %x)
+  %cmp2 = fcmp olt float %absx, %y
+  %and = and i1 %cmp1, %cmp2
+  ret i1 %and
+}
+
+define i1 @and_ord_olt_abs_commuted1(float %x, float %y) {
+; CHECK-LABEL: @and_ord_olt_abs_commuted1(
+; CHECK-NEXT:    [[ABSX:%.*]] = call float @llvm.fabs.f32(float [[X:%.*]])
+; CHECK-NEXT:    [[CMP2:%.*]] = fcmp olt float [[Y:%.*]], [[ABSX]]
+; CHECK-NEXT:    ret i1 [[CMP2]]
+;
+  %cmp1 = fcmp ord float %x, 0.000000e+00
+  %absx = call float @llvm.fabs.f32(float %x)
+  %cmp2 = fcmp olt float %y, %absx
+  %and = and i1 %cmp1, %cmp2
+  ret i1 %and
+}
+
+define i1 @and_ord_olt_abs_commuted2(float %x, float %y) {
+; CHECK-LABEL: @and_ord_olt_abs_commuted2(
+; CHECK-NEXT:    [[ABSX:%.*]] = call float @llvm.fabs.f32(float [[X:%.*]])
+; CHECK-NEXT:    [[CMP2:%.*]] = fcmp olt float [[ABSX]], [[Y:%.*]]
+; CHECK-NEXT:    ret i1 [[CMP2]]
+;
+  %cmp1 = fcmp ord float %x, 0.000000e+00
+  %absx = call float @llvm.fabs.f32(float %x)
+  %cmp2 = fcmp olt float %absx, %y
+  %and = and i1 %cmp2, %cmp1
+  ret i1 %and
+}
+
+define i1 @or_ord_ult_abs(float %x, float %y) {
+; CHECK-LABEL: @or_ord_ult_abs(
+; CHECK-NEXT:    ret i1 true
+;
+  %cmp1 = fcmp ord float %x, 0.000000e+00
+  %absx = call float @llvm.fabs.f32(float %x)
+  %cmp2 = fcmp ult float %absx, %y
+  %or = or i1 %cmp1, %cmp2
+  ret i1 %or
+}
+
+define i1 @and_ord_olt_absz(float %x, float %y, float %z) {
+; CHECK-LABEL: @and_ord_olt_absz(
+; CHECK-NEXT:    [[CMP1:%.*]] = fcmp ord float [[X:%.*]], 0.000000e+00
+; CHECK-NEXT:    [[ABSZ:%.*]] = call float @llvm.fabs.f32(float [[Z:%.*]])
+; CHECK-NEXT:    [[CMP2:%.*]] = fcmp olt float [[ABSZ]], [[Y:%.*]]
+; CHECK-NEXT:    [[AND:%.*]] = and i1 [[CMP1]], [[CMP2]]
+; CHECK-NEXT:    ret i1 [[AND]]
+;
+  %cmp1 = fcmp ord float %x, 0.000000e+00
+  %absz = call float @llvm.fabs.f32(float %z)
+  %cmp2 = fcmp olt float %absz, %y
+  %and = and i1 %cmp1, %cmp2
+  ret i1 %and
+}

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodeGen/AMDGPU/fp-classify.ll fails

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nikic nikic added the floating-point Floating-point math label Nov 18, 2024
@dtcxzyw dtcxzyw merged commit 42ed775 into llvm:main Nov 19, 2024
8 of 10 checks passed
@dtcxzyw dtcxzyw deleted the perf/simplify-andor-fcmp-fabs branch November 19, 2024 12:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants