[RISCV] Transform fcmp to is.fpclass #120242


Closed · wants to merge 1 commit

Conversation

@futog (Contributor) commented Dec 17, 2024

The `instcombine` pass transforms some `is.fpclass` intrinsics into `fcmp` calls. If a given floating-point extension is not available (F/D/Zfinx/Zdinx/Zfh/Zfhmin), these `fcmp` calls are lowered to libcalls.

In these cases, custom lowering of the `is.fpclass` intrinsics in the back-end generates more efficient code.

In the `riscv-codegenprepare` pass, these `fcmp` calls are converted back to `is.fpclass` intrinsics.
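A hand-written IR sketch of the round trip (not taken from the patch), using the fcNan mask value 3 that the new tests also use:

```llvm
; instcombine canonicalizes the class test into a compare:
;   %nan = call i1 @llvm.is.fpclass.f32(float %x, i32 3)   ; fcNan
; becomes:
define i1 @is_nan(float %x) {
  %nan = fcmp uno float %x, 0.000000e+00
  ret i1 %nan
}
; On a target without F/Zfinx this fcmp would lower to a libcall, so
; riscv-codegenprepare rewrites it back to the is.fpclass form above.
```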

@llvmbot (Member) commented Dec 17, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Gergely Futo (futog)

Changes

Patch is 51.09 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/120242.diff

4 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp (+39)
  • (added) llvm/test/CodeGen/RISCV/is-fpclass-f32.ll (+1097)
  • (added) llvm/test/CodeGen/RISCV/is-fpclass-f64.ll (+68)
  • (added) llvm/test/CodeGen/RISCV/riscv-codegenprepare-fpclass.ll (+307)
diff --git a/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp b/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp
index 5be5345cca73a9..9bee2ff2590774 100644
--- a/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp
+++ b/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp
@@ -20,11 +20,13 @@
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/InstrTypes.h"
 #include "llvm/IR/InstVisitor.h"
 #include "llvm/IR/Intrinsics.h"
 #include "llvm/IR/PatternMatch.h"
 #include "llvm/InitializePasses.h"
 #include "llvm/Pass.h"
+#include "llvm/Transforms/Utils/Local.h"
 
 using namespace llvm;
 
@@ -58,6 +60,7 @@ class RISCVCodeGenPrepare : public FunctionPass,
   bool visitAnd(BinaryOperator &BO);
   bool visitIntrinsicInst(IntrinsicInst &I);
   bool expandVPStrideLoad(IntrinsicInst &I);
+  bool visitFCmpInst(FCmpInst &I);
 };
 
 } // end anonymous namespace
@@ -196,6 +199,42 @@ bool RISCVCodeGenPrepare::expandVPStrideLoad(IntrinsicInst &II) {
   return true;
 }
 
+// The 'fcmp uno/ord/oeq/une/ueq/one/ogt/oge/olt/ole x, 0.0' instructions are
+// equivalent to an FP class test. If the fcmp instruction would be custom
+// lowered or lowered to a libcall, use the is.fpclass intrinsic instead, which
+// is lowered by the back-end without a libcall.
+//
+// This basically reverts the transformations of
+// InstCombinerImpl::foldIntrinsicIsFPClass.
+bool RISCVCodeGenPrepare::visitFCmpInst(FCmpInst &Fcmp) {
+  const auto *TLI = ST->getTargetLowering();
+  const EVT VT = TLI->getValueType(*DL, Fcmp.getOperand(0)->getType());
+  const int ISDOpcode = TLI->InstructionOpcodeToISD(Fcmp.getOpcode());
+
+  auto LegalizeTypeAction = TLI->getTypeAction(Fcmp.getContext(), VT);
+  auto OperationAction = TLI->getOperationAction(ISDOpcode, VT);
+  if ((LegalizeTypeAction != TargetLoweringBase::TypeSoftenFloat &&
+       LegalizeTypeAction != TargetLoweringBase::TypeSoftPromoteHalf) ||
+      OperationAction == TargetLowering::Custom)
+    return false;
+
+  auto [ClassVal, ClassTest] =
+      fcmpToClassTest(Fcmp.getPredicate(), *Fcmp.getParent()->getParent(),
+                      Fcmp.getOperand(0), Fcmp.getOperand(1));
+
+  // FIXME: For some conditions (e.g. ole, olt, oge, ogt) the output is quite
+  //        verbose compared to the libcall. Should we do the transformation
+  //        only if we are optimizing for speed?
+  if (!ClassVal)
+    return false;
+
+  IRBuilder<> Builder(&Fcmp);
+  Value *IsFPClass = Builder.createIsFPClass(ClassVal, ClassTest);
+  Fcmp.replaceAllUsesWith(IsFPClass);
+  RecursivelyDeleteTriviallyDeadInstructions(&Fcmp);
+  return true;
+}
+
 bool RISCVCodeGenPrepare::runOnFunction(Function &F) {
   if (skipFunction(F))
     return false;
diff --git a/llvm/test/CodeGen/RISCV/is-fpclass-f32.ll b/llvm/test/CodeGen/RISCV/is-fpclass-f32.ll
new file mode 100644
index 00000000000000..dd918ba7a1a8d7
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/is-fpclass-f32.ll
@@ -0,0 +1,1097 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=riscv32 -mattr=+f -verify-machineinstrs -target-abi=ilp32f < %s \
+; RUN:   | FileCheck -check-prefix=RV32IF %s
+; RUN: llc -mtriple=riscv32 -mattr=+zfinx -verify-machineinstrs -target-abi=ilp32 < %s \
+; RUN:   | FileCheck -check-prefix=RV32IZFINX %s
+; RUN: llc -mtriple=riscv32 -mattr=+d -verify-machineinstrs -target-abi=ilp32f < %s \
+; RUN:   | FileCheck -check-prefix=RV32IF %s
+; RUN: llc -mtriple=riscv64 -mattr=+f -verify-machineinstrs -target-abi=lp64f < %s \
+; RUN:   | FileCheck -check-prefix=RV64IF %s
+; RUN: llc -mtriple=riscv64 -mattr=+zfinx -verify-machineinstrs -target-abi=lp64 < %s \
+; RUN:   | FileCheck -check-prefix=RV64IZFINX %s
+; RUN: llc -mtriple=riscv64 -mattr=+d -verify-machineinstrs -target-abi=lp64d < %s \
+; RUN:   | FileCheck -check-prefix=RV64IF %s
+; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefix=RV32I %s
+; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefix=RV64I %s
+
+declare i1 @llvm.is.fpclass.f32(float, i32)
+
+define i1 @fpclass(float %x) {
+; RV32IF-LABEL: fpclass:
+; RV32IF:       # %bb.0:
+; RV32IF-NEXT:    fclass.s a0, fa0
+; RV32IF-NEXT:    andi a0, a0, 927
+; RV32IF-NEXT:    snez a0, a0
+; RV32IF-NEXT:    ret
+;
+; RV32IZFINX-LABEL: fpclass:
+; RV32IZFINX:       # %bb.0:
+; RV32IZFINX-NEXT:    fclass.s a0, a0
+; RV32IZFINX-NEXT:    andi a0, a0, 927
+; RV32IZFINX-NEXT:    snez a0, a0
+; RV32IZFINX-NEXT:    ret
+;
+; RV64IF-LABEL: fpclass:
+; RV64IF:       # %bb.0:
+; RV64IF-NEXT:    fclass.s a0, fa0
+; RV64IF-NEXT:    andi a0, a0, 927
+; RV64IF-NEXT:    snez a0, a0
+; RV64IF-NEXT:    ret
+;
+; RV64IZFINX-LABEL: fpclass:
+; RV64IZFINX:       # %bb.0:
+; RV64IZFINX-NEXT:    fclass.s a0, a0
+; RV64IZFINX-NEXT:    andi a0, a0, 927
+; RV64IZFINX-NEXT:    snez a0, a0
+; RV64IZFINX-NEXT:    ret
+;
+; RV32I-LABEL: fpclass:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    slli a1, a0, 1
+; RV32I-NEXT:    lui a2, 2048
+; RV32I-NEXT:    slti a0, a0, 0
+; RV32I-NEXT:    lui a3, 522240
+; RV32I-NEXT:    lui a4, 1046528
+; RV32I-NEXT:    srli a1, a1, 1
+; RV32I-NEXT:    addi a2, a2, -1
+; RV32I-NEXT:    addi a5, a1, -1
+; RV32I-NEXT:    sltu a2, a5, a2
+; RV32I-NEXT:    xor a5, a1, a3
+; RV32I-NEXT:    slt a3, a3, a1
+; RV32I-NEXT:    add a4, a1, a4
+; RV32I-NEXT:    seqz a1, a1
+; RV32I-NEXT:    seqz a5, a5
+; RV32I-NEXT:    srli a4, a4, 24
+; RV32I-NEXT:    and a2, a2, a0
+; RV32I-NEXT:    or a1, a1, a5
+; RV32I-NEXT:    sltiu a4, a4, 127
+; RV32I-NEXT:    or a1, a1, a2
+; RV32I-NEXT:    or a1, a1, a3
+; RV32I-NEXT:    and a0, a4, a0
+; RV32I-NEXT:    or a0, a1, a0
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: fpclass:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    sext.w a1, a0
+; RV64I-NEXT:    slli a0, a0, 33
+; RV64I-NEXT:    lui a2, 2048
+; RV64I-NEXT:    lui a3, 522240
+; RV64I-NEXT:    lui a4, 1046528
+; RV64I-NEXT:    srli a0, a0, 33
+; RV64I-NEXT:    addiw a2, a2, -1
+; RV64I-NEXT:    slti a1, a1, 0
+; RV64I-NEXT:    addi a5, a0, -1
+; RV64I-NEXT:    sltu a2, a5, a2
+; RV64I-NEXT:    xor a5, a0, a3
+; RV64I-NEXT:    slt a3, a3, a0
+; RV64I-NEXT:    add a4, a0, a4
+; RV64I-NEXT:    seqz a0, a0
+; RV64I-NEXT:    seqz a5, a5
+; RV64I-NEXT:    srliw a4, a4, 24
+; RV64I-NEXT:    and a2, a2, a1
+; RV64I-NEXT:    or a0, a0, a5
+; RV64I-NEXT:    sltiu a4, a4, 127
+; RV64I-NEXT:    or a0, a0, a2
+; RV64I-NEXT:    or a0, a0, a3
+; RV64I-NEXT:    and a1, a4, a1
+; RV64I-NEXT:    or a0, a0, a1
+; RV64I-NEXT:    ret
+  %cmp = call i1 @llvm.is.fpclass.f32(float %x, i32 639)
+  ret i1 %cmp
+}
+
+define i1 @is_nan_fpclass(float %x) {
+; RV32IF-LABEL: is_nan_fpclass:
+; RV32IF:       # %bb.0:
+; RV32IF-NEXT:    fclass.s a0, fa0
+; RV32IF-NEXT:    andi a0, a0, 768
+; RV32IF-NEXT:    snez a0, a0
+; RV32IF-NEXT:    ret
+;
+; RV32IZFINX-LABEL: is_nan_fpclass:
+; RV32IZFINX:       # %bb.0:
+; RV32IZFINX-NEXT:    fclass.s a0, a0
+; RV32IZFINX-NEXT:    andi a0, a0, 768
+; RV32IZFINX-NEXT:    snez a0, a0
+; RV32IZFINX-NEXT:    ret
+;
+; RV64IF-LABEL: is_nan_fpclass:
+; RV64IF:       # %bb.0:
+; RV64IF-NEXT:    fclass.s a0, fa0
+; RV64IF-NEXT:    andi a0, a0, 768
+; RV64IF-NEXT:    snez a0, a0
+; RV64IF-NEXT:    ret
+;
+; RV64IZFINX-LABEL: is_nan_fpclass:
+; RV64IZFINX:       # %bb.0:
+; RV64IZFINX-NEXT:    fclass.s a0, a0
+; RV64IZFINX-NEXT:    andi a0, a0, 768
+; RV64IZFINX-NEXT:    snez a0, a0
+; RV64IZFINX-NEXT:    ret
+;
+; RV32I-LABEL: is_nan_fpclass:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    slli a0, a0, 1
+; RV32I-NEXT:    srli a0, a0, 1
+; RV32I-NEXT:    lui a1, 522240
+; RV32I-NEXT:    slt a0, a1, a0
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: is_nan_fpclass:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    slli a0, a0, 33
+; RV64I-NEXT:    srli a0, a0, 33
+; RV64I-NEXT:    lui a1, 522240
+; RV64I-NEXT:    slt a0, a1, a0
+; RV64I-NEXT:    ret
+  %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 3)  ; nan
+  ret i1 %1
+}
+
+define i1 @is_qnan_fpclass(float %x) {
+; RV32IF-LABEL: is_qnan_fpclass:
+; RV32IF:       # %bb.0:
+; RV32IF-NEXT:    fclass.s a0, fa0
+; RV32IF-NEXT:    srli a0, a0, 9
+; RV32IF-NEXT:    ret
+;
+; RV32IZFINX-LABEL: is_qnan_fpclass:
+; RV32IZFINX:       # %bb.0:
+; RV32IZFINX-NEXT:    fclass.s a0, a0
+; RV32IZFINX-NEXT:    srli a0, a0, 9
+; RV32IZFINX-NEXT:    ret
+;
+; RV64IF-LABEL: is_qnan_fpclass:
+; RV64IF:       # %bb.0:
+; RV64IF-NEXT:    fclass.s a0, fa0
+; RV64IF-NEXT:    srli a0, a0, 9
+; RV64IF-NEXT:    ret
+;
+; RV64IZFINX-LABEL: is_qnan_fpclass:
+; RV64IZFINX:       # %bb.0:
+; RV64IZFINX-NEXT:    fclass.s a0, a0
+; RV64IZFINX-NEXT:    srli a0, a0, 9
+; RV64IZFINX-NEXT:    ret
+;
+; RV32I-LABEL: is_qnan_fpclass:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    slli a0, a0, 1
+; RV32I-NEXT:    lui a1, 523264
+; RV32I-NEXT:    srli a0, a0, 1
+; RV32I-NEXT:    addi a1, a1, -1
+; RV32I-NEXT:    slt a0, a1, a0
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: is_qnan_fpclass:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    slli a0, a0, 33
+; RV64I-NEXT:    lui a1, 523264
+; RV64I-NEXT:    srli a0, a0, 33
+; RV64I-NEXT:    addiw a1, a1, -1
+; RV64I-NEXT:    slt a0, a1, a0
+; RV64I-NEXT:    ret
+  %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 2)  ; qnan
+  ret i1 %1
+}
+
+define i1 @is_snan_fpclass(float %x) {
+; RV32IF-LABEL: is_snan_fpclass:
+; RV32IF:       # %bb.0:
+; RV32IF-NEXT:    fclass.s a0, fa0
+; RV32IF-NEXT:    slli a0, a0, 23
+; RV32IF-NEXT:    srli a0, a0, 31
+; RV32IF-NEXT:    ret
+;
+; RV32IZFINX-LABEL: is_snan_fpclass:
+; RV32IZFINX:       # %bb.0:
+; RV32IZFINX-NEXT:    fclass.s a0, a0
+; RV32IZFINX-NEXT:    slli a0, a0, 23
+; RV32IZFINX-NEXT:    srli a0, a0, 31
+; RV32IZFINX-NEXT:    ret
+;
+; RV64IF-LABEL: is_snan_fpclass:
+; RV64IF:       # %bb.0:
+; RV64IF-NEXT:    fclass.s a0, fa0
+; RV64IF-NEXT:    slli a0, a0, 55
+; RV64IF-NEXT:    srli a0, a0, 63
+; RV64IF-NEXT:    ret
+;
+; RV64IZFINX-LABEL: is_snan_fpclass:
+; RV64IZFINX:       # %bb.0:
+; RV64IZFINX-NEXT:    fclass.s a0, a0
+; RV64IZFINX-NEXT:    slli a0, a0, 55
+; RV64IZFINX-NEXT:    srli a0, a0, 63
+; RV64IZFINX-NEXT:    ret
+;
+; RV32I-LABEL: is_snan_fpclass:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    slli a0, a0, 1
+; RV32I-NEXT:    lui a1, 523264
+; RV32I-NEXT:    lui a2, 522240
+; RV32I-NEXT:    srli a0, a0, 1
+; RV32I-NEXT:    slt a1, a0, a1
+; RV32I-NEXT:    slt a0, a2, a0
+; RV32I-NEXT:    and a0, a0, a1
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: is_snan_fpclass:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    slli a0, a0, 33
+; RV64I-NEXT:    lui a1, 523264
+; RV64I-NEXT:    lui a2, 522240
+; RV64I-NEXT:    srli a0, a0, 33
+; RV64I-NEXT:    slt a1, a0, a1
+; RV64I-NEXT:    slt a0, a2, a0
+; RV64I-NEXT:    and a0, a0, a1
+; RV64I-NEXT:    ret
+  %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 1)  ; snan
+  ret i1 %1
+}
+
+define i1 @is_inf_fpclass(float %x) {
+; RV32IF-LABEL: is_inf_fpclass:
+; RV32IF:       # %bb.0:
+; RV32IF-NEXT:    fclass.s a0, fa0
+; RV32IF-NEXT:    andi a0, a0, 129
+; RV32IF-NEXT:    snez a0, a0
+; RV32IF-NEXT:    ret
+;
+; RV32IZFINX-LABEL: is_inf_fpclass:
+; RV32IZFINX:       # %bb.0:
+; RV32IZFINX-NEXT:    fclass.s a0, a0
+; RV32IZFINX-NEXT:    andi a0, a0, 129
+; RV32IZFINX-NEXT:    snez a0, a0
+; RV32IZFINX-NEXT:    ret
+;
+; RV64IF-LABEL: is_inf_fpclass:
+; RV64IF:       # %bb.0:
+; RV64IF-NEXT:    fclass.s a0, fa0
+; RV64IF-NEXT:    andi a0, a0, 129
+; RV64IF-NEXT:    snez a0, a0
+; RV64IF-NEXT:    ret
+;
+; RV64IZFINX-LABEL: is_inf_fpclass:
+; RV64IZFINX:       # %bb.0:
+; RV64IZFINX-NEXT:    fclass.s a0, a0
+; RV64IZFINX-NEXT:    andi a0, a0, 129
+; RV64IZFINX-NEXT:    snez a0, a0
+; RV64IZFINX-NEXT:    ret
+;
+; RV32I-LABEL: is_inf_fpclass:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    slli a0, a0, 1
+; RV32I-NEXT:    srli a0, a0, 1
+; RV32I-NEXT:    lui a1, 522240
+; RV32I-NEXT:    xor a0, a0, a1
+; RV32I-NEXT:    seqz a0, a0
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: is_inf_fpclass:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    slli a0, a0, 33
+; RV64I-NEXT:    srli a0, a0, 33
+; RV64I-NEXT:    lui a1, 522240
+; RV64I-NEXT:    xor a0, a0, a1
+; RV64I-NEXT:    seqz a0, a0
+; RV64I-NEXT:    ret
+  %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 516)  ; 0x204 = "inf"
+  ret i1 %1
+}
+
+define i1 @is_posinf_fpclass(float %x) {
+; RV32IF-LABEL: is_posinf_fpclass:
+; RV32IF:       # %bb.0:
+; RV32IF-NEXT:    fclass.s a0, fa0
+; RV32IF-NEXT:    slli a0, a0, 24
+; RV32IF-NEXT:    srli a0, a0, 31
+; RV32IF-NEXT:    ret
+;
+; RV32IZFINX-LABEL: is_posinf_fpclass:
+; RV32IZFINX:       # %bb.0:
+; RV32IZFINX-NEXT:    fclass.s a0, a0
+; RV32IZFINX-NEXT:    slli a0, a0, 24
+; RV32IZFINX-NEXT:    srli a0, a0, 31
+; RV32IZFINX-NEXT:    ret
+;
+; RV64IF-LABEL: is_posinf_fpclass:
+; RV64IF:       # %bb.0:
+; RV64IF-NEXT:    fclass.s a0, fa0
+; RV64IF-NEXT:    slli a0, a0, 56
+; RV64IF-NEXT:    srli a0, a0, 63
+; RV64IF-NEXT:    ret
+;
+; RV64IZFINX-LABEL: is_posinf_fpclass:
+; RV64IZFINX:       # %bb.0:
+; RV64IZFINX-NEXT:    fclass.s a0, a0
+; RV64IZFINX-NEXT:    slli a0, a0, 56
+; RV64IZFINX-NEXT:    srli a0, a0, 63
+; RV64IZFINX-NEXT:    ret
+;
+; RV32I-LABEL: is_posinf_fpclass:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    lui a1, 522240
+; RV32I-NEXT:    xor a0, a0, a1
+; RV32I-NEXT:    seqz a0, a0
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: is_posinf_fpclass:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    sext.w a0, a0
+; RV64I-NEXT:    lui a1, 522240
+; RV64I-NEXT:    xor a0, a0, a1
+; RV64I-NEXT:    seqz a0, a0
+; RV64I-NEXT:    ret
+  %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 512)  ; 0x200 = "+inf"
+  ret i1 %1
+}
+
+define i1 @is_neginf_fpclass(float %x) {
+; RV32IF-LABEL: is_neginf_fpclass:
+; RV32IF:       # %bb.0:
+; RV32IF-NEXT:    fclass.s a0, fa0
+; RV32IF-NEXT:    andi a0, a0, 1
+; RV32IF-NEXT:    ret
+;
+; RV32IZFINX-LABEL: is_neginf_fpclass:
+; RV32IZFINX:       # %bb.0:
+; RV32IZFINX-NEXT:    fclass.s a0, a0
+; RV32IZFINX-NEXT:    andi a0, a0, 1
+; RV32IZFINX-NEXT:    ret
+;
+; RV64IF-LABEL: is_neginf_fpclass:
+; RV64IF:       # %bb.0:
+; RV64IF-NEXT:    fclass.s a0, fa0
+; RV64IF-NEXT:    andi a0, a0, 1
+; RV64IF-NEXT:    ret
+;
+; RV64IZFINX-LABEL: is_neginf_fpclass:
+; RV64IZFINX:       # %bb.0:
+; RV64IZFINX-NEXT:    fclass.s a0, a0
+; RV64IZFINX-NEXT:    andi a0, a0, 1
+; RV64IZFINX-NEXT:    ret
+;
+; RV32I-LABEL: is_neginf_fpclass:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    lui a1, 1046528
+; RV32I-NEXT:    xor a0, a0, a1
+; RV32I-NEXT:    seqz a0, a0
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: is_neginf_fpclass:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    sext.w a0, a0
+; RV64I-NEXT:    lui a1, 1046528
+; RV64I-NEXT:    xor a0, a0, a1
+; RV64I-NEXT:    seqz a0, a0
+; RV64I-NEXT:    ret
+  %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 4)  ; "-inf"
+  ret i1 %1
+}
+
+define i1 @is_finite_fpclass(float %x) {
+; RV32IF-LABEL: is_finite_fpclass:
+; RV32IF:       # %bb.0:
+; RV32IF-NEXT:    fclass.s a0, fa0
+; RV32IF-NEXT:    andi a0, a0, 126
+; RV32IF-NEXT:    snez a0, a0
+; RV32IF-NEXT:    ret
+;
+; RV32IZFINX-LABEL: is_finite_fpclass:
+; RV32IZFINX:       # %bb.0:
+; RV32IZFINX-NEXT:    fclass.s a0, a0
+; RV32IZFINX-NEXT:    andi a0, a0, 126
+; RV32IZFINX-NEXT:    snez a0, a0
+; RV32IZFINX-NEXT:    ret
+;
+; RV64IF-LABEL: is_finite_fpclass:
+; RV64IF:       # %bb.0:
+; RV64IF-NEXT:    fclass.s a0, fa0
+; RV64IF-NEXT:    andi a0, a0, 126
+; RV64IF-NEXT:    snez a0, a0
+; RV64IF-NEXT:    ret
+;
+; RV64IZFINX-LABEL: is_finite_fpclass:
+; RV64IZFINX:       # %bb.0:
+; RV64IZFINX-NEXT:    fclass.s a0, a0
+; RV64IZFINX-NEXT:    andi a0, a0, 126
+; RV64IZFINX-NEXT:    snez a0, a0
+; RV64IZFINX-NEXT:    ret
+;
+; RV32I-LABEL: is_finite_fpclass:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    slli a0, a0, 1
+; RV32I-NEXT:    srli a0, a0, 1
+; RV32I-NEXT:    lui a1, 522240
+; RV32I-NEXT:    slt a0, a0, a1
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: is_finite_fpclass:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    slli a0, a0, 33
+; RV64I-NEXT:    srli a0, a0, 33
+; RV64I-NEXT:    lui a1, 522240
+; RV64I-NEXT:    slt a0, a0, a1
+; RV64I-NEXT:    ret
+  %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 504)  ; 0x1f8 = "finite"
+  ret i1 %1
+}
+
+define i1 @is_posfinite_fpclass(float %x) {
+; RV32IF-LABEL: is_posfinite_fpclass:
+; RV32IF:       # %bb.0:
+; RV32IF-NEXT:    fclass.s a0, fa0
+; RV32IF-NEXT:    andi a0, a0, 112
+; RV32IF-NEXT:    snez a0, a0
+; RV32IF-NEXT:    ret
+;
+; RV32IZFINX-LABEL: is_posfinite_fpclass:
+; RV32IZFINX:       # %bb.0:
+; RV32IZFINX-NEXT:    fclass.s a0, a0
+; RV32IZFINX-NEXT:    andi a0, a0, 112
+; RV32IZFINX-NEXT:    snez a0, a0
+; RV32IZFINX-NEXT:    ret
+;
+; RV64IF-LABEL: is_posfinite_fpclass:
+; RV64IF:       # %bb.0:
+; RV64IF-NEXT:    fclass.s a0, fa0
+; RV64IF-NEXT:    andi a0, a0, 112
+; RV64IF-NEXT:    snez a0, a0
+; RV64IF-NEXT:    ret
+;
+; RV64IZFINX-LABEL: is_posfinite_fpclass:
+; RV64IZFINX:       # %bb.0:
+; RV64IZFINX-NEXT:    fclass.s a0, a0
+; RV64IZFINX-NEXT:    andi a0, a0, 112
+; RV64IZFINX-NEXT:    snez a0, a0
+; RV64IZFINX-NEXT:    ret
+;
+; RV32I-LABEL: is_posfinite_fpclass:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    srli a0, a0, 23
+; RV32I-NEXT:    sltiu a0, a0, 255
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: is_posfinite_fpclass:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    srliw a0, a0, 23
+; RV64I-NEXT:    sltiu a0, a0, 255
+; RV64I-NEXT:    ret
+  %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 448)  ; 0x1c0 = "+finite"
+  ret i1 %1
+}
+
+define i1 @is_negfinite_fpclass(float %x) {
+; RV32IF-LABEL: is_negfinite_fpclass:
+; RV32IF:       # %bb.0:
+; RV32IF-NEXT:    fclass.s a0, fa0
+; RV32IF-NEXT:    andi a0, a0, 14
+; RV32IF-NEXT:    snez a0, a0
+; RV32IF-NEXT:    ret
+;
+; RV32IZFINX-LABEL: is_negfinite_fpclass:
+; RV32IZFINX:       # %bb.0:
+; RV32IZFINX-NEXT:    fclass.s a0, a0
+; RV32IZFINX-NEXT:    andi a0, a0, 14
+; RV32IZFINX-NEXT:    snez a0, a0
+; RV32IZFINX-NEXT:    ret
+;
+; RV64IF-LABEL: is_negfinite_fpclass:
+; RV64IF:       # %bb.0:
+; RV64IF-NEXT:    fclass.s a0, fa0
+; RV64IF-NEXT:    andi a0, a0, 14
+; RV64IF-NEXT:    snez a0, a0
+; RV64IF-NEXT:    ret
+;
+; RV64IZFINX-LABEL: is_negfinite_fpclass:
+; RV64IZFINX:       # %bb.0:
+; RV64IZFINX-NEXT:    fclass.s a0, a0
+; RV64IZFINX-NEXT:    andi a0, a0, 14
+; RV64IZFINX-NEXT:    snez a0, a0
+; RV64IZFINX-NEXT:    ret
+;
+; RV32I-LABEL: is_negfinite_fpclass:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    slli a1, a0, 1
+; RV32I-NEXT:    lui a2, 522240
+; RV32I-NEXT:    srli a1, a1, 1
+; RV32I-NEXT:    slt a1, a1, a2
+; RV32I-NEXT:    slti a0, a0, 0
+; RV32I-NEXT:    and a0, a1, a0
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: is_negfinite_fpclass:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    sext.w a1, a0
+; RV64I-NEXT:    slli a0, a0, 33
+; RV64I-NEXT:    lui a2, 522240
+; RV64I-NEXT:    srli a0, a0, 33
+; RV64I-NEXT:    slt a0, a0, a2
+; RV64I-NEXT:    slti a1, a1, 0
+; RV64I-NEXT:    and a0, a0, a1
+; RV64I-NEXT:    ret
+  %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 56)  ; 0x38 = "-finite"
+  ret i1 %1
+}
+
+define i1 @is_notfinite_fpclass(float %x) {
+; RV32IF-LABEL: is_notfinite_fpclass:
+; RV32IF:       # %bb.0:
+; RV32IF-NEXT:    fclass.s a0, fa0
+; RV32IF-NEXT:    andi a0, a0, 897
+; RV32IF-NEXT:    snez a0, a0
+; RV32IF-NEXT:    ret
+;
+; RV32IZFINX-LABEL: is_notfinite_fpclass:
+; RV32IZFINX:       # %bb.0:
+; RV32IZFINX-NEXT:    fclass.s a0, a0
+; RV32IZFINX-NEXT:    andi a0, a0, 897
+; RV32IZFINX-NEXT:    snez a0, a0
+; RV32IZFINX-NEXT:    ret
+;
+; RV64IF-LABEL: is_notfinite_fpclass:
+; RV64IF:       # %bb.0:
+; RV64IF-NEXT:    fclass.s a0, fa0
+; RV64IF-NEXT:    andi a0, a0, 897
+; RV64IF-NEXT:    snez a0, a0
+; RV64IF-NEXT:    ret
+;
+; RV64IZFINX-LABEL: is_notfinite_fpclass:
+; RV64IZFINX:       # %bb.0:
+; RV64IZFINX-NEXT:    fclass.s a0, a0
+; RV64IZFINX-NEXT:    andi a0, a0, 897
+; RV64IZFINX-NEXT:    snez a0, a0
+; RV64IZFINX-NEXT:    ret
+;
+; RV32I-LABEL: is_notfinite_fpclass:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    slli a0, a0, 1
+; RV32I-NEXT:    lui a1, 522240
+; RV32I-NEXT:    srli a0, a0, 1
+; RV32I-NEXT:    addi a1, a1, -1
+; RV32I-NEXT:    slt a0, a1, a0
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: is_notfinite_fpclass:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    slli a0, a0, 33
+; RV64I-NEXT:    lui a1, 522240
+; RV64I-NEXT:    srli a0, a0, 33
+; RV64I-NEXT:    addiw a1, a1, -1
+; RV64I-NEXT:    slt a0, a1, a0
+; RV64I-NEXT:    ret
+  %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 519)  ; 0x207 = "inf|nan"
+  ret i1 %1
+}
+
+define ...
[truncated]

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 17b3dd03a05dfa938aacd57027189271a62e2fda 0dff7fbbad04e4d711c9beb84e227c81137f2c5d --extensions cpp -- llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp
View the diff from clang-format below.
diff --git a/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp b/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp
index 9bee2ff259..0416f315fe 100644
--- a/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp
+++ b/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp
@@ -20,8 +20,8 @@
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/IRBuilder.h"
-#include "llvm/IR/InstrTypes.h"
 #include "llvm/IR/InstVisitor.h"
+#include "llvm/IR/InstrTypes.h"
 #include "llvm/IR/Intrinsics.h"
 #include "llvm/IR/PatternMatch.h"
 #include "llvm/InitializePasses.h"

@dtcxzyw requested review from arsenm and topperc, December 17, 2024 15:09
Comment on lines +202 to +203
// The 'fcmp uno/ord/oeq/une/ueq/one/ogt/oge/olt/ole x, 0.0' instructions are
// equivalent to an FP class test. If the fcmp instruction would be custom
Contributor
This isn't true depending on denormal handling and fp exceptions
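A small Python model of this caveat (a sketch under assumptions: the mask-bit order follows the LLVM LangRef, and denormals-are-zero is modeled by flushing subnormal inputs before the compare):

```python
import math

# llvm.is.fpclass test-mask bits, in LLVM LangRef order (an assumption of
# this sketch; see the LangRef for the authoritative table).
FC_SNAN, FC_QNAN = 1 << 0, 1 << 1
FC_NEG_INF, FC_NEG_NORMAL = 1 << 2, 1 << 3
FC_NEG_SUBNORMAL, FC_NEG_ZERO = 1 << 4, 1 << 5
FC_POS_ZERO, FC_POS_SUBNORMAL = 1 << 6, 1 << 7
FC_POS_NORMAL, FC_POS_INF = 1 << 8, 1 << 9
FC_ZERO = FC_NEG_ZERO | FC_POS_ZERO  # 0x60
FC_NAN = FC_SNAN | FC_QNAN           # 0x3, the "i32 3" mask in the tests

MIN_NORMAL = 2.0 ** -1022  # smallest positive normal double

def fpclass(x: float) -> int:
    """Classify x into a single class bit under default IEEE semantics."""
    if math.isnan(x):
        return FC_QNAN  # Python cannot distinguish quiet from signaling NaN
    if math.isinf(x):
        return FC_POS_INF if x > 0 else FC_NEG_INF
    if x == 0.0:
        return FC_NEG_ZERO if math.copysign(1.0, x) < 0 else FC_POS_ZERO
    if abs(x) < MIN_NORMAL:
        return FC_POS_SUBNORMAL if x > 0 else FC_NEG_SUBNORMAL
    return FC_POS_NORMAL if x > 0 else FC_NEG_NORMAL

def fcmp_oeq_zero_daz(x: float) -> bool:
    """Model 'fcmp oeq x, 0.0' on hardware that flushes subnormal inputs."""
    if not math.isnan(x) and abs(x) < MIN_NORMAL:
        x = math.copysign(0.0, x)  # denormals-are-zero on input
    return x == 0.0

subnormal = 2.0 ** -1060  # a positive subnormal double

# With default IEEE semantics the compare and the class test agree:
assert fcmp_oeq_zero_daz(0.0) and fpclass(0.0) & FC_ZERO
# With DAZ, fcmp reports "equal to zero" for a subnormal input,
# while the class test still classifies it as subnormal, not zero:
assert fcmp_oeq_zero_daz(subnormal)
assert fpclass(subnormal) & FC_ZERO == 0
```

With default IEEE semantics the two forms agree; once subnormal inputs are flushed, `fcmp oeq x, 0.0` and an `fcZero` class test diverge, which is the reviewer's point.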

//
// This basically reverts the transformations of
// InstCombinerImpl::foldIntrinsicIsFPClass.
bool RISCVCodeGenPrepare::visitFCmpInst(FCmpInst &Fcmp) {
Contributor
CodeGenPrepare already has this transform

Comment on lines +214 to +219
auto LegalizeTypeAction = TLI->getTypeAction(Fcmp.getContext(), VT);
auto OperationAction = TLI->getOperationAction(ISDOpcode, VT);
if ((LegalizeTypeAction != TargetLoweringBase::TypeSoftenFloat &&
LegalizeTypeAction != TargetLoweringBase::TypeSoftPromoteHalf) ||
OperationAction == TargetLowering::Custom)
return false;
Contributor
This level of logic really belongs directly in the legalizer

Contributor Author
Ok, I understand. Moving it to the legalizer. @topperc is it ok to implement it there?

Contributor Author
Actually, the logic in CodeGenPrepare uses TargetLoweringBase::isFAbsFree. When I started to implement this, I was wondering if there should be a similar function for FCmp, and whether the whole thing should go into CodeGenPrepare instead. Is this a valid approach, or is the legalizer the right place to do it?

Contributor
I don't really like having it in codegenprepare in the first place. It really belongs in some combination of DAGCombiner or legalizer, depending on the purpose. The only nice thing is codegenprepare has access to better utilities, like an existing fcmpToClassTest helper and computeKnownFPClass. In principle those should be reimplemented in codegen

Contributor Author
Ok. Regarding InstCombinerImpl::foldIntrinsicIsFPClass: for a back-end where lowering the fcmp is not cheap, why is it profitable to do this transformation? It is done unconditionally as far as I can see.

Contributor
fcmp is a better canonical form. More code will always understand fcmp than is.fpclass.

fcmp is not cheap

This is certainly not universally true, and I would say is not the common case. If the target wants something else, that's for the backend to undo for its preferred form.

@dtcxzyw (Member) commented Dec 17, 2024

FYI I did similar transformation in CodeGenPrepare: #81572.

@futog (Contributor Author) commented Dec 17, 2024

FYI I did similar transformation in CodeGenPrepare: #81572.

In CodeGenPrepare it is only for fcInf or fcInf | fcNan, right? Is it on purpose that the rest of the classifications are not considered?

@dtcxzyw (Member) commented Dec 17, 2024

FYI I did similar transformation in CodeGenPrepare: #81572.

In CodeGenPrepare it is only for fcInf or fcInf | fcNan, right? Is it on purpose that the rest of the classifications are not considered?

I mean you can handle your motivating case in CGP.

@futog closed this Jan 17, 2025