[RISCV] Transform fcmp to is.fpclass #120242
Conversation
The `instcombine` pass transforms some `is.fpclass` intrinsics into `fcmp` calls. If a given floating point extension is not available (F/D/Zfinx/Zdinx/Zfh/Zfhmin), these `fcmp` calls are lowered to libcalls. In these cases, custom lowering of the `is.fpclass` intrinsics in the back-end generates more efficient code, so in the `riscv-codegenprepare` pass these `fcmp` calls are converted back to `is.fpclass` intrinsics.
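As a sketch of the intent (illustrative IR, not taken from the patch): on a target without hardware FP, the first function below would lower its compare to a soft-float comparison libcall, while the second form lowers to a few integer instructions. The function names and the exact before/after pairing here are hypothetical.

```llvm
; Before riscv-codegenprepare: without F/Zfinx, this unordered compare
; is lowered through a soft-float comparison libcall.
define i1 @is_nan(float %x) {
  %r = fcmp uno float %x, 0.000000e+00
  ret i1 %r
}

; Equivalent form after the pass: the intrinsic is lowered with integer
; bit tests, no libcall. Mask 3 = snan | qnan, i.e. "is NaN".
define i1 @is_nan_class(float %x) {
  %r = call i1 @llvm.is.fpclass.f32(float %x, i32 3)
  ret i1 %r
}
```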
@llvm/pr-subscribers-backend-risc-v
Author: Gergely Futo (futog)
Patch is 51.09 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/120242.diff
4 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp b/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp
index 5be5345cca73a9..9bee2ff2590774 100644
--- a/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp
+++ b/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp
@@ -20,11 +20,13 @@
#include "llvm/CodeGen/TargetPassConfig.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/InstVisitor.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/PatternMatch.h"
#include "llvm/InitializePasses.h"
#include "llvm/Pass.h"
+#include "llvm/Transforms/Utils/Local.h"
using namespace llvm;
@@ -58,6 +60,7 @@ class RISCVCodeGenPrepare : public FunctionPass,
bool visitAnd(BinaryOperator &BO);
bool visitIntrinsicInst(IntrinsicInst &I);
bool expandVPStrideLoad(IntrinsicInst &I);
+ bool visitFCmpInst(FCmpInst &I);
};
} // end anonymous namespace
@@ -196,6 +199,42 @@ bool RISCVCodeGenPrepare::expandVPStrideLoad(IntrinsicInst &II) {
return true;
}
+// The 'fcmp uno/ord/oeq/une/ueq/one/ogt/oge/olt/ole x, 0.0' instructions are
+// equivalent to an FP class test. If the fcmp instruction would be custom
+// lowered or lowered to a libcall, use the is.fpclass intrinsic instead, which
+// is lowered by the back-end without a libcall.
+//
+// This basically reverts the transformations of
+// InstCombinerImpl::foldIntrinsicIsFPClass.
+bool RISCVCodeGenPrepare::visitFCmpInst(FCmpInst &Fcmp) {
+ const auto *TLI = ST->getTargetLowering();
+ const EVT VT = TLI->getValueType(*DL, Fcmp.getOperand(0)->getType());
+ const int ISDOpcode = TLI->InstructionOpcodeToISD(Fcmp.getOpcode());
+
+ auto LegalizeTypeAction = TLI->getTypeAction(Fcmp.getContext(), VT);
+ auto OperationAction = TLI->getOperationAction(ISDOpcode, VT);
+ if ((LegalizeTypeAction != TargetLoweringBase::TypeSoftenFloat &&
+ LegalizeTypeAction != TargetLoweringBase::TypeSoftPromoteHalf) ||
+ OperationAction == TargetLowering::Custom)
+ return false;
+
+ auto [ClassVal, ClassTest] =
+ fcmpToClassTest(Fcmp.getPredicate(), *Fcmp.getParent()->getParent(),
+ Fcmp.getOperand(0), Fcmp.getOperand(1));
+
+ // FIXME: For some conditions (e.g. ole, olt, oge, ogt) the output is quite
+ // verbose compared to the libcall. Should we do the transformation
+ // only if we are optimizing for speed?
+ if (!ClassVal)
+ return false;
+
+ IRBuilder<> Builder(&Fcmp);
+ Value *IsFPClass = Builder.createIsFPClass(ClassVal, ClassTest);
+ Fcmp.replaceAllUsesWith(IsFPClass);
+ RecursivelyDeleteTriviallyDeadInstructions(&Fcmp);
+ return true;
+}
+
bool RISCVCodeGenPrepare::runOnFunction(Function &F) {
if (skipFunction(F))
return false;
diff --git a/llvm/test/CodeGen/RISCV/is-fpclass-f32.ll b/llvm/test/CodeGen/RISCV/is-fpclass-f32.ll
new file mode 100644
index 00000000000000..dd918ba7a1a8d7
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/is-fpclass-f32.ll
@@ -0,0 +1,1097 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=riscv32 -mattr=+f -verify-machineinstrs -target-abi=ilp32f < %s \
+; RUN: | FileCheck -check-prefix=RV32IF %s
+; RUN: llc -mtriple=riscv32 -mattr=+zfinx -verify-machineinstrs -target-abi=ilp32 < %s \
+; RUN: | FileCheck -check-prefix=RV32IZFINX %s
+; RUN: llc -mtriple=riscv32 -mattr=+d -verify-machineinstrs -target-abi=ilp32f < %s \
+; RUN: | FileCheck -check-prefix=RV32IF %s
+; RUN: llc -mtriple=riscv64 -mattr=+f -verify-machineinstrs -target-abi=lp64f < %s \
+; RUN: | FileCheck -check-prefix=RV64IF %s
+; RUN: llc -mtriple=riscv64 -mattr=+zfinx -verify-machineinstrs -target-abi=lp64 < %s \
+; RUN: | FileCheck -check-prefix=RV64IZFINX %s
+; RUN: llc -mtriple=riscv64 -mattr=+d -verify-machineinstrs -target-abi=lp64d < %s \
+; RUN: | FileCheck -check-prefix=RV64IF %s
+; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \
+; RUN: | FileCheck -check-prefix=RV32I %s
+; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s \
+; RUN: | FileCheck -check-prefix=RV64I %s
+
+declare i1 @llvm.is.fpclass.f32(float, i32)
+
+define i1 @fpclass(float %x) {
+; RV32IF-LABEL: fpclass:
+; RV32IF: # %bb.0:
+; RV32IF-NEXT: fclass.s a0, fa0
+; RV32IF-NEXT: andi a0, a0, 927
+; RV32IF-NEXT: snez a0, a0
+; RV32IF-NEXT: ret
+;
+; RV32IZFINX-LABEL: fpclass:
+; RV32IZFINX: # %bb.0:
+; RV32IZFINX-NEXT: fclass.s a0, a0
+; RV32IZFINX-NEXT: andi a0, a0, 927
+; RV32IZFINX-NEXT: snez a0, a0
+; RV32IZFINX-NEXT: ret
+;
+; RV64IF-LABEL: fpclass:
+; RV64IF: # %bb.0:
+; RV64IF-NEXT: fclass.s a0, fa0
+; RV64IF-NEXT: andi a0, a0, 927
+; RV64IF-NEXT: snez a0, a0
+; RV64IF-NEXT: ret
+;
+; RV64IZFINX-LABEL: fpclass:
+; RV64IZFINX: # %bb.0:
+; RV64IZFINX-NEXT: fclass.s a0, a0
+; RV64IZFINX-NEXT: andi a0, a0, 927
+; RV64IZFINX-NEXT: snez a0, a0
+; RV64IZFINX-NEXT: ret
+;
+; RV32I-LABEL: fpclass:
+; RV32I: # %bb.0:
+; RV32I-NEXT: slli a1, a0, 1
+; RV32I-NEXT: lui a2, 2048
+; RV32I-NEXT: slti a0, a0, 0
+; RV32I-NEXT: lui a3, 522240
+; RV32I-NEXT: lui a4, 1046528
+; RV32I-NEXT: srli a1, a1, 1
+; RV32I-NEXT: addi a2, a2, -1
+; RV32I-NEXT: addi a5, a1, -1
+; RV32I-NEXT: sltu a2, a5, a2
+; RV32I-NEXT: xor a5, a1, a3
+; RV32I-NEXT: slt a3, a3, a1
+; RV32I-NEXT: add a4, a1, a4
+; RV32I-NEXT: seqz a1, a1
+; RV32I-NEXT: seqz a5, a5
+; RV32I-NEXT: srli a4, a4, 24
+; RV32I-NEXT: and a2, a2, a0
+; RV32I-NEXT: or a1, a1, a5
+; RV32I-NEXT: sltiu a4, a4, 127
+; RV32I-NEXT: or a1, a1, a2
+; RV32I-NEXT: or a1, a1, a3
+; RV32I-NEXT: and a0, a4, a0
+; RV32I-NEXT: or a0, a1, a0
+; RV32I-NEXT: ret
+;
+; RV64I-LABEL: fpclass:
+; RV64I: # %bb.0:
+; RV64I-NEXT: sext.w a1, a0
+; RV64I-NEXT: slli a0, a0, 33
+; RV64I-NEXT: lui a2, 2048
+; RV64I-NEXT: lui a3, 522240
+; RV64I-NEXT: lui a4, 1046528
+; RV64I-NEXT: srli a0, a0, 33
+; RV64I-NEXT: addiw a2, a2, -1
+; RV64I-NEXT: slti a1, a1, 0
+; RV64I-NEXT: addi a5, a0, -1
+; RV64I-NEXT: sltu a2, a5, a2
+; RV64I-NEXT: xor a5, a0, a3
+; RV64I-NEXT: slt a3, a3, a0
+; RV64I-NEXT: add a4, a0, a4
+; RV64I-NEXT: seqz a0, a0
+; RV64I-NEXT: seqz a5, a5
+; RV64I-NEXT: srliw a4, a4, 24
+; RV64I-NEXT: and a2, a2, a1
+; RV64I-NEXT: or a0, a0, a5
+; RV64I-NEXT: sltiu a4, a4, 127
+; RV64I-NEXT: or a0, a0, a2
+; RV64I-NEXT: or a0, a0, a3
+; RV64I-NEXT: and a1, a4, a1
+; RV64I-NEXT: or a0, a0, a1
+; RV64I-NEXT: ret
+ %cmp = call i1 @llvm.is.fpclass.f32(float %x, i32 639)
+ ret i1 %cmp
+}
+
+define i1 @is_nan_fpclass(float %x) {
+; RV32IF-LABEL: is_nan_fpclass:
+; RV32IF: # %bb.0:
+; RV32IF-NEXT: fclass.s a0, fa0
+; RV32IF-NEXT: andi a0, a0, 768
+; RV32IF-NEXT: snez a0, a0
+; RV32IF-NEXT: ret
+;
+; RV32IZFINX-LABEL: is_nan_fpclass:
+; RV32IZFINX: # %bb.0:
+; RV32IZFINX-NEXT: fclass.s a0, a0
+; RV32IZFINX-NEXT: andi a0, a0, 768
+; RV32IZFINX-NEXT: snez a0, a0
+; RV32IZFINX-NEXT: ret
+;
+; RV64IF-LABEL: is_nan_fpclass:
+; RV64IF: # %bb.0:
+; RV64IF-NEXT: fclass.s a0, fa0
+; RV64IF-NEXT: andi a0, a0, 768
+; RV64IF-NEXT: snez a0, a0
+; RV64IF-NEXT: ret
+;
+; RV64IZFINX-LABEL: is_nan_fpclass:
+; RV64IZFINX: # %bb.0:
+; RV64IZFINX-NEXT: fclass.s a0, a0
+; RV64IZFINX-NEXT: andi a0, a0, 768
+; RV64IZFINX-NEXT: snez a0, a0
+; RV64IZFINX-NEXT: ret
+;
+; RV32I-LABEL: is_nan_fpclass:
+; RV32I: # %bb.0:
+; RV32I-NEXT: slli a0, a0, 1
+; RV32I-NEXT: srli a0, a0, 1
+; RV32I-NEXT: lui a1, 522240
+; RV32I-NEXT: slt a0, a1, a0
+; RV32I-NEXT: ret
+;
+; RV64I-LABEL: is_nan_fpclass:
+; RV64I: # %bb.0:
+; RV64I-NEXT: slli a0, a0, 33
+; RV64I-NEXT: srli a0, a0, 33
+; RV64I-NEXT: lui a1, 522240
+; RV64I-NEXT: slt a0, a1, a0
+; RV64I-NEXT: ret
+ %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 3) ; nan
+ ret i1 %1
+}
+
+define i1 @is_qnan_fpclass(float %x) {
+; RV32IF-LABEL: is_qnan_fpclass:
+; RV32IF: # %bb.0:
+; RV32IF-NEXT: fclass.s a0, fa0
+; RV32IF-NEXT: srli a0, a0, 9
+; RV32IF-NEXT: ret
+;
+; RV32IZFINX-LABEL: is_qnan_fpclass:
+; RV32IZFINX: # %bb.0:
+; RV32IZFINX-NEXT: fclass.s a0, a0
+; RV32IZFINX-NEXT: srli a0, a0, 9
+; RV32IZFINX-NEXT: ret
+;
+; RV64IF-LABEL: is_qnan_fpclass:
+; RV64IF: # %bb.0:
+; RV64IF-NEXT: fclass.s a0, fa0
+; RV64IF-NEXT: srli a0, a0, 9
+; RV64IF-NEXT: ret
+;
+; RV64IZFINX-LABEL: is_qnan_fpclass:
+; RV64IZFINX: # %bb.0:
+; RV64IZFINX-NEXT: fclass.s a0, a0
+; RV64IZFINX-NEXT: srli a0, a0, 9
+; RV64IZFINX-NEXT: ret
+;
+; RV32I-LABEL: is_qnan_fpclass:
+; RV32I: # %bb.0:
+; RV32I-NEXT: slli a0, a0, 1
+; RV32I-NEXT: lui a1, 523264
+; RV32I-NEXT: srli a0, a0, 1
+; RV32I-NEXT: addi a1, a1, -1
+; RV32I-NEXT: slt a0, a1, a0
+; RV32I-NEXT: ret
+;
+; RV64I-LABEL: is_qnan_fpclass:
+; RV64I: # %bb.0:
+; RV64I-NEXT: slli a0, a0, 33
+; RV64I-NEXT: lui a1, 523264
+; RV64I-NEXT: srli a0, a0, 33
+; RV64I-NEXT: addiw a1, a1, -1
+; RV64I-NEXT: slt a0, a1, a0
+; RV64I-NEXT: ret
+ %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 2) ; qnan
+ ret i1 %1
+}
+
+define i1 @is_snan_fpclass(float %x) {
+; RV32IF-LABEL: is_snan_fpclass:
+; RV32IF: # %bb.0:
+; RV32IF-NEXT: fclass.s a0, fa0
+; RV32IF-NEXT: slli a0, a0, 23
+; RV32IF-NEXT: srli a0, a0, 31
+; RV32IF-NEXT: ret
+;
+; RV32IZFINX-LABEL: is_snan_fpclass:
+; RV32IZFINX: # %bb.0:
+; RV32IZFINX-NEXT: fclass.s a0, a0
+; RV32IZFINX-NEXT: slli a0, a0, 23
+; RV32IZFINX-NEXT: srli a0, a0, 31
+; RV32IZFINX-NEXT: ret
+;
+; RV64IF-LABEL: is_snan_fpclass:
+; RV64IF: # %bb.0:
+; RV64IF-NEXT: fclass.s a0, fa0
+; RV64IF-NEXT: slli a0, a0, 55
+; RV64IF-NEXT: srli a0, a0, 63
+; RV64IF-NEXT: ret
+;
+; RV64IZFINX-LABEL: is_snan_fpclass:
+; RV64IZFINX: # %bb.0:
+; RV64IZFINX-NEXT: fclass.s a0, a0
+; RV64IZFINX-NEXT: slli a0, a0, 55
+; RV64IZFINX-NEXT: srli a0, a0, 63
+; RV64IZFINX-NEXT: ret
+;
+; RV32I-LABEL: is_snan_fpclass:
+; RV32I: # %bb.0:
+; RV32I-NEXT: slli a0, a0, 1
+; RV32I-NEXT: lui a1, 523264
+; RV32I-NEXT: lui a2, 522240
+; RV32I-NEXT: srli a0, a0, 1
+; RV32I-NEXT: slt a1, a0, a1
+; RV32I-NEXT: slt a0, a2, a0
+; RV32I-NEXT: and a0, a0, a1
+; RV32I-NEXT: ret
+;
+; RV64I-LABEL: is_snan_fpclass:
+; RV64I: # %bb.0:
+; RV64I-NEXT: slli a0, a0, 33
+; RV64I-NEXT: lui a1, 523264
+; RV64I-NEXT: lui a2, 522240
+; RV64I-NEXT: srli a0, a0, 33
+; RV64I-NEXT: slt a1, a0, a1
+; RV64I-NEXT: slt a0, a2, a0
+; RV64I-NEXT: and a0, a0, a1
+; RV64I-NEXT: ret
+ %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 1) ; snan
+ ret i1 %1
+}
+
+define i1 @is_inf_fpclass(float %x) {
+; RV32IF-LABEL: is_inf_fpclass:
+; RV32IF: # %bb.0:
+; RV32IF-NEXT: fclass.s a0, fa0
+; RV32IF-NEXT: andi a0, a0, 129
+; RV32IF-NEXT: snez a0, a0
+; RV32IF-NEXT: ret
+;
+; RV32IZFINX-LABEL: is_inf_fpclass:
+; RV32IZFINX: # %bb.0:
+; RV32IZFINX-NEXT: fclass.s a0, a0
+; RV32IZFINX-NEXT: andi a0, a0, 129
+; RV32IZFINX-NEXT: snez a0, a0
+; RV32IZFINX-NEXT: ret
+;
+; RV64IF-LABEL: is_inf_fpclass:
+; RV64IF: # %bb.0:
+; RV64IF-NEXT: fclass.s a0, fa0
+; RV64IF-NEXT: andi a0, a0, 129
+; RV64IF-NEXT: snez a0, a0
+; RV64IF-NEXT: ret
+;
+; RV64IZFINX-LABEL: is_inf_fpclass:
+; RV64IZFINX: # %bb.0:
+; RV64IZFINX-NEXT: fclass.s a0, a0
+; RV64IZFINX-NEXT: andi a0, a0, 129
+; RV64IZFINX-NEXT: snez a0, a0
+; RV64IZFINX-NEXT: ret
+;
+; RV32I-LABEL: is_inf_fpclass:
+; RV32I: # %bb.0:
+; RV32I-NEXT: slli a0, a0, 1
+; RV32I-NEXT: srli a0, a0, 1
+; RV32I-NEXT: lui a1, 522240
+; RV32I-NEXT: xor a0, a0, a1
+; RV32I-NEXT: seqz a0, a0
+; RV32I-NEXT: ret
+;
+; RV64I-LABEL: is_inf_fpclass:
+; RV64I: # %bb.0:
+; RV64I-NEXT: slli a0, a0, 33
+; RV64I-NEXT: srli a0, a0, 33
+; RV64I-NEXT: lui a1, 522240
+; RV64I-NEXT: xor a0, a0, a1
+; RV64I-NEXT: seqz a0, a0
+; RV64I-NEXT: ret
+ %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 516) ; 0x204 = "inf"
+ ret i1 %1
+}
+
+define i1 @is_posinf_fpclass(float %x) {
+; RV32IF-LABEL: is_posinf_fpclass:
+; RV32IF: # %bb.0:
+; RV32IF-NEXT: fclass.s a0, fa0
+; RV32IF-NEXT: slli a0, a0, 24
+; RV32IF-NEXT: srli a0, a0, 31
+; RV32IF-NEXT: ret
+;
+; RV32IZFINX-LABEL: is_posinf_fpclass:
+; RV32IZFINX: # %bb.0:
+; RV32IZFINX-NEXT: fclass.s a0, a0
+; RV32IZFINX-NEXT: slli a0, a0, 24
+; RV32IZFINX-NEXT: srli a0, a0, 31
+; RV32IZFINX-NEXT: ret
+;
+; RV64IF-LABEL: is_posinf_fpclass:
+; RV64IF: # %bb.0:
+; RV64IF-NEXT: fclass.s a0, fa0
+; RV64IF-NEXT: slli a0, a0, 56
+; RV64IF-NEXT: srli a0, a0, 63
+; RV64IF-NEXT: ret
+;
+; RV64IZFINX-LABEL: is_posinf_fpclass:
+; RV64IZFINX: # %bb.0:
+; RV64IZFINX-NEXT: fclass.s a0, a0
+; RV64IZFINX-NEXT: slli a0, a0, 56
+; RV64IZFINX-NEXT: srli a0, a0, 63
+; RV64IZFINX-NEXT: ret
+;
+; RV32I-LABEL: is_posinf_fpclass:
+; RV32I: # %bb.0:
+; RV32I-NEXT: lui a1, 522240
+; RV32I-NEXT: xor a0, a0, a1
+; RV32I-NEXT: seqz a0, a0
+; RV32I-NEXT: ret
+;
+; RV64I-LABEL: is_posinf_fpclass:
+; RV64I: # %bb.0:
+; RV64I-NEXT: sext.w a0, a0
+; RV64I-NEXT: lui a1, 522240
+; RV64I-NEXT: xor a0, a0, a1
+; RV64I-NEXT: seqz a0, a0
+; RV64I-NEXT: ret
+ %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 512) ; 0x200 = "+inf"
+ ret i1 %1
+}
+
+define i1 @is_neginf_fpclass(float %x) {
+; RV32IF-LABEL: is_neginf_fpclass:
+; RV32IF: # %bb.0:
+; RV32IF-NEXT: fclass.s a0, fa0
+; RV32IF-NEXT: andi a0, a0, 1
+; RV32IF-NEXT: ret
+;
+; RV32IZFINX-LABEL: is_neginf_fpclass:
+; RV32IZFINX: # %bb.0:
+; RV32IZFINX-NEXT: fclass.s a0, a0
+; RV32IZFINX-NEXT: andi a0, a0, 1
+; RV32IZFINX-NEXT: ret
+;
+; RV64IF-LABEL: is_neginf_fpclass:
+; RV64IF: # %bb.0:
+; RV64IF-NEXT: fclass.s a0, fa0
+; RV64IF-NEXT: andi a0, a0, 1
+; RV64IF-NEXT: ret
+;
+; RV64IZFINX-LABEL: is_neginf_fpclass:
+; RV64IZFINX: # %bb.0:
+; RV64IZFINX-NEXT: fclass.s a0, a0
+; RV64IZFINX-NEXT: andi a0, a0, 1
+; RV64IZFINX-NEXT: ret
+;
+; RV32I-LABEL: is_neginf_fpclass:
+; RV32I: # %bb.0:
+; RV32I-NEXT: lui a1, 1046528
+; RV32I-NEXT: xor a0, a0, a1
+; RV32I-NEXT: seqz a0, a0
+; RV32I-NEXT: ret
+;
+; RV64I-LABEL: is_neginf_fpclass:
+; RV64I: # %bb.0:
+; RV64I-NEXT: sext.w a0, a0
+; RV64I-NEXT: lui a1, 1046528
+; RV64I-NEXT: xor a0, a0, a1
+; RV64I-NEXT: seqz a0, a0
+; RV64I-NEXT: ret
+ %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 4) ; "-inf"
+ ret i1 %1
+}
+
+define i1 @is_finite_fpclass(float %x) {
+; RV32IF-LABEL: is_finite_fpclass:
+; RV32IF: # %bb.0:
+; RV32IF-NEXT: fclass.s a0, fa0
+; RV32IF-NEXT: andi a0, a0, 126
+; RV32IF-NEXT: snez a0, a0
+; RV32IF-NEXT: ret
+;
+; RV32IZFINX-LABEL: is_finite_fpclass:
+; RV32IZFINX: # %bb.0:
+; RV32IZFINX-NEXT: fclass.s a0, a0
+; RV32IZFINX-NEXT: andi a0, a0, 126
+; RV32IZFINX-NEXT: snez a0, a0
+; RV32IZFINX-NEXT: ret
+;
+; RV64IF-LABEL: is_finite_fpclass:
+; RV64IF: # %bb.0:
+; RV64IF-NEXT: fclass.s a0, fa0
+; RV64IF-NEXT: andi a0, a0, 126
+; RV64IF-NEXT: snez a0, a0
+; RV64IF-NEXT: ret
+;
+; RV64IZFINX-LABEL: is_finite_fpclass:
+; RV64IZFINX: # %bb.0:
+; RV64IZFINX-NEXT: fclass.s a0, a0
+; RV64IZFINX-NEXT: andi a0, a0, 126
+; RV64IZFINX-NEXT: snez a0, a0
+; RV64IZFINX-NEXT: ret
+;
+; RV32I-LABEL: is_finite_fpclass:
+; RV32I: # %bb.0:
+; RV32I-NEXT: slli a0, a0, 1
+; RV32I-NEXT: srli a0, a0, 1
+; RV32I-NEXT: lui a1, 522240
+; RV32I-NEXT: slt a0, a0, a1
+; RV32I-NEXT: ret
+;
+; RV64I-LABEL: is_finite_fpclass:
+; RV64I: # %bb.0:
+; RV64I-NEXT: slli a0, a0, 33
+; RV64I-NEXT: srli a0, a0, 33
+; RV64I-NEXT: lui a1, 522240
+; RV64I-NEXT: slt a0, a0, a1
+; RV64I-NEXT: ret
+ %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 504) ; 0x1f8 = "finite"
+ ret i1 %1
+}
+
+define i1 @is_posfinite_fpclass(float %x) {
+; RV32IF-LABEL: is_posfinite_fpclass:
+; RV32IF: # %bb.0:
+; RV32IF-NEXT: fclass.s a0, fa0
+; RV32IF-NEXT: andi a0, a0, 112
+; RV32IF-NEXT: snez a0, a0
+; RV32IF-NEXT: ret
+;
+; RV32IZFINX-LABEL: is_posfinite_fpclass:
+; RV32IZFINX: # %bb.0:
+; RV32IZFINX-NEXT: fclass.s a0, a0
+; RV32IZFINX-NEXT: andi a0, a0, 112
+; RV32IZFINX-NEXT: snez a0, a0
+; RV32IZFINX-NEXT: ret
+;
+; RV64IF-LABEL: is_posfinite_fpclass:
+; RV64IF: # %bb.0:
+; RV64IF-NEXT: fclass.s a0, fa0
+; RV64IF-NEXT: andi a0, a0, 112
+; RV64IF-NEXT: snez a0, a0
+; RV64IF-NEXT: ret
+;
+; RV64IZFINX-LABEL: is_posfinite_fpclass:
+; RV64IZFINX: # %bb.0:
+; RV64IZFINX-NEXT: fclass.s a0, a0
+; RV64IZFINX-NEXT: andi a0, a0, 112
+; RV64IZFINX-NEXT: snez a0, a0
+; RV64IZFINX-NEXT: ret
+;
+; RV32I-LABEL: is_posfinite_fpclass:
+; RV32I: # %bb.0:
+; RV32I-NEXT: srli a0, a0, 23
+; RV32I-NEXT: sltiu a0, a0, 255
+; RV32I-NEXT: ret
+;
+; RV64I-LABEL: is_posfinite_fpclass:
+; RV64I: # %bb.0:
+; RV64I-NEXT: srliw a0, a0, 23
+; RV64I-NEXT: sltiu a0, a0, 255
+; RV64I-NEXT: ret
+ %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 448) ; 0x1c0 = "+finite"
+ ret i1 %1
+}
+
+define i1 @is_negfinite_fpclass(float %x) {
+; RV32IF-LABEL: is_negfinite_fpclass:
+; RV32IF: # %bb.0:
+; RV32IF-NEXT: fclass.s a0, fa0
+; RV32IF-NEXT: andi a0, a0, 14
+; RV32IF-NEXT: snez a0, a0
+; RV32IF-NEXT: ret
+;
+; RV32IZFINX-LABEL: is_negfinite_fpclass:
+; RV32IZFINX: # %bb.0:
+; RV32IZFINX-NEXT: fclass.s a0, a0
+; RV32IZFINX-NEXT: andi a0, a0, 14
+; RV32IZFINX-NEXT: snez a0, a0
+; RV32IZFINX-NEXT: ret
+;
+; RV64IF-LABEL: is_negfinite_fpclass:
+; RV64IF: # %bb.0:
+; RV64IF-NEXT: fclass.s a0, fa0
+; RV64IF-NEXT: andi a0, a0, 14
+; RV64IF-NEXT: snez a0, a0
+; RV64IF-NEXT: ret
+;
+; RV64IZFINX-LABEL: is_negfinite_fpclass:
+; RV64IZFINX: # %bb.0:
+; RV64IZFINX-NEXT: fclass.s a0, a0
+; RV64IZFINX-NEXT: andi a0, a0, 14
+; RV64IZFINX-NEXT: snez a0, a0
+; RV64IZFINX-NEXT: ret
+;
+; RV32I-LABEL: is_negfinite_fpclass:
+; RV32I: # %bb.0:
+; RV32I-NEXT: slli a1, a0, 1
+; RV32I-NEXT: lui a2, 522240
+; RV32I-NEXT: srli a1, a1, 1
+; RV32I-NEXT: slt a1, a1, a2
+; RV32I-NEXT: slti a0, a0, 0
+; RV32I-NEXT: and a0, a1, a0
+; RV32I-NEXT: ret
+;
+; RV64I-LABEL: is_negfinite_fpclass:
+; RV64I: # %bb.0:
+; RV64I-NEXT: sext.w a1, a0
+; RV64I-NEXT: slli a0, a0, 33
+; RV64I-NEXT: lui a2, 522240
+; RV64I-NEXT: srli a0, a0, 33
+; RV64I-NEXT: slt a0, a0, a2
+; RV64I-NEXT: slti a1, a1, 0
+; RV64I-NEXT: and a0, a0, a1
+; RV64I-NEXT: ret
+ %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 56) ; 0x38 = "-finite"
+ ret i1 %1
+}
+
+define i1 @is_notfinite_fpclass(float %x) {
+; RV32IF-LABEL: is_notfinite_fpclass:
+; RV32IF: # %bb.0:
+; RV32IF-NEXT: fclass.s a0, fa0
+; RV32IF-NEXT: andi a0, a0, 897
+; RV32IF-NEXT: snez a0, a0
+; RV32IF-NEXT: ret
+;
+; RV32IZFINX-LABEL: is_notfinite_fpclass:
+; RV32IZFINX: # %bb.0:
+; RV32IZFINX-NEXT: fclass.s a0, a0
+; RV32IZFINX-NEXT: andi a0, a0, 897
+; RV32IZFINX-NEXT: snez a0, a0
+; RV32IZFINX-NEXT: ret
+;
+; RV64IF-LABEL: is_notfinite_fpclass:
+; RV64IF: # %bb.0:
+; RV64IF-NEXT: fclass.s a0, fa0
+; RV64IF-NEXT: andi a0, a0, 897
+; RV64IF-NEXT: snez a0, a0
+; RV64IF-NEXT: ret
+;
+; RV64IZFINX-LABEL: is_notfinite_fpclass:
+; RV64IZFINX: # %bb.0:
+; RV64IZFINX-NEXT: fclass.s a0, a0
+; RV64IZFINX-NEXT: andi a0, a0, 897
+; RV64IZFINX-NEXT: snez a0, a0
+; RV64IZFINX-NEXT: ret
+;
+; RV32I-LABEL: is_notfinite_fpclass:
+; RV32I: # %bb.0:
+; RV32I-NEXT: slli a0, a0, 1
+; RV32I-NEXT: lui a1, 522240
+; RV32I-NEXT: srli a0, a0, 1
+; RV32I-NEXT: addi a1, a1, -1
+; RV32I-NEXT: slt a0, a1, a0
+; RV32I-NEXT: ret
+;
+; RV64I-LABEL: is_notfinite_fpclass:
+; RV64I: # %bb.0:
+; RV64I-NEXT: slli a0, a0, 33
+; RV64I-NEXT: lui a1, 522240
+; RV64I-NEXT: srli a0, a0, 33
+; RV64I-NEXT: addiw a1, a1, -1
+; RV64I-NEXT: slt a0, a1, a0
+; RV64I-NEXT: ret
+ %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 519) ; 0x207 = "inf|nan"
+ ret i1 %1
+}
+
+define ...
[truncated]
You can test this locally with the following command:

git-clang-format --diff 17b3dd03a05dfa938aacd57027189271a62e2fda 0dff7fbbad04e4d711c9beb84e227c81137f2c5d --extensions cpp -- llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp

View the diff from clang-format here:

diff --git a/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp b/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp
index 9bee2ff259..0416f315fe 100644
--- a/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp
+++ b/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp
@@ -20,8 +20,8 @@
#include "llvm/CodeGen/TargetPassConfig.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/IRBuilder.h"
-#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/InstVisitor.h"
+#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/PatternMatch.h"
#include "llvm/InitializePasses.h"
// The 'fcmp uno/ord/oeq/une/ueq/one/ogt/oge/olt/ole x, 0.0' instructions are
// equivalent to an FP class test. If the fcmp instruction would be custom
This isn't true depending on denormal handling and fp exceptions
//
// This basically reverts the transformations of
// InstCombinerImpl::foldIntrinsicIsFPClass.
bool RISCVCodeGenPrepare::visitFCmpInst(FCmpInst &Fcmp) {
CodeGenPrepare already has this transform
  auto LegalizeTypeAction = TLI->getTypeAction(Fcmp.getContext(), VT);
  auto OperationAction = TLI->getOperationAction(ISDOpcode, VT);
  if ((LegalizeTypeAction != TargetLoweringBase::TypeSoftenFloat &&
       LegalizeTypeAction != TargetLoweringBase::TypeSoftPromoteHalf) ||
      OperationAction == TargetLowering::Custom)
return false; |
This level of logic really belongs directly in the legalizer
Ok, I understand. Moving it to the legalizer. @topperc is it ok to implement it there?
Actually, the logic in CodeGenPrepare uses `TargetLoweringBase::isFAbsFree`. When I started to implement this, I was wondering whether there should be a similar function for `FCmp`, and whether the whole thing should go into CodeGenPrepare instead. Is this a valid approach, or is the legalizer the right place to do it?
I don't really like having it in codegenprepare in the first place. It really belongs in some combination of DAGCombiner or legalizer, depending on the purpose. The only nice thing is codegenprepare has access to better utilities, like an existing fcmpToClassTest helper and computeKnownFPClass. In principle those should be reimplemented in codegen
Ok. Regarding `InstCombinerImpl::foldIntrinsicIsFPClass`: for a back-end where lowering the `fcmp` is not cheap, why is it profitable to do this transformation? It is done unconditionally, as far as I can see.
fcmp is a better canonical form. More code will always understand fcmp than is.fpclass.
fcmp is not cheap
This is certainly not universally true, and I would say is not the common case. If the target wants something else, that's for the backend to undo for its preferred form.
FYI I did a similar transformation in CodeGenPrepare: #81572.
In CodeGenPrepare it is only for
I mean you can handle your motivating case in CGP.