[x86] suboptimal codegen for isfinite IR #27538

Closed
rotateright opened this issue Mar 31, 2016 · 6 comments
Labels: bugzilla, floating-point, good first issue, llvm:codegen, missed-optimization

Comments

@rotateright
Contributor

rotateright commented Mar 31, 2016

Bugzilla Link 27164
Version trunk
OS All
CC @hfinkel,@RKSimon

Extended Description

define i1 @is_finite(float %x) {
  %1 = tail call float @llvm.fabs.f32(float %x)
  %2 = fcmp one float %1, 0x7FF0000000000000 ; ordered and not equal
  ret i1 %2
}
declare float @llvm.fabs.f32(float)
$ ./llc -o - isfinite.ll

LCPI0_0:
	.long	2147483647              ## 0x7fffffff
	.long	2147483647              ## 0x7fffffff
	.long	2147483647              ## 0x7fffffff
	.long	2147483647              ## 0x7fffffff
	.section	__TEXT,__literal4,4byte_literals
	.p2align	2
LCPI0_1:
	.long	2139095040              ## float +Inf
	.section	__TEXT,__text,regular,pure_instructions
	.globl _is_finite
	.p2align	4, 0x90
_is_finite:                
	.cfi_startproc
## BB#0:
	andps	LCPI0_0(%rip), %xmm0
	ucomiss	LCPI0_1(%rip), %xmm0
	setne	%al
	retq

Note: 2139095040 = 0x7f800000 (check if the exponent is maxed)

I think this can be reduced to "andnps" with that constant followed by a ucomiss against zero, which saves a load.

Alternatively, we could bring the FP value into an int register and do the bitwise comparison there. If we have BMI, it could be something like:

movd %xmm0, %eax
andn (load bitmask), %eax, %eax
setne %al   ## andn sets ZF; if not all exponent bits were set, the value is finite

This only needs a scalar load and no explicit compare instruction is needed.
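
As a cross-check of the bit-level reasoning above, here is a minimal C sketch (not LLVM's actual lowering) that compares the current "clear sign, compare against +Inf" form with the proposed and-not form. The helper names `float_bits`, `bits_float`, `is_finite_cmp`, and `is_finite_andn` are made up for illustration, and the sketch assumes `float` is IEEE-754 binary32.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: reinterpret between float and its 32-bit pattern. */
static uint32_t float_bits(float x) {
    uint32_t u;
    assert(sizeof u == sizeof x); /* assumption: 32-bit IEEE-754 float */
    memcpy(&u, &x, sizeof u);
    return u;
}

static float bits_float(uint32_t u) {
    float x;
    memcpy(&x, &u, sizeof x);
    return x;
}

/* Mirrors the current codegen: andps with 0x7fffffff (fabs), then an
 * "ordered and not equal" compare against +Inf. */
static int is_finite_cmp(float x) {
    float a = bits_float(float_bits(x) & 0x7fffffffu);
    return a < INFINITY || a > INFINITY; /* both false for NaN (unordered) */
}

/* Mirrors the andn idea: the value is finite iff at least one exponent
 * bit is clear, i.e. (~bits & exponent_mask) is nonzero. */
static int is_finite_andn(float x) {
    const uint32_t exp_mask = 0x7f800000u; /* 2139095040 */
    return (~float_bits(x) & exp_mask) != 0;
}
```

Both functions should agree for every input, including zeros, subnormals, infinities, and NaNs, which is what makes the and-not form a candidate replacement.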

@rotateright
Contributor Author

> This only needs a scalar load and no explicit compare instruction is needed.

On second thought, no loads should be needed at all. The mask can be an immediate constant placed in an integer register via mov.

@rotateright
Contributor Author

rotateright commented Apr 3, 2016

Reminded by:
http://reviews.llvm.org/D18741

Because this is x86, there's a 3rd and possibly 4th alternative that might be optimal for a given uarch:

andnps [mask], %xmm0    <--- note: mask must be appropriate for a scalar op...
cmpeqss [mask], %xmm0
movmskps %xmm0, %eax    <--- because we want this to be strictly 0 or 1

Or try this in the vector integer domain:

pandn [mask], %xmm0
pcmpeqd [mask], %xmm0
pmovmskb %xmm0, %eax    <--- there's nothing but a byte variant for integers?

I.e., there are many ways to get an FP comparison result over to an integer register.

@llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
@arsenm added the llvm:codegen, floating-point, and good first issue labels Aug 5, 2023
@llvmbot
Member

llvmbot commented Aug 5, 2023

Hi!

This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:

  1. Assign the issue to you.
  2. Fix the issue locally.
  3. Run the test suite locally.
    3.1) Remember that the subdirectories under test/ create fine-grained testing targets, so you can e.g. use make check-clang-ast to only run Clang's AST tests.
  4. Create a git commit
  5. Run git clang-format HEAD~1 to format your changes.
  6. Submit the patch to Phabricator.
    6.1) Detailed instructions can be found here

For more instructions on how to submit a patch to LLVM, see our documentation.

If you have any further questions about this issue, don't hesitate to ask via a comment on this Github issue.

@llvm/issue-subscribers-good-first-issue

@tuliom self-assigned this Oct 20, 2023
@tuliom
Contributor

tuliom commented Oct 20, 2023

@7flying is starting to work on this and I'm mentoring.

@RKSimon
Collaborator

RKSimon commented Feb 3, 2025

define i1 @is_finite(float %x) {
  %1 = tail call float @llvm.fabs.f32(float %x)
  %2 = fcmp one float %1, 0x7FF0000000000000 ; ordered and not equal
  ret i1 %2
}
declare float @llvm.fabs.f32(float)

CodeGenPrepare now folds this to:

define i1 @is_finite(float %x) {
  %1 = call i1 @llvm.is.fpclass.f32(float %x, i32 504)
  ret i1 %1
}

resulting in:

is_finite:
  vmovd %xmm0, %eax
  andl $2147483647, %eax
  cmpl $2139095040, %eax
  setl %al
  retq
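
To unpack the constants in that output: the mask 504 passed to `llvm.is.fpclass` is the union of the six finite classes (zero, subnormal, and normal, each with either sign), and the x86 sequence clears the sign bit and then does a signed integer compare against the +Inf bit pattern. A rough C sketch follows. The `FC_*` enum only restates the intrinsic's test-mask bit layout for illustration, the names `from_bits` and `is_finite_bits` are hypothetical, and `float` is assumed to be IEEE-754 binary32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Test-mask bits of llvm.is.fpclass, restated here for illustration. */
enum {
    FC_SNAN          = 1 << 0,
    FC_QNAN          = 1 << 1,
    FC_NEG_INF       = 1 << 2,
    FC_NEG_NORMAL    = 1 << 3,
    FC_NEG_SUBNORMAL = 1 << 4,
    FC_NEG_ZERO      = 1 << 5,
    FC_POS_ZERO      = 1 << 6,
    FC_POS_SUBNORMAL = 1 << 7,
    FC_POS_NORMAL    = 1 << 8,
    FC_POS_INF       = 1 << 9,
    /* Everything except NaNs and infinities: 8+16+32+64+128+256 = 504. */
    FC_FINITE = FC_NEG_NORMAL | FC_NEG_SUBNORMAL | FC_NEG_ZERO |
                FC_POS_ZERO | FC_POS_SUBNORMAL | FC_POS_NORMAL
};

/* Hypothetical helper: build a float from a raw 32-bit pattern. */
static float from_bits(uint32_t u) {
    float x;
    assert(sizeof u == sizeof x); /* assumption: 32-bit IEEE-754 float */
    memcpy(&x, &u, sizeof x);
    return x;
}

/* Mirrors the emitted sequence: vmovd, andl, cmpl, setl. */
static int is_finite_bits(float x) {
    uint32_t u;
    memcpy(&u, &x, sizeof u);                      /* vmovd %xmm0, %eax     */
    int32_t abs_bits = (int32_t)(u & 2147483647u); /* andl $0x7fffffff      */
    return abs_bits < 2139095040;                  /* cmpl $0x7f800000; setl */
}
```

After the sign bit is cleared, infinities have exactly the pattern 0x7f800000 and NaNs have a larger one, so a single less-than compare rejects both without any constant-pool load.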

@tuliom removed their assignment Feb 3, 2025
@RKSimon
Collaborator

RKSimon commented Feb 4, 2025

Fixed by #81572
