[clang] -O2 requires -ftrapping-math or -ffast-math on amd64 to avoid false positives with feenableexcept(FE_DIVBYZERO) #118265
Comments
@andykaylor here: you said:
Does clang also apply some implicit optimizations to generated intrinsics as well (like By generated intrinsics I mean clang deciding to use
This is something I found when tracking down #118152 on my end by fiddling with compiler flags: I was hoping that using But I found this instead.
So @slipher pointed out to me the usage of
The quote is from: https://clang.llvm.org/docs/UsersManual.html It means the default behavior makes But since it's a glibc function, maybe clang cannot do anything about it…
This code doesn't have FENV_ACCESS on for most of the flags you demonstrated, meaning there is no guarantee that feenableexcept works or that exceptions are observed. We REALLY need to have a warning if any of these functions are used without it enabled.
In that comment I was referring to the target-specific intrinsics declared in immintrin.h and the related headers that it includes. The use of divss is something else. That's just an implementation detail determined by the backend. The problem you're running into here is that by default clang does not allow access to the floating-point environment. Unless you have done something to enable access to the floating-point environment (such as This isn't really specific to SSE. The same assumptions are made for any target, but it may be exposed in different ways on different targets.
You need to use
Oh! That's good to know! I have seen this pragma used in some macOS examples, but most Linux/BSD documentation seems not to mention it (example, example, example). A warning would help a lot indeed. I guess I got all the answers I needed.
gcc doesn't support the
Oh, that explains why the documentation I found mentioning this pragma was about macOS. So the command-line flags that achieve the same effect would be
There are a few other flags that also have the practical effect of
Hi, using this sample code:
By compiling it with -O2, the compiled code raises a division by zero exception when executed:

My guess is that the SSE-optimized code raises division by zero exceptions from unused lanes of the xmm registers, even though those results are discarded.
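The guess above can be illustrated with a small sketch, which is not the sample code from this report. It assumes an x86 target with SSE and uses the xmmintrin.h intrinsics directly; the function name packed_div_sets_divbyzero_flag is hypothetical. Only lane 0 of the packed division is meaningful, yet the zero divisors in the other lanes still set the divide-by-zero flag in MXCSR. The volatile arrays are there to keep the compiler from folding the division; the example is meant to be built without aggressive floating-point optimization.

```c
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper: returns 1 if a packed division whose only meaningful
   lane is 1.0f / 1.0f still sets the divide-by-zero flag because of the
   other lanes, 0 if it does not, and -1 if SSE is not available. */
int packed_div_sets_divbyzero_flag(void)
{
#if defined(__SSE__)
    /* volatile forces the values to be loaded at run time */
    volatile float num[4] = {1.0f, 1.0f, 1.0f, 1.0f};
    volatile float den[4] = {1.0f, 0.0f, 0.0f, 0.0f};

    /* Clear the sticky exception flags in MXCSR. */
    _mm_setcsr(_mm_getcsr() & ~(unsigned)_MM_EXCEPT_MASK);

    /* _mm_set_ps takes lanes from highest to lowest; lane 0 is last. */
    __m128 n = _mm_set_ps(num[3], num[2], num[1], num[0]);
    __m128 d = _mm_set_ps(den[3], den[2], den[1], den[0]);
    __m128 q = _mm_div_ps(n, d);      /* divides all four lanes */

    /* Only lane 0 is used; lanes 1 to 3 are discarded. */
    volatile float lane0 = _mm_cvtss_f32(q);

    unsigned csr = _mm_getcsr();
    return lane0 == 1.0f && (csr & _MM_EXCEPT_DIV_ZERO) != 0;
#else
    return -1;
#endif
}
```

Here the exception is masked, so only the flag is set. If the divide-by-zero exception were unmasked first, for example with feenableexcept(FE_DIVBYZERO), the same packed division would be expected to trap even though the offending lanes are never read.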
It is curious that -O2 requires either -ftrapping-math or -ffast-math, since -ffast-math sets -fno-trapping-math.

Also, I thought -ftrapping-math was the default, and that -fno-trapping-math is part of -ffast-math, but just using -O2 behaves as if -fno-trapping-math were used, so it behaves as if part of -ffast-math was enabled anyway… But maybe -ffast-math just modifies other behaviors that make -ftrapping-math or -fno-trapping-math irrelevant.

Using Godbolt's Compiler Explorer (here Clang 19.1.0), it only works with -O2 -ffast-math (or the deprecated -Ofast):
-O0
-Os
-Os -fno-trapping-math
-Os -ftrapping-math
-O1
-O2
-O2 -fno-trapping-math
-O2 -ftrapping-math
-ffast-math
-O2 -ffast-math
-O2 -ffast-math -ftrapping-math
-O2 -ffast-math -fno-trapping-math
-Ofast
See: https://godbolt.org/z/395c8nMef
And (with -ftrapping-math added): https://godbolt.org/z/czbGTjs6E
And (with -fno-trapping-math added): https://godbolt.org/z/zYr5Pba3W

With 32-bit i686 I reproduce the bug when using SSE but get no bug when not using SSE.
-m32 -msse
-m32 -mno-sse
-Os
-O1
-O2
-ffast-math
-O2 -ffast-math
With 32-bit i686 and SSE I get the same failure: https://godbolt.org/z/199eqhzGh
With a 32-bit i686 build without SSE I get no failure: https://godbolt.org/z/sfd9TsTj4
On my end, on Ubuntu 24.04 with amd64, I get the same results with clang 19.1.5.
On Ubuntu 24.04 with amd64 and different versions of the clang compiler, I only get it working with clang 13 and 14, every later version breaks it:
-O0
-O1
-O2
-O2 -ftrapping-math
-ffast-math
-O2 -ffast-math
On GCC 14.02 I get none of those issues (no false positive division by zero error is raised, whatever compiler flags are used): https://godbolt.org/z/rxMWM68Mc

Note: Disabling SSE on amd64 just produces garbage computation (1.0/1.0 gives 0.0), but I don't know if it makes sense to disable SSE on amd64. GCC produces the same garbage (1.0/1.0 giving 0.0). I'm surprised not to get a warning if that's not a legitimate thing to do, and I'm also surprised that the generated code runs at all. See: https://godbolt.org/z/77PYGTc5b (Clang) and: https://godbolt.org/z/cvW4Er3Kd (GCC).