Buggy optimization of vfmaddcsh intrinsics #98306
@llvm/issue-subscribers-backend-x86 Author: Sayantan Chakraborty (sayantn)
The `llvm.x86.avx512fp16.maskz.vfmadd.csh` intrinsic (and, because of that, `_mm_maskz_fmadd_sch`) is being incorrectly optimized. This code snippet reproduces it:

```c
#include <immintrin.h>
#include <stdio.h>

int main() {
    __m128h a, b, c, r;
    _Float16 array[8];

    a = _mm_setr_ph(0.0, 1.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0);
    b = _mm_setr_ph(0.0, 2.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0);
    c = _mm_setr_ph(0.0, 3.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0);

    /* Zero-masking variant called with mask 0 (bit 0 clear). */
    r = _mm_maskz_fmadd_sch(0, a, b, c);
    _mm_storeu_ph(array, r);

    for (int i = 0; i < 8; i++) {
        printf("%f\n", (float) array[i]);
    }
    return 0;
}
```
phoebewang added a commit that referenced this issue (Nov 29, 2024)
sayantn added a commit to sayantn/stdarch that referenced this issue (Apr 13, 2025)
sayantn added a commit to sayantn/stdarch that referenced this issue (Apr 13, 2025)
sayantn added a commit to sayantn/stdarch that referenced this issue (Apr 21, 2025)
sayantn added a commit to sayantn/stdarch that referenced this issue (Apr 21, 2025)
The `llvm.x86.avx512fp16.maskz.vfmadd.csh` intrinsic (and, because of that, `_mm_maskz_fmadd_sch`) is being incorrectly optimized. For the code snippet above, `clang` produces different output in unoptimized and optimized builds. The unoptimized output is the correct one according to Intel. `gcc` gives the correct output in both.

System specification:
mingw-w64-x86_64-gcc 14.2.0
mingw-w64-x86_64-clang 18.1.8
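
For reference, here is a minimal scalar sketch of what `_mm_maskz_fmadd_sch` is expected to compute, based on my reading of Intel's description of the intrinsic: the low complex number (elements 0 and 1) is either `a * b + c` or zero depending on mask bit 0, and the upper six elements are copied from `a`. The function name `maskz_fmadd_sch_model` is hypothetical, and `float` stands in for `_Float16` so the sketch builds without AVX512-FP16 support. That means rounding can differ from the real instruction, so treat it as an illustration of the lane layout rather than a bit-exact model.

```c
#include <stdint.h>

/* Hypothetical scalar model of _mm_maskz_fmadd_sch (assumed semantics,
   taken from Intel's description of the intrinsic).
   A complex number is stored as re = v[0], im = v[1]. */
void maskz_fmadd_sch_model(uint8_t k, const float a[8], const float b[8],
                           const float c[8], float dst[8]) {
    if (k & 1) {
        /* Mask bit 0 set: low complex element is a * b + c. */
        dst[0] = a[0] * b[0] - a[1] * b[1] + c[0];
        dst[1] = a[1] * b[0] + a[0] * b[1] + c[1];
    } else {
        /* Mask bit 0 clear: zero-masking clears the low complex element. */
        dst[0] = 0.0f;
        dst[1] = 0.0f;
    }
    /* Upper six elements are passed through from a. */
    for (int i = 2; i < 8; i++) {
        dst[i] = a[i];
    }
}
```

Under these assumptions, feeding the reproducer's inputs with mask `0` gives zeros in the first two elements followed by the upper six elements of `a` (10 through 15).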