[IR] Allow fast math flags on fptrunc and fpext #115894
This consists of:
* Make these instructions part of FPMathOperator.
* Adjust the bitcode/IR readers and writers to expect fast math flags on these instructions.
* Make IRBuilder set the fast math flags on these instructions.
* Adjust InstCombinerImpl::visitFPTrunc: currently the fast math flags on all of the output instructions are set from the source operation; instead, each output instruction should get the fast math flags from its corresponding input instruction.
* Update LangRef and release notes.
* Update a bunch of tests.
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-amdgpu

Author: John Brawn (john-brawn-arm)
Patch is 95.24 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/115894.diff 15 Files Affected:
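The effect of the change can be illustrated with a small standalone IR sketch (function and value names here are hypothetical, not taken from the patch); after this PR, fast-math flags are accepted on `fptrunc` and `fpext` just as on other FPMathOperator instructions:

```llvm
define half @example(float %x) {
  ; fast-math flags may now appear on fpext and fptrunc
  %t = fptrunc nnan ninf float %x to half
  %e = fpext fast half %t to float
  %r = fptrunc reassoc nnan ninf nsz arcp afn float %e to half
  ret half %r
}
```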
diff --git a/clang/test/CodeGen/X86/cx-complex-range.c b/clang/test/CodeGen/X86/cx-complex-range.c
index a0e6dc219b36f7..f87091427df713 100644
--- a/clang/test/CodeGen/X86/cx-complex-range.c
+++ b/clang/test/CodeGen/X86/cx-complex-range.c
@@ -266,14 +266,14 @@
// BASIC_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
// BASIC_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[A]], i32 0, i32 1
// BASIC_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
-// BASIC_FAST-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
-// BASIC_FAST-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// BASIC_FAST-NEXT: [[EXT:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_REAL]] to float
+// BASIC_FAST-NEXT: [[EXT1:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_IMAG]] to float
// BASIC_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 0
// BASIC_FAST-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
// BASIC_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 1
// BASIC_FAST-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
-// BASIC_FAST-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float
-// BASIC_FAST-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// BASIC_FAST-NEXT: [[EXT2:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_REAL]] to float
+// BASIC_FAST-NEXT: [[EXT3:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_IMAG]] to float
// BASIC_FAST-NEXT: [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT2]]
// BASIC_FAST-NEXT: [[TMP1:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT3]]
// BASIC_FAST-NEXT: [[TMP2:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP0]], [[TMP1]]
@@ -285,8 +285,8 @@
// BASIC_FAST-NEXT: [[TMP8:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[TMP6]], [[TMP7]]
// BASIC_FAST-NEXT: [[TMP9:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP2]], [[TMP5]]
// BASIC_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP8]], [[TMP5]]
-// BASIC_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP9]] to half
-// BASIC_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[TMP10]] to half
+// BASIC_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[TMP9]] to half
+// BASIC_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[TMP10]] to half
// BASIC_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 0
// BASIC_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 1
// BASIC_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
@@ -307,22 +307,22 @@
// FULL_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
// FULL_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[A]], i32 0, i32 1
// FULL_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
-// FULL_FAST-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
-// FULL_FAST-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// FULL_FAST-NEXT: [[EXT:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_REAL]] to float
+// FULL_FAST-NEXT: [[EXT1:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_IMAG]] to float
// FULL_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 0
// FULL_FAST-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
// FULL_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 1
// FULL_FAST-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
-// FULL_FAST-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float
-// FULL_FAST-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// FULL_FAST-NEXT: [[EXT2:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_REAL]] to float
+// FULL_FAST-NEXT: [[EXT3:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_IMAG]] to float
// FULL_FAST-NEXT: [[CALL:%.*]] = call reassoc nnan ninf nsz arcp afn nofpclass(nan inf) <2 x float> @__divsc3(float noundef nofpclass(nan inf) [[EXT]], float noundef nofpclass(nan inf) [[EXT1]], float noundef nofpclass(nan inf) [[EXT2]], float noundef nofpclass(nan inf) [[EXT3]]) #[[ATTR1:[0-9]+]]
// FULL_FAST-NEXT: store <2 x float> [[CALL]], ptr [[COERCE]], align 4
// FULL_FAST-NEXT: [[COERCE_REALP:%.*]] = getelementptr inbounds nuw { float, float }, ptr [[COERCE]], i32 0, i32 0
// FULL_FAST-NEXT: [[COERCE_REAL:%.*]] = load float, ptr [[COERCE_REALP]], align 4
// FULL_FAST-NEXT: [[COERCE_IMAGP:%.*]] = getelementptr inbounds nuw { float, float }, ptr [[COERCE]], i32 0, i32 1
// FULL_FAST-NEXT: [[COERCE_IMAG:%.*]] = load float, ptr [[COERCE_IMAGP]], align 4
-// FULL_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[COERCE_REAL]] to half
-// FULL_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[COERCE_IMAG]] to half
+// FULL_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[COERCE_REAL]] to half
+// FULL_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[COERCE_IMAG]] to half
// FULL_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 0
// FULL_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 1
// FULL_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
@@ -342,14 +342,14 @@
// IMPRVD_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
// IMPRVD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[A]], i32 0, i32 1
// IMPRVD_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
-// IMPRVD_FAST-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
-// IMPRVD_FAST-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// IMPRVD_FAST-NEXT: [[EXT:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_REAL]] to float
+// IMPRVD_FAST-NEXT: [[EXT1:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_IMAG]] to float
// IMPRVD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 0
// IMPRVD_FAST-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
// IMPRVD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 1
// IMPRVD_FAST-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
-// IMPRVD_FAST-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float
-// IMPRVD_FAST-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// IMPRVD_FAST-NEXT: [[EXT2:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_REAL]] to float
+// IMPRVD_FAST-NEXT: [[EXT3:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_IMAG]] to float
// IMPRVD_FAST-NEXT: [[TMP0:%.*]] = call reassoc nnan ninf nsz arcp afn float @llvm.fabs.f32(float [[EXT2]])
// IMPRVD_FAST-NEXT: [[TMP1:%.*]] = call reassoc nnan ninf nsz arcp afn float @llvm.fabs.f32(float [[EXT3]])
// IMPRVD_FAST-NEXT: [[ABS_CMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn ugt float [[TMP0]], [[TMP1]]
@@ -379,8 +379,8 @@
// IMPRVD_FAST: complex_div:
// IMPRVD_FAST-NEXT: [[TMP20:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ]
// IMPRVD_FAST-NEXT: [[TMP21:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ]
-// IMPRVD_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP20]] to half
-// IMPRVD_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[TMP21]] to half
+// IMPRVD_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[TMP20]] to half
+// IMPRVD_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[TMP21]] to half
// IMPRVD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 0
// IMPRVD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 1
// IMPRVD_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
@@ -400,14 +400,14 @@
// PRMTD_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
// PRMTD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[A]], i32 0, i32 1
// PRMTD_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
-// PRMTD_FAST-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
-// PRMTD_FAST-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// PRMTD_FAST-NEXT: [[EXT:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_REAL]] to float
+// PRMTD_FAST-NEXT: [[EXT1:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_IMAG]] to float
// PRMTD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 0
// PRMTD_FAST-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
// PRMTD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 1
// PRMTD_FAST-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
-// PRMTD_FAST-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float
-// PRMTD_FAST-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// PRMTD_FAST-NEXT: [[EXT2:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_REAL]] to float
+// PRMTD_FAST-NEXT: [[EXT3:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_IMAG]] to float
// PRMTD_FAST-NEXT: [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT2]]
// PRMTD_FAST-NEXT: [[TMP1:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT3]]
// PRMTD_FAST-NEXT: [[TMP2:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP0]], [[TMP1]]
@@ -419,8 +419,8 @@
// PRMTD_FAST-NEXT: [[TMP8:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[TMP6]], [[TMP7]]
// PRMTD_FAST-NEXT: [[TMP9:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP2]], [[TMP5]]
// PRMTD_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP8]], [[TMP5]]
-// PRMTD_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP9]] to half
-// PRMTD_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[TMP10]] to half
+// PRMTD_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[TMP9]] to half
+// PRMTD_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[TMP10]] to half
// PRMTD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 0
// PRMTD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 1
// PRMTD_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
@@ -636,22 +636,22 @@ _Complex _Float16 divf16(_Complex _Float16 a, _Complex _Float16 b) {
// BASIC_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
// BASIC_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[A]], i32 0, i32 1
// BASIC_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
-// BASIC_FAST-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
-// BASIC_FAST-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// BASIC_FAST-NEXT: [[EXT:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_REAL]] to float
+// BASIC_FAST-NEXT: [[EXT1:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_IMAG]] to float
// BASIC_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 0
// BASIC_FAST-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
// BASIC_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 1
// BASIC_FAST-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
-// BASIC_FAST-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float
-// BASIC_FAST-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// BASIC_FAST-NEXT: [[EXT2:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_REAL]] to float
+// BASIC_FAST-NEXT: [[EXT3:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_IMAG]] to float
// BASIC_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT2]]
// BASIC_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT3]]
// BASIC_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT3]]
// BASIC_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT2]]
// BASIC_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[MUL_AC]], [[MUL_BD]]
// BASIC_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[MUL_AD]], [[MUL_BC]]
-// BASIC_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[MUL_R]] to half
-// BASIC_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[MUL_I]] to half
+// BASIC_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[MUL_R]] to half
+// BASIC_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[MUL_I]] to half
// BASIC_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 0
// BASIC_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 1
// BASIC_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
@@ -672,14 +672,14 @@ _Complex _Float16 divf16(_Complex _Float16 a, _Complex _Float16 b) {
// FULL_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
// FULL_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[A]], i32 0, i32 1
// FULL_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
-// FULL_FAST-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
-// FULL_FAST-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// FULL_FAST-NEXT: [[EXT:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_REAL]] to float
+// FULL_FAST-NEXT: [[EXT1:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_IMAG]] to float
// FULL_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 0
// FULL_FAST-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
// FULL_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 1
// FULL_FAST-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
-// FULL_FAST-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float
-// FULL_FAST-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// FULL_FAST-NEXT: [[EXT2:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_REAL]] to float
+// FULL_FAST-NEXT: [[EXT3:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_IMAG]] to float
// FULL_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT2]]
// FULL_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT3]]
// FULL_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT3]]
@@ -702,8 +702,8 @@ _Complex _Float16 divf16(_Complex _Float16 a, _Complex _Float16 b) {
// FULL_FAST: complex_mul_cont:
// FULL_FAST-NEXT: [[REAL_MUL_PHI:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[MUL_R]], [[ENTRY:%.*]] ], [ [[MUL_R]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[COERCE_REAL]], [[COMPLEX_MUL_LIBCALL]] ]
// FULL_FAST-NEXT: [[IMAG_MUL_PHI:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[MUL_I]], [[ENTRY]] ], [ [[MUL_I]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[COERCE_IMAG]], [[COMPLEX_MUL_LIBCALL]] ]
-// FULL_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[REAL_MUL_PHI]] to half
-// FULL_FAST-NEXT: [[UNPROMOTION5:%.*]] = fptrunc float [[IMAG_MUL_PHI]] to half
+// FULL_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[REAL_MUL_PHI]] to half
+// FULL_FAST-NEXT: [[UNPROMOTION5:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[IMAG_MUL_PHI]] to half
// FULL_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 0
// FULL_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 1
// FULL_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
@@ -723,22 +723,22 @@ _Complex _Float16 divf16(_Complex _Float16 a, _Complex _Float16 b) {
// IMPRVD_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
// IMPRVD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[A]], i32 0, i32 1
// IMPRVD_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
-// IMPRVD_FAST-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
-// IMPRVD_FAST-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// IMPRVD_FAST-NEXT: [[EXT:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_REAL]] to float
+// IMPRVD_FAST-NEXT: [[EXT1:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_IMAG]] to float
// IMPRVD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 0
// IMPRVD_FAST-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
// IMPRVD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 1
// IMPRVD_FAST-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
-// IMPRVD_FAST-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float
-// IMPRVD_FAST-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// IMPRVD_FAST-NEXT: [[EXT2:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_REAL]] to float
+// IMPRVD_FAST-NEXT: [[EXT3:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_IMAG]] to float
// IMPRVD_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT2]]
// IMPRVD_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT3]]
// IMPRVD_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT3]]
// IMPRVD_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT2]]
// IMPRVD_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[MUL_AC]], [[MUL_BD]]
// IMPRVD_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[MUL_AD]], [[MUL_BC]]
-// IMPRVD_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[MUL_R]] to half
-// IMPRVD_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[MUL_I]] to half
+// IMPRVD_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[MUL_R]] to half
+// IMPRVD_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[MUL_I]] to half
// IMPRVD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 0
// IMPRVD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 1
// IMPRVD_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
@@ -758,22 +758,22 @@ _Complex _Float16 divf16(_Complex _Float16 a, _Complex _Float16 b) {
// PRMTD_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
// PRMTD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[A]], i32 0, i32 1
// PRMTD_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
-// PRMTD_FAST-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
-// PRMTD_FAST-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// PRMTD_FAST-NEXT: [[EXT:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_REAL]] to float
+// PRMTD_FAST-NEXT: [[EXT1:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_IMAG]] to float
// PRMTD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 0
// PRMTD_FAST-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
/...
[truncated]
✅ With the latest revision this PR passed the C/C++ code formatter.

floating-point transformations.
:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`, :ref:`fptrunc <i_fptrunc>`,
:ref:`fpext <i_fpext>`), and :ref:`phi <i_phi>`, :ref:`select <i_select>`, or
:ref:`call <i_call>` instructions that return floating-point types may use the
The syntax of `fptrunc` and `fpext` forbids fast math flags. You may want to update the two sections.
Agree, `fptrunc` and `fpext` need updating to at least show where the fast-math flags may occur.
@@ -42,6 +42,14 @@ entry:
  %f = fneg float %x
  ; CHECK: %f_vec = fneg <3 x float> %vec
  %f_vec = fneg <3 x float> %vec
  ; CHECK: %g = fpext float %x to double
Needs bitcode compatibility test
  ; CHECK: %g = fptrunc nnan ninf float %x to half
  %g = fptrunc ninf nnan float %x to half
  ; CHECK: %g_vec = fptrunc nnan ninf <3 x float> %vec to <3 x half>
  %g_vec = fptrunc ninf nnan <3 x float> %vec to <3 x half>
  ; CHECK: ret float %e
  ret float %e
Test scalable vector case
Add fast math flags to fptrunc and fpext in langref
Add scalable vector test
Add test in Bitcode/compatibility.ll
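The requested scalable vector coverage would look something like the following (value names are illustrative, not the actual test contents):

```llvm
; CHECK: %g_scalable = fptrunc nnan ninf <vscale x 2 x float> %svec to <vscale x 2 x half>
%g_scalable = fptrunc ninf nnan <vscale x 2 x float> %svec to <vscale x 2 x half>
; CHECK: %e_scalable = fpext nnan ninf <vscale x 2 x half> %svec2 to <vscale x 2 x float>
%e_scalable = fpext ninf nnan <vscale x 2 x half> %svec2 to <vscale x 2 x float>
```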
    return replaceInstUsesWith(FPT, Sel);
  }
  if (match(Op, m_Select(m_Value(Cond), m_Value(Y), m_FPExt(m_Value(X)))) &&
      X->getType() == Ty) {
    // fptrunc (select Cond, Y, (fpext X)) --> select Cond, (fptrunc Y), X
    Value *NarrowY = Builder.CreateFPTrunc(Y, Ty);
    Value *Sel = Builder.CreateSelect(Cond, NarrowY, X, "narrow.sel", Op);
    if (auto *I = dyn_cast<Instruction>(Sel))
      I->setFastMathFlags(Op->getFastMathFlags());
We already propagate FMF in `IRBuilder::CreateSelect`. BTW it is incorrect to preserve `ninf`: #116073 (comment)
IRBuilder::CreateSelect uses the default fast math flags of the IRBuilder, which will be the flags of the fptrunc. So we need this to make sure the new select has the same flags as the old select.
Sorry I got MDFrom and FMFSource mixed up.
Is it possible to split off this change? The rest of the patch looks good to me.
Do you mean these specific changes for setting the fast math flag on select instructions, or also the change at the top of visitFPTrunc to set the fast math flags in the builder based on the fptrunc? Either way it causes failures in the llvm/test/Transforms/InstCombine/fpcast.ll and llvm/test/Transforms/InstCombine/fptrunc.ll tests. Without setting the fast math flags in the select the failures are because the select gets the fast math flags from the fptrunc. Without setting the flags in the builder the opposite happens and the fptrunc gets the fast math flags from the select.
I would split the instcombine change into a separate PR, and leave this one purely mechanical to add the flags
I don't think this is possible. Applying this patch first then later applying a patch to instcombine would cause test failures, as I mention in my comment above. Applying a patch to instcombine first doesn't work, because Instruction::getFastMathFlags fails an assert when the instruction isn't an FPMathOperator, which fptrunc wouldn't be at that point.
The failures are just lost flag regressions? Just take the regression and let it be fixed in the follow up
Done
Ping
LGTM
I think there's some followup work needed to get this working in SelectionDAG as well, but that can live in a separate patch (especially as we need to do a followup pass in DAGCombine to check for flags on the fpext/fptrunc nodes).
InstCombine follow-up: #118808