[IR] Allow fast math flags on fptrunc and fpext #115894


Merged: 5 commits into llvm:main on Dec 4, 2024

Conversation

@john-brawn-arm (Collaborator) commented Nov 12, 2024

This consists of:

  • Make these instructions part of FPMathOperator.
  • Adjust bitcode/ir readers/writers to expect fast math flags on these instructions.
  • Make IRBuilder set the fast math flags on these instructions.
  • Update langref and release notes.
  • Update a bunch of tests. Some of these are due to InstCombineCasts incorrectly adding fast math flags to fptrunc, which will be fixed in a later patch.
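For illustration, the new syntax accepts the same fast math flags as other FPMathOperator instructions (a minimal sketch in the style of the tests updated by this patch; the value names are placeholders):

%g = fptrunc nnan ninf float %x to half
%h = fpext reassoc nnan ninf nsz arcp afn half %y to float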

@llvmbot (Member) commented Nov 12, 2024

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-clang
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-backend-amdgpu

Author: John Brawn (john-brawn-arm)

Changes

This consists of:

  • Make these instructions part of FPMathOperator.
  • Adjust bitcode/ir readers/writers to expect fast math flags on these instructions.
  • Make IRBuilder set the fast math flags on these instructions.
  • Adjust InstCombinerImpl::visitFPTrunc, as currently the fast math flags on all of the output instructions are set from the source operation. Instead we want each output instruction to get the fast math flags from its corresponding input instruction.
  • Update langref and release notes.
  • Update a bunch of tests.

Patch is 95.24 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/115894.diff

15 Files Affected:

  • (modified) clang/test/CodeGen/X86/cx-complex-range.c (+88-88)
  • (modified) clang/test/CodeGen/cx-complex-range.c (+34-34)
  • (modified) clang/test/CodeGen/matrix-type-operators-fast-math.c (+6-6)
  • (modified) clang/test/CodeGen/nofpclass.c (+30-30)
  • (modified) clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu (+1-1)
  • (modified) clang/test/CodeGenHIP/printf_nonhostcall.cpp (+2-2)
  • (modified) clang/test/Headers/__clang_hip_math_ocml_rounded_ops.hip (+16-16)
  • (modified) llvm/docs/LangRef.rst (+8-4)
  • (modified) llvm/docs/ReleaseNotes.md (+2)
  • (modified) llvm/include/llvm/IR/IRBuilder.h (+15-10)
  • (modified) llvm/include/llvm/IR/Operator.h (+2)
  • (modified) llvm/lib/AsmParser/LLParser.cpp (+10-2)
  • (modified) llvm/lib/Bitcode/Reader/BitcodeReader.cpp (+5)
  • (modified) llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp (+5-3)
  • (modified) llvm/test/Assembler/fast-math-flags.ll (+32)
diff --git a/clang/test/CodeGen/X86/cx-complex-range.c b/clang/test/CodeGen/X86/cx-complex-range.c
index a0e6dc219b36f7..f87091427df713 100644
--- a/clang/test/CodeGen/X86/cx-complex-range.c
+++ b/clang/test/CodeGen/X86/cx-complex-range.c
@@ -266,14 +266,14 @@
 // BASIC_FAST-NEXT:    [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
 // BASIC_FAST-NEXT:    [[A_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[A]], i32 0, i32 1
 // BASIC_FAST-NEXT:    [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
-// BASIC_FAST-NEXT:    [[EXT:%.*]] = fpext half [[A_REAL]] to float
-// BASIC_FAST-NEXT:    [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// BASIC_FAST-NEXT:    [[EXT:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_REAL]] to float
+// BASIC_FAST-NEXT:    [[EXT1:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_IMAG]] to float
 // BASIC_FAST-NEXT:    [[B_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 0
 // BASIC_FAST-NEXT:    [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
 // BASIC_FAST-NEXT:    [[B_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 1
 // BASIC_FAST-NEXT:    [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
-// BASIC_FAST-NEXT:    [[EXT2:%.*]] = fpext half [[B_REAL]] to float
-// BASIC_FAST-NEXT:    [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// BASIC_FAST-NEXT:    [[EXT2:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_REAL]] to float
+// BASIC_FAST-NEXT:    [[EXT3:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_IMAG]] to float
 // BASIC_FAST-NEXT:    [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT2]]
 // BASIC_FAST-NEXT:    [[TMP1:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT3]]
 // BASIC_FAST-NEXT:    [[TMP2:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP0]], [[TMP1]]
@@ -285,8 +285,8 @@
 // BASIC_FAST-NEXT:    [[TMP8:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[TMP6]], [[TMP7]]
 // BASIC_FAST-NEXT:    [[TMP9:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP2]], [[TMP5]]
 // BASIC_FAST-NEXT:    [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP8]], [[TMP5]]
-// BASIC_FAST-NEXT:    [[UNPROMOTION:%.*]] = fptrunc float [[TMP9]] to half
-// BASIC_FAST-NEXT:    [[UNPROMOTION4:%.*]] = fptrunc float [[TMP10]] to half
+// BASIC_FAST-NEXT:    [[UNPROMOTION:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[TMP9]] to half
+// BASIC_FAST-NEXT:    [[UNPROMOTION4:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[TMP10]] to half
 // BASIC_FAST-NEXT:    [[RETVAL_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 0
 // BASIC_FAST-NEXT:    [[RETVAL_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 1
 // BASIC_FAST-NEXT:    store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
@@ -307,22 +307,22 @@
 // FULL_FAST-NEXT:    [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
 // FULL_FAST-NEXT:    [[A_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[A]], i32 0, i32 1
 // FULL_FAST-NEXT:    [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
-// FULL_FAST-NEXT:    [[EXT:%.*]] = fpext half [[A_REAL]] to float
-// FULL_FAST-NEXT:    [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// FULL_FAST-NEXT:    [[EXT:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_REAL]] to float
+// FULL_FAST-NEXT:    [[EXT1:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_IMAG]] to float
 // FULL_FAST-NEXT:    [[B_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 0
 // FULL_FAST-NEXT:    [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
 // FULL_FAST-NEXT:    [[B_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 1
 // FULL_FAST-NEXT:    [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
-// FULL_FAST-NEXT:    [[EXT2:%.*]] = fpext half [[B_REAL]] to float
-// FULL_FAST-NEXT:    [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// FULL_FAST-NEXT:    [[EXT2:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_REAL]] to float
+// FULL_FAST-NEXT:    [[EXT3:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_IMAG]] to float
 // FULL_FAST-NEXT:    [[CALL:%.*]] = call reassoc nnan ninf nsz arcp afn nofpclass(nan inf) <2 x float> @__divsc3(float noundef nofpclass(nan inf) [[EXT]], float noundef nofpclass(nan inf) [[EXT1]], float noundef nofpclass(nan inf) [[EXT2]], float noundef nofpclass(nan inf) [[EXT3]]) #[[ATTR1:[0-9]+]]
 // FULL_FAST-NEXT:    store <2 x float> [[CALL]], ptr [[COERCE]], align 4
 // FULL_FAST-NEXT:    [[COERCE_REALP:%.*]] = getelementptr inbounds nuw { float, float }, ptr [[COERCE]], i32 0, i32 0
 // FULL_FAST-NEXT:    [[COERCE_REAL:%.*]] = load float, ptr [[COERCE_REALP]], align 4
 // FULL_FAST-NEXT:    [[COERCE_IMAGP:%.*]] = getelementptr inbounds nuw { float, float }, ptr [[COERCE]], i32 0, i32 1
 // FULL_FAST-NEXT:    [[COERCE_IMAG:%.*]] = load float, ptr [[COERCE_IMAGP]], align 4
-// FULL_FAST-NEXT:    [[UNPROMOTION:%.*]] = fptrunc float [[COERCE_REAL]] to half
-// FULL_FAST-NEXT:    [[UNPROMOTION4:%.*]] = fptrunc float [[COERCE_IMAG]] to half
+// FULL_FAST-NEXT:    [[UNPROMOTION:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[COERCE_REAL]] to half
+// FULL_FAST-NEXT:    [[UNPROMOTION4:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[COERCE_IMAG]] to half
 // FULL_FAST-NEXT:    [[RETVAL_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 0
 // FULL_FAST-NEXT:    [[RETVAL_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 1
 // FULL_FAST-NEXT:    store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
@@ -342,14 +342,14 @@
 // IMPRVD_FAST-NEXT:    [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
 // IMPRVD_FAST-NEXT:    [[A_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[A]], i32 0, i32 1
 // IMPRVD_FAST-NEXT:    [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
-// IMPRVD_FAST-NEXT:    [[EXT:%.*]] = fpext half [[A_REAL]] to float
-// IMPRVD_FAST-NEXT:    [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// IMPRVD_FAST-NEXT:    [[EXT:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_REAL]] to float
+// IMPRVD_FAST-NEXT:    [[EXT1:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_IMAG]] to float
 // IMPRVD_FAST-NEXT:    [[B_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 0
 // IMPRVD_FAST-NEXT:    [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
 // IMPRVD_FAST-NEXT:    [[B_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 1
 // IMPRVD_FAST-NEXT:    [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
-// IMPRVD_FAST-NEXT:    [[EXT2:%.*]] = fpext half [[B_REAL]] to float
-// IMPRVD_FAST-NEXT:    [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// IMPRVD_FAST-NEXT:    [[EXT2:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_REAL]] to float
+// IMPRVD_FAST-NEXT:    [[EXT3:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_IMAG]] to float
 // IMPRVD_FAST-NEXT:    [[TMP0:%.*]] = call reassoc nnan ninf nsz arcp afn float @llvm.fabs.f32(float [[EXT2]])
 // IMPRVD_FAST-NEXT:    [[TMP1:%.*]] = call reassoc nnan ninf nsz arcp afn float @llvm.fabs.f32(float [[EXT3]])
 // IMPRVD_FAST-NEXT:    [[ABS_CMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn ugt float [[TMP0]], [[TMP1]]
@@ -379,8 +379,8 @@
 // IMPRVD_FAST:       complex_div:
 // IMPRVD_FAST-NEXT:    [[TMP20:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ]
 // IMPRVD_FAST-NEXT:    [[TMP21:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ]
-// IMPRVD_FAST-NEXT:    [[UNPROMOTION:%.*]] = fptrunc float [[TMP20]] to half
-// IMPRVD_FAST-NEXT:    [[UNPROMOTION4:%.*]] = fptrunc float [[TMP21]] to half
+// IMPRVD_FAST-NEXT:    [[UNPROMOTION:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[TMP20]] to half
+// IMPRVD_FAST-NEXT:    [[UNPROMOTION4:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[TMP21]] to half
 // IMPRVD_FAST-NEXT:    [[RETVAL_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 0
 // IMPRVD_FAST-NEXT:    [[RETVAL_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 1
 // IMPRVD_FAST-NEXT:    store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
@@ -400,14 +400,14 @@
 // PRMTD_FAST-NEXT:    [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
 // PRMTD_FAST-NEXT:    [[A_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[A]], i32 0, i32 1
 // PRMTD_FAST-NEXT:    [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
-// PRMTD_FAST-NEXT:    [[EXT:%.*]] = fpext half [[A_REAL]] to float
-// PRMTD_FAST-NEXT:    [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// PRMTD_FAST-NEXT:    [[EXT:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_REAL]] to float
+// PRMTD_FAST-NEXT:    [[EXT1:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_IMAG]] to float
 // PRMTD_FAST-NEXT:    [[B_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 0
 // PRMTD_FAST-NEXT:    [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
 // PRMTD_FAST-NEXT:    [[B_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 1
 // PRMTD_FAST-NEXT:    [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
-// PRMTD_FAST-NEXT:    [[EXT2:%.*]] = fpext half [[B_REAL]] to float
-// PRMTD_FAST-NEXT:    [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// PRMTD_FAST-NEXT:    [[EXT2:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_REAL]] to float
+// PRMTD_FAST-NEXT:    [[EXT3:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_IMAG]] to float
 // PRMTD_FAST-NEXT:    [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT2]]
 // PRMTD_FAST-NEXT:    [[TMP1:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT3]]
 // PRMTD_FAST-NEXT:    [[TMP2:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP0]], [[TMP1]]
@@ -419,8 +419,8 @@
 // PRMTD_FAST-NEXT:    [[TMP8:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[TMP6]], [[TMP7]]
 // PRMTD_FAST-NEXT:    [[TMP9:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP2]], [[TMP5]]
 // PRMTD_FAST-NEXT:    [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP8]], [[TMP5]]
-// PRMTD_FAST-NEXT:    [[UNPROMOTION:%.*]] = fptrunc float [[TMP9]] to half
-// PRMTD_FAST-NEXT:    [[UNPROMOTION4:%.*]] = fptrunc float [[TMP10]] to half
+// PRMTD_FAST-NEXT:    [[UNPROMOTION:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[TMP9]] to half
+// PRMTD_FAST-NEXT:    [[UNPROMOTION4:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[TMP10]] to half
 // PRMTD_FAST-NEXT:    [[RETVAL_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 0
 // PRMTD_FAST-NEXT:    [[RETVAL_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 1
 // PRMTD_FAST-NEXT:    store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
@@ -636,22 +636,22 @@ _Complex _Float16 divf16(_Complex _Float16 a, _Complex _Float16 b) {
 // BASIC_FAST-NEXT:    [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
 // BASIC_FAST-NEXT:    [[A_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[A]], i32 0, i32 1
 // BASIC_FAST-NEXT:    [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
-// BASIC_FAST-NEXT:    [[EXT:%.*]] = fpext half [[A_REAL]] to float
-// BASIC_FAST-NEXT:    [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// BASIC_FAST-NEXT:    [[EXT:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_REAL]] to float
+// BASIC_FAST-NEXT:    [[EXT1:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_IMAG]] to float
 // BASIC_FAST-NEXT:    [[B_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 0
 // BASIC_FAST-NEXT:    [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
 // BASIC_FAST-NEXT:    [[B_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 1
 // BASIC_FAST-NEXT:    [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
-// BASIC_FAST-NEXT:    [[EXT2:%.*]] = fpext half [[B_REAL]] to float
-// BASIC_FAST-NEXT:    [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// BASIC_FAST-NEXT:    [[EXT2:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_REAL]] to float
+// BASIC_FAST-NEXT:    [[EXT3:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_IMAG]] to float
 // BASIC_FAST-NEXT:    [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT2]]
 // BASIC_FAST-NEXT:    [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT3]]
 // BASIC_FAST-NEXT:    [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT3]]
 // BASIC_FAST-NEXT:    [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT2]]
 // BASIC_FAST-NEXT:    [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[MUL_AC]], [[MUL_BD]]
 // BASIC_FAST-NEXT:    [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[MUL_AD]], [[MUL_BC]]
-// BASIC_FAST-NEXT:    [[UNPROMOTION:%.*]] = fptrunc float [[MUL_R]] to half
-// BASIC_FAST-NEXT:    [[UNPROMOTION4:%.*]] = fptrunc float [[MUL_I]] to half
+// BASIC_FAST-NEXT:    [[UNPROMOTION:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[MUL_R]] to half
+// BASIC_FAST-NEXT:    [[UNPROMOTION4:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[MUL_I]] to half
 // BASIC_FAST-NEXT:    [[RETVAL_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 0
 // BASIC_FAST-NEXT:    [[RETVAL_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 1
 // BASIC_FAST-NEXT:    store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
@@ -672,14 +672,14 @@ _Complex _Float16 divf16(_Complex _Float16 a, _Complex _Float16 b) {
 // FULL_FAST-NEXT:    [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
 // FULL_FAST-NEXT:    [[A_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[A]], i32 0, i32 1
 // FULL_FAST-NEXT:    [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
-// FULL_FAST-NEXT:    [[EXT:%.*]] = fpext half [[A_REAL]] to float
-// FULL_FAST-NEXT:    [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// FULL_FAST-NEXT:    [[EXT:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_REAL]] to float
+// FULL_FAST-NEXT:    [[EXT1:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_IMAG]] to float
 // FULL_FAST-NEXT:    [[B_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 0
 // FULL_FAST-NEXT:    [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
 // FULL_FAST-NEXT:    [[B_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 1
 // FULL_FAST-NEXT:    [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
-// FULL_FAST-NEXT:    [[EXT2:%.*]] = fpext half [[B_REAL]] to float
-// FULL_FAST-NEXT:    [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// FULL_FAST-NEXT:    [[EXT2:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_REAL]] to float
+// FULL_FAST-NEXT:    [[EXT3:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_IMAG]] to float
 // FULL_FAST-NEXT:    [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT2]]
 // FULL_FAST-NEXT:    [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT3]]
 // FULL_FAST-NEXT:    [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT3]]
@@ -702,8 +702,8 @@ _Complex _Float16 divf16(_Complex _Float16 a, _Complex _Float16 b) {
 // FULL_FAST:       complex_mul_cont:
 // FULL_FAST-NEXT:    [[REAL_MUL_PHI:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[MUL_R]], [[ENTRY:%.*]] ], [ [[MUL_R]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[COERCE_REAL]], [[COMPLEX_MUL_LIBCALL]] ]
 // FULL_FAST-NEXT:    [[IMAG_MUL_PHI:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[MUL_I]], [[ENTRY]] ], [ [[MUL_I]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[COERCE_IMAG]], [[COMPLEX_MUL_LIBCALL]] ]
-// FULL_FAST-NEXT:    [[UNPROMOTION:%.*]] = fptrunc float [[REAL_MUL_PHI]] to half
-// FULL_FAST-NEXT:    [[UNPROMOTION5:%.*]] = fptrunc float [[IMAG_MUL_PHI]] to half
+// FULL_FAST-NEXT:    [[UNPROMOTION:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[REAL_MUL_PHI]] to half
+// FULL_FAST-NEXT:    [[UNPROMOTION5:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[IMAG_MUL_PHI]] to half
 // FULL_FAST-NEXT:    [[RETVAL_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 0
 // FULL_FAST-NEXT:    [[RETVAL_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 1
 // FULL_FAST-NEXT:    store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
@@ -723,22 +723,22 @@ _Complex _Float16 divf16(_Complex _Float16 a, _Complex _Float16 b) {
 // IMPRVD_FAST-NEXT:    [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
 // IMPRVD_FAST-NEXT:    [[A_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[A]], i32 0, i32 1
 // IMPRVD_FAST-NEXT:    [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
-// IMPRVD_FAST-NEXT:    [[EXT:%.*]] = fpext half [[A_REAL]] to float
-// IMPRVD_FAST-NEXT:    [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// IMPRVD_FAST-NEXT:    [[EXT:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_REAL]] to float
+// IMPRVD_FAST-NEXT:    [[EXT1:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_IMAG]] to float
 // IMPRVD_FAST-NEXT:    [[B_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 0
 // IMPRVD_FAST-NEXT:    [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
 // IMPRVD_FAST-NEXT:    [[B_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 1
 // IMPRVD_FAST-NEXT:    [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
-// IMPRVD_FAST-NEXT:    [[EXT2:%.*]] = fpext half [[B_REAL]] to float
-// IMPRVD_FAST-NEXT:    [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// IMPRVD_FAST-NEXT:    [[EXT2:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_REAL]] to float
+// IMPRVD_FAST-NEXT:    [[EXT3:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[B_IMAG]] to float
 // IMPRVD_FAST-NEXT:    [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT2]]
 // IMPRVD_FAST-NEXT:    [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT3]]
 // IMPRVD_FAST-NEXT:    [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT3]]
 // IMPRVD_FAST-NEXT:    [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT2]]
 // IMPRVD_FAST-NEXT:    [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[MUL_AC]], [[MUL_BD]]
 // IMPRVD_FAST-NEXT:    [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[MUL_AD]], [[MUL_BC]]
-// IMPRVD_FAST-NEXT:    [[UNPROMOTION:%.*]] = fptrunc float [[MUL_R]] to half
-// IMPRVD_FAST-NEXT:    [[UNPROMOTION4:%.*]] = fptrunc float [[MUL_I]] to half
+// IMPRVD_FAST-NEXT:    [[UNPROMOTION:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[MUL_R]] to half
+// IMPRVD_FAST-NEXT:    [[UNPROMOTION4:%.*]] = fptrunc reassoc nnan ninf nsz arcp afn float [[MUL_I]] to half
 // IMPRVD_FAST-NEXT:    [[RETVAL_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 0
 // IMPRVD_FAST-NEXT:    [[RETVAL_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[RETVAL]], i32 0, i32 1
 // IMPRVD_FAST-NEXT:    store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
@@ -758,22 +758,22 @@ _Complex _Float16 divf16(_Complex _Float16 a, _Complex _Float16 b) {
 // PRMTD_FAST-NEXT:    [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
 // PRMTD_FAST-NEXT:    [[A_IMAGP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[A]], i32 0, i32 1
 // PRMTD_FAST-NEXT:    [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
-// PRMTD_FAST-NEXT:    [[EXT:%.*]] = fpext half [[A_REAL]] to float
-// PRMTD_FAST-NEXT:    [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// PRMTD_FAST-NEXT:    [[EXT:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_REAL]] to float
+// PRMTD_FAST-NEXT:    [[EXT1:%.*]] = fpext reassoc nnan ninf nsz arcp afn half [[A_IMAG]] to float
 // PRMTD_FAST-NEXT:    [[B_REALP:%.*]] = getelementptr inbounds nuw { half, half }, ptr [[B]], i32 0, i32 0
 // PRMTD_FAST-NEXT:    [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
 /...
[truncated]


github-actions bot commented Nov 12, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@nikic nikic added the floating-point Floating-point math label Nov 12, 2024
@nikic nikic requested a review from andykaylor November 12, 2024 16:27
Quoted context from llvm/docs/LangRef.rst:

floating-point transformations.
:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`, :ref:`fptrunc <i_fptrunc>`,
:ref:`fpext <i_fpext>`), and :ref:`phi <i_phi>`, :ref:`select <i_select>`, or
:ref:`call <i_call>` instructions that return floating-point types may use the

A reviewer commented:

The syntax of fptrunc and fpext forbids fast math flags. You may want to update the two sections.

A Contributor replied:

Agree, fptrunc and fpext need updating to at least show where the fast-math flags may occur.
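Following the pattern LangRef already uses for instructions such as fadd, the two syntax sections would then take a form like this (a sketch of the expected shape, not the patch's exact wording):

<result> = fptrunc [fast-math flags]* <ty> <value> to <ty2>   ; yields ty2
<result> = fpext [fast-math flags]* <ty> <value> to <ty2>     ; yields ty2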

Quoted context from llvm/test/Assembler/fast-math-flags.ll:

@@ -42,6 +42,14 @@ entry:
%f = fneg float %x
; CHECK: %f_vec = fneg <3 x float> %vec
%f_vec = fneg <3 x float> %vec
; CHECK: %g = fpext float %x to double

A Contributor commented:

Needs bitcode compatibility test
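A bitcode compatibility test round-trips textual IR through bitcode and checks that the flags survive, along these lines (a sketch; the value names are illustrative assumptions, not necessarily what compatibility.ll uses):

; CHECK: %trunc.nnan = fptrunc nnan float %op1 to half
%trunc.nnan = fptrunc nnan float %op1 to half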

Further quoted context:

; CHECK: %g = fptrunc nnan ninf float %x to half
%g = fptrunc ninf nnan float %x to half
; CHECK: %g_vec = fptrunc nnan ninf <3 x float> %vec to <3 x half>
%g_vec = fptrunc ninf nnan <3 x float> %vec to <3 x half>
; CHECK: ret float %e
ret float %e

A Contributor commented:

Test scalable vector case
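A scalable-vector check exercising that case might look like this (a hand-written sketch; the names are placeholders, not necessarily what the follow-up commit added):

; CHECK: %g_sca = fptrunc nnan ninf <vscale x 2 x float> %sca to <vscale x 2 x half>
%g_sca = fptrunc ninf nnan <vscale x 2 x float> %sca to <vscale x 2 x half>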

Follow-up commits addressing the review:
  • Add fast math flags to fptrunc and fpext in langref
  • Add scalable vector test
  • Add test in Bitcode/compatibility.ll
Quoted context from llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp:

      return replaceInstUsesWith(FPT, Sel);
    }
    if (match(Op, m_Select(m_Value(Cond), m_Value(Y), m_FPExt(m_Value(X)))) &&
        X->getType() == Ty) {
      // fptrunc (select Cond, Y, (fpext X)) --> select Cond, (fptrunc Y), X
      Value *NarrowY = Builder.CreateFPTrunc(Y, Ty);
      Value *Sel = Builder.CreateSelect(Cond, NarrowY, X, "narrow.sel", Op);
      if (auto *I = dyn_cast<Instruction>(Sel))
        I->setFastMathFlags(Op->getFastMathFlags());

A Member commented:

We already propagate FMF in IRBuilder::CreateSelect. BTW it is incorrect to preserve ninf: #116073 (comment)

@john-brawn-arm (Collaborator, Author) replied:

IRBuilder::CreateSelect uses the default fast math flags of the IRBuilder, which will be the flags of the fptrunc. So we need this to make sure the new select has the same flags as the old select.
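Schematically, in IR terms (a hand-written sketch, not taken from the patch's tests):

; before the fold: the select and the fptrunc carry different flags
%e = fpext half %x to float
%s = select nnan i1 %c, float %y, float %e
%t = fptrunc ninf float %s to half

; after the fold: the builder's default flags were seeded from the fptrunc, so
; the narrowed value gets ninf, while the explicit setFastMathFlags call copies
; the old select's nnan onto the new select
%y.t = fptrunc ninf float %y to half
%t2 = select nnan i1 %c, half %y.t, half %x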

The Member replied:

Sorry I got MDFrom and FMFSource mixed up.

A Contributor commented:

Is it possible to split off this change? The rest of the patch looks good to me.

@john-brawn-arm replied:

Do you mean these specific changes for setting the fast math flag on select instructions, or also the change at the top of visitFPTrunc to set the fast math flags in the builder based on the fptrunc? Either way it causes failures in the llvm/test/Transforms/InstCombine/fpcast.ll and llvm/test/Transforms/InstCombine/fptrunc.ll tests. Without setting the fast math flags in the select the failures are because the select gets the fast math flags from the fptrunc. Without setting the flags in the builder the opposite happens and the fptrunc gets the fast math flags from the select.

The Contributor replied:

I would split the instcombine change into a separate PR, and leave this one purely mechanical to add the flags

@john-brawn-arm replied:

I don't think this is possible. Applying this patch first then later applying a patch to instcombine would cause test failures, as I mention in my comment above. Applying a patch to instcombine first doesn't work, because Instruction::getFastMathFlags fails an assert when the instruction isn't an FPMathOperator, which fptrunc wouldn't be at that point.

The Contributor replied:

The failures are just lost-flag regressions? Just take the regression and let it be fixed in the follow-up.

@john-brawn-arm replied:

Done

@john-brawn-arm (Collaborator, Author) commented:
Ping

@nikic (Contributor) left a comment:

LGTM

@jcranmer-intel (Contributor) left a comment:

I think there's some followup work needed to get this working in SelectionDAG as well, but that can live in a separate patch (especially as we need to do a followup pass in DAGCombine to check for flags on the fpext/fptrunc nodes).

@john-brawn-arm john-brawn-arm merged commit ecbe4d1 into llvm:main Dec 4, 2024
9 checks passed
@john-brawn-arm (Collaborator, Author) commented:

InstCombine follow-up: #118808
