[LV][AArch64] Add test for fp128 fmuladd reduction.(NFC) #137576
@llvm/pr-subscribers-llvm-transforms

Author: Elvis Wang (ElvisWang123)

Changes

This patch adds a test for the fmuladd reduction to show the test change/failure caused by the cost model change. Note that without the fp128 load and trunc, there is no failure.

Pre-commit test for #113903.

Full diff: https://github.com/llvm/llvm-project/pull/137576.diff

1 Files Affected:
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/f128-fmuladd-reduction.ll b/llvm/test/Transforms/LoopVectorize/AArch64/f128-fmuladd-reduction.ll
new file mode 100644
index 0000000000000..7ae08dd330d24
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/f128-fmuladd-reduction.ll
@@ -0,0 +1,113 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -mtriple=aarch64 -mcpu=neoverse-v2 -p loop-vectorize %s -S | FileCheck %s
+define double @fp128_fmuladd_reduction(ptr %start0, ptr %start1, ptr %end0, ptr %end1, double %x, i64 %n) {
+; CHECK-LABEL: define double @fp128_fmuladd_reduction(
+; CHECK-SAME: ptr [[START0:%.*]], ptr [[START1:%.*]], ptr [[END0:%.*]], ptr [[END1:%.*]], double [[X:%.*]], i64 [[N:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT: [[ENTRY:.*]]:
+; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[N]], 4
+; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]]
+; CHECK: [[VECTOR_PH]]:
+; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[N]], 4
+; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[N]], [[N_MOD_VF]]
+; CHECK-NEXT: [[TMP0:%.*]] = mul i64 [[N_VEC]], 16
+; CHECK-NEXT: [[TMP1:%.*]] = getelementptr i8, ptr [[START0]], i64 [[TMP0]]
+; CHECK-NEXT: [[TMP2:%.*]] = mul i64 [[N_VEC]], 8
+; CHECK-NEXT: [[TMP3:%.*]] = getelementptr i8, ptr [[START1]], i64 [[TMP2]]
+; CHECK-NEXT: br label %[[VECTOR_BODY:.*]]
+; CHECK: [[VECTOR_BODY]]:
+; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT: [[VEC_PHI:%.*]] = phi double [ [[X]], %[[VECTOR_PH]] ], [ [[TMP29:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT: [[OFFSET_IDX:%.*]] = mul i64 [[INDEX]], 16
+; CHECK-NEXT: [[TMP4:%.*]] = add i64 [[OFFSET_IDX]], 16
+; CHECK-NEXT: [[TMP5:%.*]] = add i64 [[OFFSET_IDX]], 32
+; CHECK-NEXT: [[TMP6:%.*]] = add i64 [[OFFSET_IDX]], 48
+; CHECK-NEXT: [[NEXT_GEP:%.*]] = getelementptr i8, ptr [[START0]], i64 [[OFFSET_IDX]]
+; CHECK-NEXT: [[NEXT_GEP1:%.*]] = getelementptr i8, ptr [[START0]], i64 [[TMP4]]
+; CHECK-NEXT: [[NEXT_GEP2:%.*]] = getelementptr i8, ptr [[START0]], i64 [[TMP5]]
+; CHECK-NEXT: [[NEXT_GEP3:%.*]] = getelementptr i8, ptr [[START0]], i64 [[TMP6]]
+; CHECK-NEXT: [[OFFSET_IDX4:%.*]] = mul i64 [[INDEX]], 8
+; CHECK-NEXT: [[TMP7:%.*]] = add i64 [[OFFSET_IDX4]], 8
+; CHECK-NEXT: [[TMP8:%.*]] = add i64 [[OFFSET_IDX4]], 16
+; CHECK-NEXT: [[TMP9:%.*]] = add i64 [[OFFSET_IDX4]], 24
+; CHECK-NEXT: [[NEXT_GEP5:%.*]] = getelementptr i8, ptr [[START1]], i64 [[OFFSET_IDX4]]
+; CHECK-NEXT: [[NEXT_GEP6:%.*]] = getelementptr i8, ptr [[START1]], i64 [[TMP7]]
+; CHECK-NEXT: [[NEXT_GEP7:%.*]] = getelementptr i8, ptr [[START1]], i64 [[TMP8]]
+; CHECK-NEXT: [[NEXT_GEP8:%.*]] = getelementptr i8, ptr [[START1]], i64 [[TMP9]]
+; CHECK-NEXT: [[TMP10:%.*]] = load fp128, ptr [[NEXT_GEP]], align 16
+; CHECK-NEXT: [[TMP11:%.*]] = load fp128, ptr [[NEXT_GEP1]], align 16
+; CHECK-NEXT: [[TMP12:%.*]] = load fp128, ptr [[NEXT_GEP2]], align 16
+; CHECK-NEXT: [[TMP13:%.*]] = load fp128, ptr [[NEXT_GEP3]], align 16
+; CHECK-NEXT: [[TMP14:%.*]] = load double, ptr [[NEXT_GEP5]], align 16
+; CHECK-NEXT: [[TMP15:%.*]] = load double, ptr [[NEXT_GEP6]], align 16
+; CHECK-NEXT: [[TMP16:%.*]] = load double, ptr [[NEXT_GEP7]], align 16
+; CHECK-NEXT: [[TMP17:%.*]] = load double, ptr [[NEXT_GEP8]], align 16
+; CHECK-NEXT: [[TMP18:%.*]] = fptrunc fp128 [[TMP10]] to double
+; CHECK-NEXT: [[TMP19:%.*]] = fptrunc fp128 [[TMP11]] to double
+; CHECK-NEXT: [[TMP20:%.*]] = fptrunc fp128 [[TMP12]] to double
+; CHECK-NEXT: [[TMP21:%.*]] = fptrunc fp128 [[TMP13]] to double
+; CHECK-NEXT: [[TMP22:%.*]] = fmul double [[TMP18]], [[TMP14]]
+; CHECK-NEXT: [[TMP23:%.*]] = fmul double [[TMP19]], [[TMP15]]
+; CHECK-NEXT: [[TMP24:%.*]] = fmul double [[TMP20]], [[TMP16]]
+; CHECK-NEXT: [[TMP25:%.*]] = fmul double [[TMP21]], [[TMP17]]
+; CHECK-NEXT: [[TMP26:%.*]] = fadd double [[VEC_PHI]], [[TMP22]]
+; CHECK-NEXT: [[TMP27:%.*]] = fadd double [[TMP26]], [[TMP23]]
+; CHECK-NEXT: [[TMP28:%.*]] = fadd double [[TMP27]], [[TMP24]]
+; CHECK-NEXT: [[TMP29]] = fadd double [[TMP28]], [[TMP25]]
+; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
+; CHECK-NEXT: [[TMP30:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
+; CHECK-NEXT: br i1 [[TMP30]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
+; CHECK: [[MIDDLE_BLOCK]]:
+; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N]], [[N_VEC]]
+; CHECK-NEXT: br i1 [[CMP_N]], label %[[EXIT:.*]], label %[[SCALAR_PH]]
+; CHECK: [[SCALAR_PH]]:
+; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi ptr [ [[TMP1]], %[[MIDDLE_BLOCK]] ], [ [[START0]], %[[ENTRY]] ]
+; CHECK-NEXT: [[BC_RESUME_VAL9:%.*]] = phi ptr [ [[TMP3]], %[[MIDDLE_BLOCK]] ], [ [[START1]], %[[ENTRY]] ]
+; CHECK-NEXT: [[BC_RESUME_VAL10:%.*]] = phi i64 [ [[N_VEC]], %[[MIDDLE_BLOCK]] ], [ 0, %[[ENTRY]] ]
+; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi double [ [[TMP29]], %[[MIDDLE_BLOCK]] ], [ [[X]], %[[ENTRY]] ]
+; CHECK-NEXT: br label %[[LOOP:.*]]
+; CHECK: [[LOOP]]:
+; CHECK-NEXT: [[PTR0:%.*]] = phi ptr [ [[PTR0_NEXT:%.*]], %[[LOOP]] ], [ [[BC_RESUME_VAL]], %[[SCALAR_PH]] ]
+; CHECK-NEXT: [[PTR1:%.*]] = phi ptr [ [[PTR1_NEXT:%.*]], %[[LOOP]] ], [ [[BC_RESUME_VAL9]], %[[SCALAR_PH]] ]
+; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[IV_NEXT:%.*]], %[[LOOP]] ], [ [[BC_RESUME_VAL10]], %[[SCALAR_PH]] ]
+; CHECK-NEXT: [[RED:%.*]] = phi double [ [[RED_NEXT:%.*]], %[[LOOP]] ], [ [[BC_MERGE_RDX]], %[[SCALAR_PH]] ]
+; CHECK-NEXT: [[PTR0_NEXT]] = getelementptr i8, ptr [[PTR0]], i64 16
+; CHECK-NEXT: [[PTR1_NEXT]] = getelementptr i8, ptr [[PTR1]], i64 8
+; CHECK-NEXT: [[LOAD0:%.*]] = load fp128, ptr [[PTR0]], align 16
+; CHECK-NEXT: [[LOAD1:%.*]] = load double, ptr [[PTR1]], align 16
+; CHECK-NEXT: [[TRUNC:%.*]] = fptrunc fp128 [[LOAD0]] to double
+; CHECK-NEXT: [[RED_NEXT]] = tail call double @llvm.fmuladd.f64(double [[TRUNC]], double [[LOAD1]], double [[RED]])
+; CHECK-NEXT: [[IV_NEXT]] = add i64 [[IV]], 1
+; CHECK-NEXT: [[CMP1_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[N]]
+; CHECK-NEXT: br i1 [[CMP1_NOT]], label %[[EXIT]], label %[[LOOP]], !llvm.loop [[LOOP3:![0-9]+]]
+; CHECK: [[EXIT]]:
+; CHECK-NEXT: [[LCSSA:%.*]] = phi double [ [[RED_NEXT]], %[[LOOP]] ], [ [[TMP29]], %[[MIDDLE_BLOCK]] ]
+; CHECK-NEXT: ret double [[LCSSA]]
+;
+entry:
+ br label %loop
+
+loop:
+ %ptr0 = phi ptr [ %ptr0.next, %loop ], [ %start0, %entry ]
+ %ptr1 = phi ptr [ %ptr1.next, %loop ], [ %start1, %entry ]
+ %iv = phi i64 [ %iv.next, %loop ], [ 0, %entry ]
+ %red = phi double [ %red.next, %loop ], [ %x, %entry ]
+ %ptr0.next = getelementptr i8, ptr %ptr0, i64 16
+ %ptr1.next = getelementptr i8, ptr %ptr1, i64 8
+ %load0 = load fp128, ptr %ptr0, align 16
+ %load1 = load double, ptr %ptr1, align 16
+ %trunc = fptrunc fp128 %load0 to double
+ %red.next = tail call double @llvm.fmuladd.f64(double %trunc, double %load1, double %red)
+ %iv.next = add i64 %iv, 1
+ %cmp1.not = icmp eq i64 %iv.next, %n
+ br i1 %cmp1.not, label %exit, label %loop
+
+exit:
+ %lcssa = phi double [ %red.next, %loop ]
+ ret double %lcssa
+}
+;.
+; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
+; CHECK: [[META1]] = !{!"llvm.loop.isvectorized", i32 1}
+; CHECK: [[META2]] = !{!"llvm.loop.unroll.runtime.disable"}
+; CHECK: [[LOOP3]] = distinct !{[[LOOP3]], [[META1]]}
+;.
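For readers less familiar with the IR, here is a rough C sketch of the loop under test and of the shape the CHECK lines expect. It is only an illustration, not part of the patch: the function names are hypothetical, indexed arrays stand in for the pointer increments, and `long double` is assumed to stand in for `fp128` (true on AArch64 Linux, not on every target). The first function mirrors the scalar loop; the second mirrors the checked output, which handles four iterations per trip with scalar loads and splits each fmuladd into an fmul followed by an in-order fadd chain.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar form of the loop in the test input. */
double fmuladd_reduction_scalar(const long double *a, const double *b,
                                double x, size_t n) {
    assert(n >= 1); /* the IR loop runs its body before the exit check */
    double red = x;
    size_t i = 0;
    do {
        double t = (double)a[i]; /* fptrunc fp128 -> double */
        red = t * b[i] + red;    /* llvm.fmuladd: fusing is allowed, not required */
        ++i;
    } while (i != n);
    return red;
}

/* Shape of the CHECK lines: four iterations per trip, in-order adds. */
double fmuladd_reduction_interleaved(const long double *a, const double *b,
                                     double x, size_t n) {
    assert(n >= 1);
    double red = x;
    size_t i = 0;
    if (n >= 4) {                 /* min.iters.check */
        size_t n_vec = n - n % 4; /* n.vec = n - (n urem 4) */
        for (; i != n_vec; i += 4) {
            double m0 = (double)a[i]     * b[i];
            double m1 = (double)a[i + 1] * b[i + 1];
            double m2 = (double)a[i + 2] * b[i + 2];
            double m3 = (double)a[i + 3] * b[i + 3];
            red = red + m0; /* the adds stay in source order */
            red = red + m1;
            red = red + m2;
            red = red + m3;
        }
        if (i == n)               /* middle.block: no remainder left */
            return red;
    }
    do {                          /* scalar remainder loop */
        red = (double)a[i] * b[i] + red;
        ++i;
    } while (i != n);
    return red;
}
```

Both functions accumulate the products in the same order; the difference is only in how many iterations each trip of the main loop covers.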
LGTM, thanks
@@ -0,0 +1,113 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
; RUN: opt -mtriple=aarch64 -mcpu=neoverse-v2 -p loop-vectorize %s -S | FileCheck %s
define double @fp128_fmuladd_reduction(ptr %start0, ptr %start1, ptr %end0, ptr %end1, double %x, i64 %n) {
define double @fp128_fmuladd_reduction(ptr %start0, ptr %start1, ptr %end0, ptr %end1, double %x, i64 %n) {
define double @fp128_fmuladd_reduction(ptr %start0, ptr %start1, ptr %end0, ptr %end1, double %x, i64 %n) {
c8b0987 to 412c1d8
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/16831