Description
Bugzilla Link | 33153 |
Resolution | FIXED |
Resolved on | Nov 15, 2018 20:25 |
Version | unspecified |
OS | Linux |
Blocks | #38454 |
Attachments | nested_par.ll |
Reporter | LLVM Bugzilla Contributor |
CC | @Meinersbur,@tobiasgrosser,@tstellar |
Fixed by commit(s) | r343212 r347024 |
Extended Description
For certain tile sizes, -polly-opt-isl generates nested OpenMP parallel loops, as shown in the AST below. Polly does not seem to correctly generate the code that packs values into the struct passed to the nested polly_subfn.
In IslNodeBuilder::createForParallel, when code generation continues inside a nested subfunction, the values being packed into the struct still refer to the old values of the parent function, but they should refer to the new values of the current subfunction.
One fix is to replace, while generating this code, those old values that have a corresponding new value stored in IslNodeBuilder::ValueMap.
Is this a correct way to handle nested OpenMP parallel loops?
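The proposed remapping can be sketched outside of LLVM as follows. This is only an illustration under stated assumptions, not Polly's actual code: `Value`, `ValueMapT`, `remapForPacking`, `allBelongTo`, and `demoNestedPacking` are hypothetical stand-ins, and the `*.sub` value names are made up. Only `data`, `mean`, `n`, `simple_test`, and `polly_subfn` come from the report.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an llvm::Value: a name plus the function that owns it.
struct Value {
  std::string Name;
  std::string Parent; // name of the owning function
};

// Hypothetical analogue of IslNodeBuilder::ValueMap: maps a value of the
// parent function to its replacement in the current subfunction.
using ValueMapT = std::map<const Value *, const Value *>;

// Return the values to pack, replacing every value that has an entry in the
// map by its new counterpart. Values without an entry pass through unchanged.
std::vector<const Value *>
remapForPacking(const std::vector<const Value *> &ToPack,
                const ValueMapT &Map) {
  std::vector<const Value *> Result;
  Result.reserve(ToPack.size());
  for (const Value *V : ToPack) {
    auto It = Map.find(V);
    Result.push_back(It != Map.end() ? It->second : V);
  }
  return Result;
}

// Rough analogue of the verifier check: every value must live in Fn.
bool allBelongTo(const std::vector<const Value *> &Vals,
                 const std::string &Fn) {
  for (const Value *V : Vals)
    if (V->Parent != Fn)
      return false;
  return true;
}

// Models the situation from the report: the outer subfunction wants to pack
// data, mean and n for the nested subfunction, but still holds the values of
// simple_test. The *.sub names are invented for this sketch.
bool demoNestedPacking() {
  Value Data{"data", "simple_test"};
  Value Mean{"mean", "simple_test"};
  Value N{"n", "simple_test"};
  Value DataSub{"data.sub", "polly_subfn"};
  Value MeanSub{"mean.sub", "polly_subfn"};
  Value NSub{"n.sub", "polly_subfn"};

  ValueMapT Map{{&Data, &DataSub}, {&Mean, &MeanSub}, {&N, &NSub}};
  std::vector<const Value *> ToPack{&Data, &Mean, &N};

  // Without remapping, the packed values belong to another function.
  bool StaleBefore = !allBelongTo(ToPack, "polly_subfn");

  std::vector<const Value *> Remapped = remapForPacking(ToPack, Map);
  assert(Remapped.size() == ToPack.size());
  return StaleBefore && allBelongTo(Remapped, "polly_subfn");
}
```

In this model, storing the unmapped values corresponds to the "Referring to an argument in another function!" errors, while the remapped list contains only values owned by the subfunction that performs the stores.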
opt -S -basicaa -polly-process-unprofitable -polly-ignore-aliasing -polly-opt-isl -polly-vectorizer=polly --polly-parallel -polly-pattern-matching-based-opts=false --polly-tile-sizes=11,32,32 -polly-codegen -polly-codegen-verify=true -debug-only=polly-ast nested_par.ll
:: isl ast :: simple_test :: %entry.split---%for.end10
[p_0, p_1] -> { : -2147483648 <= p_0 <= 2147483647 and -2147483648 <= p_1 <= 2147483647 }
{ domain: "[p_0, p_1] -> { Stmt4[i0, i1] : 0 <= i0 < p_0 and 0 <= i1 < p_1 }", child: { mark: "1st level tiling - Tiles", child: { schedule: "[p_0, p_1] -> [{ Stmt4[i0, i1] -> [(floor((i0)/11))] }, { Stmt4[i0, i1] -> [(floor((i1)/32))] }]", permutable: 1, coincident: [ 1, 0 ], child: { mark: "1st level tiling - Points", child: { schedule: "[p_0, p_1] -> [{ Stmt4[i0, i1] -> [(-3*floor((i0)/11) + floor((3i0)/11))] }]", permutable: 1, coincident: [ 1 ], options: "[p_0, p_1] -> { isolate[[i0, i1] -> [i2]] : i0 >= 0 and i1 >= 0 and 32i1 < p_1 and 0 <= i2 <= 1 and 11i2 <= -11 + 3p_0 - 33i0; atomic[0] }", child: { schedule: "[p_0, p_1] -> [{ Stmt4[i0, i1] -> [(i1 - 32*floor((i1)/32))] }]", permutable: 1, child: { mark: "SIMD", child: { schedule: "[p_0, p_1] -> [{ Stmt4[i0, i1] -> [(i0 + floor((i0)/11) - 4*floor((3i0)/11))] }]", permutable: 1, coincident: [ 1 ] } } } } } } } }
if (p_0 <= 100)
  // 1st level tiling - Tiles
  #pragma omp parallel for
  for (int c0 = 0; c0 <= floord(p_0 - 1, 11); c0 += 1)
    for (int c1 = 0; c1 <= floord(p_1 - 1, 32); c1 += 1) {
      // 1st level tiling - Points
      {
        for (int c2 = 0; c2 <= min(1, -3 * c0 + 3 * p_0 / 11 - 1); c2 += 1)
          for (int c3 = 0; c3 <= min(31, p_1 - 32 * c1 - 1); c3 += 1) {
            // SIMD
            for (int c4 = 0; c4 <= 3; c4 += 1)
              Stmt4(11 * c0 + 4 * c2 + c4, 32 * c1 + c3);
          }
        if (p_0 >= 11 * c0 + 9 || 3 * p_0 % 11 >= 3)
          #pragma omp parallel for
          for (int c2 = 0; c2 <= min(2, -3 * c0 + (3 * p_0 - 3) / 11); c2 += 1)
            for (int c3 = 0; c3 <= min(31, p_1 - 32 * c1 - 1); c3 += 1) {
              // SIMD
              if (c2 <= 1 && 33 * c0 + 11 * c2 + 10 >= 3 * p_0) {
                for (int c4 = 0; c4 < p_0 - 11 * c0 - 4 * c2; c4 += 1)
                  Stmt4(11 * c0 + 4 * c2 + c4, 32 * c1 + c3);
              } else if (c2 == 2) {
                for (int c4 = 0; c4 <= min(2, p_0 - 11 * c0 - 9); c4 += 1)
                  Stmt4(11 * c0 + c4 + 8, 32 * c1 + c3);
              }
            }
      }
    }
else
  { /* original code */ }
Referring to an argument in another function!
store [100 x float]* %data, [100 x float]** %polly.subfn.storeaddr.data
Referring to an argument in another function!
store float* %mean, float** %polly.subfn.storeaddr.mean
Referring to an argument in another function!
store i32 %n, i32* %polly.subfn.storeaddr.n
Polly generated function could not be verified. Add -polly-codegen-verify=false to disable this assertion.
UNREACHABLE executed at /home/bmohan/data/llvm_git/tools/polly/lib/CodeGen/CodeGeneration.cpp:89!
#0 0x00000000023d9a84 (opt+0x23d9a84)
#1 0x00000000023d9b17 (opt+0x23d9b17)
#2 0x00000000023d8044 (opt+0x23d8044)
#3 0x00000000023d93fc (opt+0x23d93fc)
#4 0x00007fe2a9de9fe0 __restore_rt (/usr/lib/libpthread.so.0+0x11fe0)
#5 0x00007fe2a8926a10 __GI_raise (/usr/lib/libc.so.6+0x33a10)
#6 0x00007fe2a892813a __GI_abort (/usr/lib/libc.so.6+0x3513a)
#7 0x0000000002362b73 (opt+0x2362b73)
#8 0x000000000275b7f1 (opt+0x275b7f1)
#9 0x000000000275c210 (opt+0x275c210)
#10 0x0000000002751a49 (opt+0x2751a49)
#11 0x0000000001682df5 (opt+0x1682df5)
#12 0x0000000001cf7daf (opt+0x1cf7daf)
#13 0x0000000001cf7f26 (opt+0x1cf7f26)
#14 0x0000000001cf8273 (opt+0x1cf8273)
#15 0x0000000001cf8927 (opt+0x1cf8927)
#16 0x0000000001cf8b1f (opt+0x1cf8b1f)
#17 0x000000000106757a (opt+0x106757a)
#18 0x00007fe2a8913511 __libc_start_main (/usr/lib/libc.so.6+0x20511)
#19 0x000000000104b98a (opt+0x104b98a)
Stack dump:
0. Program arguments: opt -S -basicaa -polly-process-unprofitable -polly-ignore-aliasing -polly-opt-isl -polly-vectorizer=polly --polly-parallel -polly-codegen-verify=false -polly-pattern-matching-based-opts=false --polly-tile-sizes=11,32,32 -polly-codegen -polly-codegen-verify=true -debug-only=polly-ast /home/bmohan/Downloads/polly_tiling_test/nested_par.ll
1. Running pass 'Function Pass Manager' on module '/home/bmohan/Downloads/polly_tiling_test/nested_par.ll'.
2. Running pass 'Region Pass Manager' on function '@simple_test'
3. Running pass 'Polly - Create LLVM-IR from SCoPs' on basic block '%entry.split'