-polly-opt-isl -polly-parallel crashes for certain tile sizes #32500

Closed
llvmbot opened this issue May 24, 2017 · 10 comments
Labels: bugzilla (issues migrated from bugzilla), polly

Comments

@llvmbot
Member

llvmbot commented May 24, 2017

Bugzilla Link: 33153
Resolution: FIXED
Resolved on: Nov 15, 2018 20:25
Version: unspecified
OS: Linux
Blocks: #38454
Attachments: nested_par.ll
Reporter: LLVM Bugzilla Contributor
CC: @Meinersbur, @tobiasgrosser, @tstellar
Fixed by commit(s): r343212, r347024

Extended Description

For certain tile sizes, -polly-opt-isl generates nested OpenMP parallel loops, as shown below. Polly does not seem to generate correct code for packing values into the struct that is passed to the nested polly_subfn.

In IslNodeBuilder::createForParallel, when code generation continues inside a nested subfunction, the values being packed into the struct still refer to the old values of the parent function; they should refer to the new values of the current subfunction.

One fix is to replace, while generating this code, each old value that has a corresponding new value stored in IslNodeBuilder::ValueMap.

Is this a correct way to handle nested OpenMP parallel loops?

opt -S -basicaa -polly-process-unprofitable -polly-ignore-aliasing -polly-opt-isl -polly-vectorizer=polly --polly-parallel -polly-pattern-matching-based-opts=false --polly-tile-sizes=11,32,32 -polly-codegen -polly-codegen-verify=true -debug-only=polly-ast nested_par.ll
:: isl ast :: simple_test :: %entry.split---%for.end10
[p_0, p_1] -> { : -2147483648 <= p_0 <= 2147483647 and -2147483648 <= p_1 <= 2147483647 }
{ domain: "[p_0, p_1] -> { Stmt4[i0, i1] : 0 <= i0 < p_0 and 0 <= i1 < p_1 }", child: { mark: "1st level tiling - Tiles", child: { schedule: "[p_0, p_1] -> [{ Stmt4[i0, i1] -> [(floor((i0)/11))] }, { Stmt4[i0, i1] -> [(floor((i1)/32))] }]", permutable: 1, coincident: [ 1, 0 ], child: { mark: "1st level tiling - Points", child: { schedule: "[p_0, p_1] -> [{ Stmt4[i0, i1] -> [(-3*floor((i0)/11) + floor((3i0)/11))] }]", permutable: 1, coincident: [ 1 ], options: "[p_0, p_1] -> { isolate[[i0, i1] -> [i2]] : i0 >= 0 and i1 >= 0 and 32i1 < p_1 and 0 <= i2 <= 1 and 11i2 <= -11 + 3p_0 - 33i0; atomic[0] }", child: { schedule: "[p_0, p_1] -> [{ Stmt4[i0, i1] -> [(i1 - 32*floor((i1)/32))] }]", permutable: 1, child: { mark: "SIMD", child: { schedule: "[p_0, p_1] -> [{ Stmt4[i0, i1] -> [(i0 + floor((i0)/11) - 4*floor((3i0)/11))] }]", permutable: 1, coincident: [ 1 ] } } } } } } } }
if (p_0 <= 100)

// 1st level tiling - Tiles
#pragma omp parallel for
for (int c0 = 0; c0 <= floord(p_0 - 1, 11); c0 += 1)
  for (int c1 = 0; c1 <= floord(p_1 - 1, 32); c1 += 1) {
    // 1st level tiling - Points
    {
      for (int c2 = 0; c2 <= min(1, -3 * c0 + 3 * p_0 / 11 - 1); c2 += 1)
        for (int c3 = 0; c3 <= min(31, p_1 - 32 * c1 - 1); c3 += 1) {
          // SIMD
          for (int c4 = 0; c4 <= 3; c4 += 1)
            Stmt4(11 * c0 + 4 * c2 + c4, 32 * c1 + c3);
        }
      if (p_0 >= 11 * c0 + 9 || 3 * p_0 % 11 >= 3)
        #pragma omp parallel for
        for (int c2 = 0; c2 <= min(2, -3 * c0 + (3 * p_0 - 3) / 11); c2 += 1)
          for (int c3 = 0; c3 <= min(31, p_1 - 32 * c1 - 1); c3 += 1) {
            // SIMD
            if (c2 <= 1 && 33 * c0 + 11 * c2 + 10 >= 3 * p_0) {
              for (int c4 = 0; c4 < p_0 - 11 * c0 - 4 * c2; c4 += 1)
                Stmt4(11 * c0 + 4 * c2 + c4, 32 * c1 + c3);
            } else if (c2 == 2) {
              for (int c4 = 0; c4 <= min(2, p_0 - 11 * c0 - 9); c4 += 1)
                Stmt4(11 * c0 + c4 + 8, 32 * c1 + c3);
            }
          }
    }
  }

else
{ /* original code */ }

Referring to an argument in another function!
store [100 x float]* %data, [100 x float]** %polly.subfn.storeaddr.data
Referring to an argument in another function!
store float* %mean, float** %polly.subfn.storeaddr.mean
Referring to an argument in another function!
store i32 %n, i32* %polly.subfn.storeaddr.n
Polly generated function could not be verified. Add -polly-codegen-verify=false to disable this assertion.
UNREACHABLE executed at /home/bmohan/data/llvm_git/tools/polly/lib/CodeGen/CodeGeneration.cpp:89!
#​0 0x00000000023d9a84 (opt+0x23d9a84)
#​1 0x00000000023d9b17 (opt+0x23d9b17)
#​2 0x00000000023d8044 (opt+0x23d8044)
#​3 0x00000000023d93fc (opt+0x23d93fc)
#​4 0x00007fe2a9de9fe0 __restore_rt (/usr/lib/libpthread.so.0+0x11fe0)
#​5 0x00007fe2a8926a10 __GI_raise (/usr/lib/libc.so.6+0x33a10)
#​6 0x00007fe2a892813a __GI_abort (/usr/lib/libc.so.6+0x3513a)
#​7 0x0000000002362b73 (opt+0x2362b73)
#​8 0x000000000275b7f1 (opt+0x275b7f1)
#​9 0x000000000275c210 (opt+0x275c210)
#​10 0x0000000002751a49 (opt+0x2751a49)
#​11 0x0000000001682df5 (opt+0x1682df5)
#​12 0x0000000001cf7daf (opt+0x1cf7daf)
#​13 0x0000000001cf7f26 (opt+0x1cf7f26)
#​14 0x0000000001cf8273 (opt+0x1cf8273)
#​15 0x0000000001cf8927 (opt+0x1cf8927)
#​16 0x0000000001cf8b1f (opt+0x1cf8b1f)
#​17 0x000000000106757a (opt+0x106757a)
#​18 0x00007fe2a8913511 __libc_start_main (/usr/lib/libc.so.6+0x20511)
#​19 0x000000000104b98a (opt+0x104b98a)
Stack dump:
0. Program arguments: opt -S -basicaa -polly-process-unprofitable -polly-ignore-aliasing -polly-opt-isl -polly-vectorizer=polly --polly-parallel -polly-codegen-verify=false -polly-pattern-matching-based-opts=false --polly-tile-sizes=11,32,32 -polly-codegen -polly-codegen-verify=true -debug-only=polly-ast /home/bmohan/Downloads/polly_tiling_test/nested_par.ll
1. Running pass 'Function Pass Manager' on module '/home/bmohan/Downloads/polly_tiling_test/nested_par.ll'.
2. Running pass 'Region Pass Manager' on function '@simple_test'
3. Running pass 'Polly - Create LLVM-IR from SCoPs' on basic block '%entry.split'

@Meinersbur
Member

You are right, some value is not being added to the subfunction's argument list. In this case it is %data, which is a parent function's argument and is used as a base pointer.

Here we have a parallel loop within a parallel loop, which does not appear to be properly supported. When the middle subfunction generates the arguments for the innermost subfunction, it takes the values from the original function instead of its own copies.

@llvmbot
Member Author

llvmbot commented May 24, 2017

Hi Michael,

I have submitted the following patch which addresses this issue. Could you have a look at it to check if that suffices?

https://reviews.llvm.org/D33523

@tobiasgrosser
Contributor

I am surprised that we generate nested parallelism at all. At least, this was not the original intention.

@Meinersbur
Member

*** Bug llvm/llvm-bugzilla-archive#38073 has been marked as a duplicate of this bug. ***

@Meinersbur
Member

Fixed in r343212.

@tobiasgrosser
Contributor

Thank you Michael!

@tstellar
Collaborator

Tobias, is this OK to merge to the release_70 branch?

@tobiasgrosser
Contributor

This should be fine, yes.

@Meinersbur
Member

mentioned in issue llvm/llvm-bugzilla-archive#38073

@tstellar
Collaborator

mentioned in issue #38454

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
This issue was closed.