Skip to content

[InstCombine] Instruction sink doesn't see through sink opportunities from bb0 to bb2 in bb0->bb1->bb2 #88960

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
mingmingl-llvm opened this issue Apr 16, 2024 · 4 comments

Comments

@mingmingl-llvm
Copy link
Contributor

see https://gcc.godbolt.org/z/G7Paj7h37 for two test cases.

  • in function @inst_sink, the two instructions (load, icmp) in the entry block sink to the block where it is used.
  • in function @inst_not_sink, the two instructions stay in the entry block although they are used only in if.false.orig_indirect2
@mingmingl-llvm
Copy link
Contributor Author

@inst_not_sink itself is the output after some transformation. When interleaving the transformation with InstCombinerImpl::tryToSinkInstruction (in a draft change) and exposing the opportunity as one hop (bb0->bb1, bb1->bb2 but not bb0->bb1->bb2), the un-used load and icmp could sink out of entry block into where they are used.

@mingmingl-llvm
Copy link
Contributor Author

@nikic @goldsteinn @dtcxzyw @vfdff

To optimize the second function (@inst_not_sink) and sink {gep, icmp} to where they are used, I want to interleave the sink and other transformation.

InstCombinerImpl::tryToSinkInstruction handles the sink of instructions and debug info pretty well. May I get your thoughts on moving the core logic into a standalone util library for re-use in other optimizer passes?

@nikic
Copy link
Contributor

nikic commented Apr 18, 2024

@minglotus-6 I don't really understand what you're proposing, but the most likely answer to this is going to be "no" -- we have about reached (or even exceeded) what we can do with ad-hoc sinking, as opposed to a dedicated sinking pass.

@mingmingl-llvm
Copy link
Contributor Author

mingmingl-llvm commented Apr 18, 2024

@minglotus-6 I don't really understand what you're proposing, but the most likely answer to this is going to be "no" -- we have about reached (or even exceeded) what we can do with ad-hoc sinking, as opposed to a dedicated sinking pass.

@nikic Many thanks for the response!

I'd like to make InstCombinerImpl::trySinkInstruction and its two callee functions (tryToSinkInstructionDbgValues and tryToSinkInstructionDbgVariableRecords) a utility function. This way, other passes can re-use the utility to sink instructions, without forking off the core logic of InstCombinerImpl::trySinkInstruction (bail out if sink is not correct, or handle debug information sink properly).

My main question is, what do you think of extracting InstCombinerImpl::trySinkInstruction as a helper function? The goal is for other IR passes to re-use the logic when a certain IR pass knows the opportunity is there.

As an example use case, say pgo-call-prom pass [1] gets [2] as an intermediate output (without giving it to the next pass), it could call the utility function to get [3]. Then this pass wants proceed to further transform [3] into [4], call sink utility function again to get [5].

  • If I don't interleave the sink utility function with the transformations happen within a pass, pog-icall-prom pass will generate sub-optimal code, since it'd be too much work (e.g., compile time cost) for the other pass to analyze the IR to detect the sink opportunity.

[1] As you may well already know, it conditionally devirtualizes a function calls
[2]

# assume entry is a hot block
entry:
  %vtable = load ptr, ptr %d
  %vfn = getelementptr ptr, ptr %vtable, i64 1
  %2 = load ptr, ptr %vfn
  %3 = icmp eq ptr %vtable, %constant
  br i1 %3, label %bb1.hot, label %bb2, !prof !4

bb1.hot:                             
  %5 = tail call i32 @func(ptr %d)
  br label %if.end.icp

bb2:                          
  %call = tail call i32 %2(ptr %d)
  br label %bb3

bb3:                                      
  %8 = phi i32 [ %call, %bb2 ], [ %5, %bb1.hot ]
  ret i32 %8

[3]

entry:
  %vtable = load ptr, ptr %d
  %3 = icmp eq ptr %vtable, %constant
  br i1 %3, label %bb1.hot, label %bb2, !prof !4

bb1.hot:                             
  %5 = tail call i32 @func(ptr %d)
  br label %if.end.icp

bb2:                          
  %vfn = getelementptr ptr, ptr %vtable, i64 1
  %2 = load ptr, ptr %vfn
  %call = tail call i32 %2(ptr %d)
  br label %bb3

bb3:                                      
  %8 = phi i32 [ %call, %bb2 ], [ %5, %bb1.hot ]
  ret i32 %8

[4]

entry:
  %vtable = load ptr, ptr %d
  %3 = icmp eq ptr %vtable, %constant
  br i1 %3, label %bb1.hot, label %bb2, !prof !4

bb1.hot:                              
  %5 = tail call i32 @func(ptr %d)
  br label %bb6

bb2:    
  %vfn = getelementptr ptr, ptr %vtable, i64 1
  %2 = load ptr, ptr %vfn, align 8                       
  %6 = icmp eq ptr %vtable, %constant2
  br i1 %6, label %bb3, label %bb4

bb3:                            
  %7 = tail call i32 @another-func(ptr %d)
  br label %bb5

bb4.cold:                      
  %call = tail call i32 %2(ptr %d)
  br label %bb5

bb5:                                    
  %8 = phi i32 [ %call, %bb4.cold ], [ %7, %bb3 ]
  br label %bb6

bb6:                                      
  %9 = phi i32 [ %8, %bb5], [ %5, %bb1.hot ]
  ret i32 %9
}

[5]


entry:
  %vtable = load ptr, ptr %d
  %3 = icmp eq ptr %vtable, %constant
  br i1 %3, label %bb1, label %bb2, !prof !4

bb1.hot:                              
  %5 = tail call i32 @func(ptr %d)
  br label %bb6

bb2.hot:               
  %6 = icmp eq ptr %vtable, %constant2
  br i1 %6, label %bb3.hot, label %bb4.cold

bb3.hot:                            
  %7 = tail call i32 @another-func(ptr %d)
  br label %bb5

bb4.cold:        
   %vfn = getelementptr ptr, ptr %vtable, i64 1
  %2 = load ptr, ptr %vfn, align 8                          
  %call = tail call i32 %2(ptr %d)
  br label %bb5

bb5:                                    
  %8 = phi i32 [ %call, %bb4.cold ], [ %7, %bb3 ]
  br label %bb6

bb6:                                      
  %9 = phi i32 [ %8, %bb5], [ %5, %bb1.hot ]
  ret i32 %9
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants