Commit 890236b

[AMDGPU] Push amdgpu-preload-kern-arg-prolog after livedebugvalues
This is effectively a workaround for a bug in livedebugvalues, but it also seems to be a general improvement, as BB sections seem like they could ruin the special 256-byte prelude scheme that amdgpu-preload-kern-arg-prolog requires anyway. Moving the pass even later doesn't seem to have any material impact, and it adds livedebugvalues to the list of passes which no longer have to deal with pseudo multiple-entry functions.

AMDGPU debug-info isn't supported upstream yet, so the bug being avoided isn't testable here. I am posting the patch upstream to avoid an unnecessary diff with AMD's fork.
1 parent 07f4086 commit 890236b

2 files changed

+11
-5
lines changed

llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp

Lines changed: 6 additions & 0 deletions
@@ -1147,6 +1147,7 @@ class GCNPassConfig final : public AMDGPUPassConfig {
   void addPostRegAlloc() override;
   void addPreSched2() override;
   void addPreEmitPass() override;
+  void addPostBBSections() override;
 };
 
 } // end anonymous namespace
@@ -1686,6 +1687,11 @@ void GCNPassConfig::addPreEmitPass() {
     addPass(&AMDGPUInsertDelayAluID);
 
   addPass(&BranchRelaxationPassID);
+}
+
+void GCNPassConfig::addPostBBSections() {
+  // We run this later to avoid passes like livedebugvalues and BBSections
+  // having to deal with the apparent multi-entry functions we may generate.
   addPass(createAMDGPUPreloadKernArgPrologLegacyPass());
 }
 

llvm/test/CodeGen/AMDGPU/llc-pipeline.ll

Lines changed: 5 additions & 5 deletions
@@ -145,11 +145,11 @@
 ; GCN-O0-NEXT: Post RA hazard recognizer
 ; GCN-O0-NEXT: AMDGPU Insert waits for SGPR read hazards
 ; GCN-O0-NEXT: Branch relaxation pass
-; GCN-O0-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O0-NEXT: Register Usage Information Collector Pass
 ; GCN-O0-NEXT: Remove Loads Into Fake Uses
 ; GCN-O0-NEXT: Live DEBUG_VALUE analysis
 ; GCN-O0-NEXT: Machine Sanitizer Binary Metadata
+; GCN-O0-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O0-NEXT: Lazy Machine Block Frequency Analysis
 ; GCN-O0-NEXT: Machine Optimization Remark Emitter
 ; GCN-O0-NEXT: Stack Frame Layout Analysis
@@ -430,11 +430,11 @@
 ; GCN-O1-NEXT: AMDGPU Insert waits for SGPR read hazards
 ; GCN-O1-NEXT: AMDGPU Insert Delay ALU
 ; GCN-O1-NEXT: Branch relaxation pass
-; GCN-O1-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O1-NEXT: Register Usage Information Collector Pass
 ; GCN-O1-NEXT: Remove Loads Into Fake Uses
 ; GCN-O1-NEXT: Live DEBUG_VALUE analysis
 ; GCN-O1-NEXT: Machine Sanitizer Binary Metadata
+; GCN-O1-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O1-NEXT: Lazy Machine Block Frequency Analysis
 ; GCN-O1-NEXT: Machine Optimization Remark Emitter
 ; GCN-O1-NEXT: Stack Frame Layout Analysis
@@ -743,11 +743,11 @@
 ; GCN-O1-OPTS-NEXT: AMDGPU Insert waits for SGPR read hazards
 ; GCN-O1-OPTS-NEXT: AMDGPU Insert Delay ALU
 ; GCN-O1-OPTS-NEXT: Branch relaxation pass
-; GCN-O1-OPTS-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O1-OPTS-NEXT: Register Usage Information Collector Pass
 ; GCN-O1-OPTS-NEXT: Remove Loads Into Fake Uses
 ; GCN-O1-OPTS-NEXT: Live DEBUG_VALUE analysis
 ; GCN-O1-OPTS-NEXT: Machine Sanitizer Binary Metadata
+; GCN-O1-OPTS-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O1-OPTS-NEXT: Lazy Machine Block Frequency Analysis
 ; GCN-O1-OPTS-NEXT: Machine Optimization Remark Emitter
 ; GCN-O1-OPTS-NEXT: Stack Frame Layout Analysis
@@ -1062,11 +1062,11 @@
 ; GCN-O2-NEXT: AMDGPU Insert waits for SGPR read hazards
 ; GCN-O2-NEXT: AMDGPU Insert Delay ALU
 ; GCN-O2-NEXT: Branch relaxation pass
-; GCN-O2-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O2-NEXT: Register Usage Information Collector Pass
 ; GCN-O2-NEXT: Remove Loads Into Fake Uses
 ; GCN-O2-NEXT: Live DEBUG_VALUE analysis
 ; GCN-O2-NEXT: Machine Sanitizer Binary Metadata
+; GCN-O2-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O2-NEXT: Lazy Machine Block Frequency Analysis
 ; GCN-O2-NEXT: Machine Optimization Remark Emitter
 ; GCN-O2-NEXT: Stack Frame Layout Analysis
@@ -1394,11 +1394,11 @@
 ; GCN-O3-NEXT: AMDGPU Insert waits for SGPR read hazards
 ; GCN-O3-NEXT: AMDGPU Insert Delay ALU
 ; GCN-O3-NEXT: Branch relaxation pass
-; GCN-O3-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O3-NEXT: Register Usage Information Collector Pass
 ; GCN-O3-NEXT: Remove Loads Into Fake Uses
 ; GCN-O3-NEXT: Live DEBUG_VALUE analysis
 ; GCN-O3-NEXT: Machine Sanitizer Binary Metadata
+; GCN-O3-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O3-NEXT: Lazy Machine Block Frequency Analysis
 ; GCN-O3-NEXT: Machine Optimization Remark Emitter
 ; GCN-O3-NEXT: Stack Frame Layout Analysis
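
The effect of moving the pass from one hook to another can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyPassConfig`, `ToyGCNBefore`, `ToyGCNAfter`, and `indexOf` are hypothetical names invented here, passes are plain strings, and the driver is not LLVM's real `TargetPassConfig`. The only thing taken from the commit is the relative order of pass names recorded in the llc-pipeline.ll expectations.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Toy stand-in for a pass-config class (hypothetical, NOT LLVM's API).
struct ToyPassConfig {
  std::vector<std::string> Pipeline;
  virtual ~ToyPassConfig() = default;

  void addPass(const std::string &Name) { Pipeline.push_back(Name); }

  // Target hooks; a target overrides these to add its own passes.
  virtual void addPreEmitPass() {}
  virtual void addPostBBSections() {}

  // Generic driver: fixed passes interleaved with the two hooks, in the
  // relative order that the llc-pipeline.ll expectations show.
  void addMachinePasses() {
    addPreEmitPass();
    addPass("Register Usage Information Collector Pass");
    addPass("Remove Loads Into Fake Uses");
    addPass("Live DEBUG_VALUE analysis");
    addPass("Machine Sanitizer Binary Metadata");
    addPostBBSections();
    addPass("Lazy Machine Block Frequency Analysis");
  }
};

// Before the commit: the prolog pass is added at the end of addPreEmitPass.
struct ToyGCNBefore : ToyPassConfig {
  void addPreEmitPass() override {
    addPass("Branch relaxation pass");
    addPass("AMDGPU Preload Kernel Arguments Prolog");
  }
};

// After the commit: the prolog pass is added from addPostBBSections instead.
struct ToyGCNAfter : ToyPassConfig {
  void addPreEmitPass() override { addPass("Branch relaxation pass"); }
  void addPostBBSections() override {
    addPass("AMDGPU Preload Kernel Arguments Prolog");
  }
};

// Position of a pass in the pipeline, or -1 if it is absent.
int indexOf(const std::vector<std::string> &Pipeline, const std::string &Name) {
  auto It = std::find(Pipeline.begin(), Pipeline.end(), Name);
  return It == Pipeline.end()
             ? -1
             : static_cast<int>(std::distance(Pipeline.begin(), It));
}
```

Calling `addMachinePasses()` on each toy config and comparing `indexOf` for the prolog pass against `"Live DEBUG_VALUE analysis"` shows the prolog moving from before that pass to after it, while the pipeline keeps the same length.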
