-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[DWARF] Emit a worst-case prologue_end flag for pathological inputs #107849
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
prologue_end usually indicates where the end of the function-initialization lies, and is where debuggers usually choose to put the initial breakpoint for a function. Our current algorithm piggy-backs it on the first available source-location: which doesn't necessarily have anything to do with the start of the function. To avoid this in heavily-optimised code that lacks many useful source locations, pick a worst-case "if all else fails" prologue_end location, of the first instruction that appears to do meaningful computation. It'll be given the function-scope line number, which should run-on from the start of the function anyway. This means if your code is completely inverted by the optimiser, you can at least put a breakpoint at the _start_ like you expect, even if it's difficult to then step through.
@llvm/pr-subscribers-backend-x86 @llvm/pr-subscribers-debuginfo Author: Jeremy Morse (jmorse) ChangesHi, (Maybe this should be in an RFC if it's properly contentious,) Problem: sometimes the entry blocks of functions contain no source-locations at all. Currently, prologue_end is attached to the first source-location found when iterating over the function, which don't have to be in the entry block or anywhere near it. It can even be the return instruction. IMO: this isn't beneficial and can be misleading, the spec says it's "where execution should be suspended for a breakpoint at the entry of a function". If the prologue_end point is past meaningful computation, or past the entry block, the developer has probably missed something useful. These situations do pop up occasionally: I've got a few game code samples that have been heavily LTO'd, and a non-trivial number of functions end up with a) a lot of information loss after inlining and de-duplication, and b) a prologue_end set after various memory operations and conditional branches. Having to debug LTO'd binaries is already a trial, making it harder to set effective breakpoints doesn't help the developer. Thus, I'd like to merge this patch which, if there aren't any source locations in the entry block, picks the first non-trivial instruction to apply the scope-line and prologue_end to as a "worst case" or "backup" entry point. See the added dbg-prolog-end-backup-loc.ll for an illustrated example. However, nothing is ever that simple and I've found myself in the tar pit here, with numerous existing behaviours to support:
In terms of test effects:
A test that can't be fixed right now is CodeGen/Thumb2/pr52817.ll, where we put prologue_end half way through substantive computation. This is because it looks a lot like an unoptimized piece of code where prologue_end should live on the first source-location after falling-out of the entry block. I fed this to the GDB test suite; there weren't any further regressions past the ~121 test failures I had before this patch. Patch is 21.76 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/107849.diff 8 Files Affected:
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index 148b620c2b62b7..6830fd41b8ccce 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -2061,6 +2061,16 @@ void DwarfDebug::beginInstruction(const MachineInstr *MI) {
unsigned LastAsmLine =
Asm->OutStreamer->getContext().getCurrentDwarfLoc().getLine();
+ if (!DL && MI == PrologEndLoc) {
+ // In rare situations, we might want to place the end of the prologue
+ // somewhere that doesn't have a source location already. It should be in
+ // the entry block.
+ assert(MI->getParent() == &*MI->getMF()->begin());
+ recordSourceLine(SP->getScopeLine(), 0, SP,
+ DWARF2_FLAG_PROLOGUE_END | DWARF2_FLAG_IS_STMT);
+ return;
+ }
+
bool PrevInstInDiffBB = PrevInstBB && PrevInstBB != MI->getParent();
if (DL == PrevInstLoc && !PrevInstInDiffBB) {
// If we have an ongoing unspecified location, nothing to do here.
@@ -2131,36 +2141,121 @@ void DwarfDebug::beginInstruction(const MachineInstr *MI) {
PrevInstLoc = DL;
}
-static std::pair<const MachineInstr *, bool>
+std::pair<const MachineInstr *, bool>
findPrologueEndLoc(const MachineFunction *MF) {
// First known non-DBG_VALUE and non-frame setup location marks
// the beginning of the function body.
- const MachineInstr *LineZeroLoc = nullptr;
+ const auto &TII = *MF->getSubtarget().getInstrInfo();
+ const MachineInstr *LineZeroLoc = nullptr, *NonTrivialInst = nullptr;
const Function &F = MF->getFunction();
// Some instructions may be inserted into prologue after this function. Must
// keep prologue for these cases.
bool IsEmptyPrologue =
!(F.hasPrologueData() || F.getMetadata(LLVMContext::MD_func_sanitize));
- for (const auto &MBB : *MF) {
- for (const auto &MI : MBB) {
- if (!MI.isMetaInstruction()) {
- if (!MI.getFlag(MachineInstr::FrameSetup) && MI.getDebugLoc()) {
- // Scan forward to try to find a non-zero line number. The
- // prologue_end marks the first breakpoint in the function after the
- // frame setup, and a compiler-generated line 0 location is not a
- // meaningful breakpoint. If none is found, return the first
- // location after the frame setup.
- if (MI.getDebugLoc().getLine())
- return std::make_pair(&MI, IsEmptyPrologue);
-
- LineZeroLoc = &MI;
- }
- IsEmptyPrologue = false;
- }
+
+ // Helper lambda to examine each instruction and potentially return it
+ // as the prologue_end point.
+ auto ExamineInst = [&](const MachineInstr &MI)
+ -> std::optional<std::pair<const MachineInstr *, bool>> {
+ // Is this instruction trivial data shuffling or frame-setup?
+ bool isCopy = (TII.isCopyInstr(MI) ? true : false);
+ bool isTrivRemat = TII.isTriviallyReMaterializable(MI);
+ bool isFrameSetup = MI.getFlag(MachineInstr::FrameSetup);
+
+ if (!isFrameSetup && MI.getDebugLoc()) {
+ // Scan forward to try to find a non-zero line number. The
+ // prologue_end marks the first breakpoint in the function after the
+ // frame setup, and a compiler-generated line 0 location is not a
+ // meaningful breakpoint. If none is found, return the first
+ // location after the frame setup.
+ if (MI.getDebugLoc().getLine())
+ return std::make_pair(&MI, IsEmptyPrologue);
+
+ LineZeroLoc = &MI;
+ }
+
+ // Keep track of the first "non-trivial" instruction seen, i.e. anything
+ // that doesn't involve shuffling data around or is a frame-setup.
+ if (!isCopy && !isTrivRemat && !isFrameSetup && !NonTrivialInst)
+ NonTrivialInst = &MI;
+
+ IsEmptyPrologue = false;
+ return std::nullopt;
+ };
+
+ // Examine all the instructions at the start of the function. This doesn't
+ // necessarily mean just the entry block: unoptimised code can fall-through
+ // into an initial loop, and it makes sense to put the initial breakpoint on
+ // the first instruction of such a loop. However, if we pass branches, we're
+ // better off synthesising an early prologue_end.
+ auto CurBlock = MF->begin();
+ auto CurInst = CurBlock->begin();
+ while (true) {
+ // Skip empty blocks, in rare cases the entry can be empty.
+ if (CurInst == CurBlock->end()) {
+ ++CurBlock;
+ CurInst = CurBlock->begin();
+ continue;
+ }
+
+ // Check whether this non-meta instruction a good position for prologue_end.
+ if (!CurInst->isMetaInstruction()) {
+ auto FoundInst = ExamineInst(*CurInst);
+ if (FoundInst)
+ return *FoundInst;
+ }
+
+ // Try to continue searching, but use a backup-location if substantive
+ // computation is happening.
+ auto NextInst = std::next(CurInst);
+ if (NextInst == CurInst->getParent()->end()) {
+ // We've reached the end of the block. Did we just look at a terminator?
+ if (CurInst->isTerminator())
+ // Some kind of "real" control flow is occurring. At the very least
+ // we would have to start exploring the CFG, a good signal that the
+ // prologue is over.
+ break;
+
+ // If we've already fallen through into a loop, don't fall through
+ // further, use a backup-location.
+ if (CurBlock->pred_size() > 1)
+ break;
+
+ // Fall-through from entry to the next block. This is common at -O0 when
+ // there's no initialisation in the function.
+ auto NextBBIter = std::next(CurInst->getParent()->getIterator());
+ // Bail if we're also at the end of the function.
+ if (NextBBIter == MF->end())
+ break;
+ CurBlock = NextBBIter;
+ CurInst = NextBBIter->begin();
+ } else {
+ // Continue examining the current block.
+ CurInst = NextInst;
}
}
- return std::make_pair(LineZeroLoc, IsEmptyPrologue);
+
+ // We didn't find a "good" prologue_end, so consider backup locations.
+ // Was there an empty-line location? Return that, and it'll have the
+ // scope-line "flow" into it when it becomes the prologue_end position.
+ if (LineZeroLoc)
+ return std::make_pair(LineZeroLoc, IsEmptyPrologue);
+
+ // We couldn't find any source-location, suggesting all meaningful information
+ // got optimised away. Set the prologue_end to be the first non-trivial
+ // instruction, which will get the scope line number. This is better than
+ // nothing.
+ // Only do this in the entry block, as we'll be giving it the scope line for
+ // the function. Return IsEmptyPrologue==true if we've picked the first
+ // instruction.
+ if (NonTrivialInst && NonTrivialInst->getParent() == &*MF->begin()) {
+ IsEmptyPrologue = NonTrivialInst == &*MF->begin()->begin();
+ return std::make_pair(NonTrivialInst, IsEmptyPrologue);
+ }
+
+ // If the entry path is empty, just don't have a prologue_end at all.
+ return std::make_pair(nullptr, IsEmptyPrologue);
}
/// Register a source line with debug info. Returns the unique label that was
diff --git a/llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll b/llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll
index 19253d67c14945..63640747d2aa2c 100644
--- a/llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll
+++ b/llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll
@@ -203,9 +203,9 @@ declare void @llvm.dbg.value(metadata, metadata, metadata)
; locations in the function.
define double @foo1_g(float %p1, double %p2, double %p3) nounwind !dbg !4 {
; CHECK-LABEL: foo1_g:
-; CHECK: .file 1 "." "test.c"
-; CHECK-NEXT: .loc 1 3 0
; CHECK: # %bb.0: # %entry
+; CHECK-NEXT: .file 1 "." "test.c"
+; CHECK-NEXT: .loc 1 3 0 prologue_end
; CHECK-NEXT: xorps %xmm3, %xmm3
; CHECK-NEXT: ucomiss %xmm3, %xmm0
; CHECK-NEXT: movsd {{.*#+}} xmm0 = [1.25E+0,0.0E+0]
diff --git a/llvm/test/DebugInfo/MIR/X86/dbg-prologue-backup-loc2.mir b/llvm/test/DebugInfo/MIR/X86/dbg-prologue-backup-loc2.mir
new file mode 100644
index 00000000000000..378d579cee62e1
--- /dev/null
+++ b/llvm/test/DebugInfo/MIR/X86/dbg-prologue-backup-loc2.mir
@@ -0,0 +1,134 @@
+# RUN: llc %s -start-before=livedebugvalues -o - | \
+# RUN: FileCheck %s --implicit-check-not=prologue_end
+#
+## When picking a "backup" location of the first non-trivial instruction in
+## a function, don't select a location outside of the entry block. We have to
+## give it the functions scope-line, and installing that outside of the entry
+## block is liable to be misleading.
+##
+## Produced from the C below with "clang -O2 -g -mllvm
+## -stop-before=livedebugvalues", then modified to unrotate and shift early
+## insts into the loop block. This means the MIR is meaningless, we only test
+## whether the scope-line will leak into the loop block or not.
+##
+## int glob = 0;
+## int foo(int arg, int sum) {
+## arg += sum;
+## while (arg) {
+## glob--;
+## arg %= glob;
+## }
+## return 0;
+## }
+#
+# CHECK-LABEL: foo:
+# CHECK: .loc 0 2 0
+# CHECK: # %bb.0:
+# CHECK-NEXT: movl %edi, %edx
+# CHECK-NEXT: .loc 0 0 0 is_stmt 0
+# CHECK-NEXT: .Ltmp0:
+# CHECK-NEXT: .p2align 4, 0x90
+# CHECK-NEXT: .LBB0_1:
+# CHECK-LABEL: addl %esi, %edx
+
+
+--- |
+ ; ModuleID = 'out2.ll'
+ source_filename = "foo.c"
+ target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+ target triple = "x86_64-unknown-linux-gnu"
+
+ @glob = dso_local local_unnamed_addr global i32 0, align 4, !dbg !0
+
+ define dso_local noundef i32 @foo(i32 noundef %arg, i32 noundef %sum) local_unnamed_addr !dbg !9 {
+ entry:
+ %add = add nsw i32 %sum, %arg
+ br label %while.body.preheader
+
+ while.body.preheader: ; preds = %entry
+ %glob.promoted = load i32, ptr @glob, align 4
+ br label %while.body, !dbg !13
+
+ while.body: ; preds = %while.body, %while.body.preheader
+ %arg.addr.06 = phi i32 [ %rem, %while.body ], [ %add, %while.body.preheader ]
+ %dec35 = phi i32 [ %dec, %while.body ], [ %glob.promoted, %while.body.preheader ]
+ %dec = add nsw i32 %dec35, -1, !dbg !14
+ %0 = add i32 %dec35, -1, !dbg !16
+ %rem = srem i32 %arg.addr.06, %0, !dbg !16
+ %tobool.not = icmp eq i32 %rem, 0, !dbg !13
+ br i1 %tobool.not, label %while.cond.while.end_crit_edge, label %while.body, !dbg !13
+
+ while.cond.while.end_crit_edge: ; preds = %while.body
+ store i32 %dec, ptr @glob, align 4, !dbg !14
+ br label %while.end, !dbg !13
+
+ while.end: ; preds = %while.cond.while.end_crit_edge
+ ret i32 0, !dbg !17
+ }
+
+ !llvm.dbg.cu = !{!2}
+ !llvm.module.flags = !{!6, !7}
+ !llvm.ident = !{!8}
+
+ !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
+ !1 = distinct !DIGlobalVariable(name: "glob", scope: !2, file: !3, line: 1, type: !5, isLocal: false, isDefinition: true)
+ !2 = distinct !DICompileUnit(language: DW_LANG_C11, file: !3, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, globals: !4, splitDebugInlining: false, nameTableKind: None)
+ !3 = !DIFile(filename: "foo.c", directory: "")
+ !4 = !{!0}
+ !5 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+ !6 = !{i32 7, !"Dwarf Version", i32 5}
+ !7 = !{i32 2, !"Debug Info Version", i32 3}
+ !8 = !{!"clang"}
+ !9 = distinct !DISubprogram(name: "foo", scope: !3, file: !3, line: 2, type: !10, scopeLine: 2, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !12)
+ !10 = !DISubroutineType(types: !11)
+ !11 = !{!5, !5, !5}
+ !12 = !{}
+ !13 = !DILocation(line: 4, column: 3, scope: !9)
+ !14 = !DILocation(line: 5, column: 9, scope: !15)
+ !15 = distinct !DILexicalBlock(scope: !9, file: !3, line: 4, column: 15)
+ !16 = !DILocation(line: 6, column: 9, scope: !15)
+ !17 = !DILocation(line: 8, column: 3, scope: !9)
+
+...
+---
+name: foo
+alignment: 16
+tracksRegLiveness: true
+debugInstrRef: true
+tracksDebugUserValues: true
+liveins:
+ - { reg: '$edi' }
+ - { reg: '$esi' }
+frameInfo:
+ maxAlignment: 1
+ maxCallFrameSize: 0
+ isCalleeSavedInfoValid: true
+machineFunctionInfo:
+ amxProgModel: None
+body: |
+ bb.0.entry:
+ liveins: $edi, $esi
+
+ $edx = MOV32rr $edi
+
+ bb.1.while.body (align 16):
+ successors: %bb.2(0x04000000), %bb.1(0x7c000000)
+ liveins: $ecx, $edx, $esi
+
+ renamable $edx = nsw ADD32rr killed renamable $edx, renamable $esi, implicit-def dead $eflags
+ renamable $ecx = MOV32rm $rip, 1, $noreg, @glob, $noreg :: (dereferenceable load (s32) from @glob)
+ renamable $ecx = DEC32r killed renamable $ecx, implicit-def dead $eflags
+ $eax = MOV32rr killed $edx
+ CDQ implicit-def $eax, implicit-def $edx, implicit $eax
+ IDIV32r renamable $ecx, implicit-def dead $eax, implicit-def $edx, implicit-def dead $eflags, implicit $eax, implicit $edx
+ TEST32rr renamable $edx, renamable $edx, implicit-def $eflags
+ JCC_1 %bb.1, 5, implicit killed $eflags
+
+ bb.2.while.cond.while.end_crit_edge:
+ liveins: $ecx, $esi
+
+ MOV32mr $rip, 1, $noreg, @glob, $noreg, killed renamable $ecx, debug-location !14 :: (store (s32) into @glob)
+ $eax = XOR32rr undef $eax, undef $eax, implicit-def dead $eflags, debug-location !17
+ RET64 $eax, debug-location !17
+
+...
diff --git a/llvm/test/DebugInfo/MIR/X86/empty-inline.mir b/llvm/test/DebugInfo/MIR/X86/empty-inline.mir
index 58775e8cd852fb..c489339b4c8aa6 100644
--- a/llvm/test/DebugInfo/MIR/X86/empty-inline.mir
+++ b/llvm/test/DebugInfo/MIR/X86/empty-inline.mir
@@ -13,11 +13,12 @@
#
# CHECK: Address Line Column File ISA Discriminator OpIndex Flags
# CHECK-NEXT: ---
-# CHECK-NEXT: 25 0 1 0 0 0 is_stmt
+# CHECK-NEXT: 25 0 1 0 0 0 is_stmt prologue_end
# CHECK-NEXT: 0 0 1 0 0 0
-# CHECK-NEXT: 29 28 1 0 0 0 is_stmt prologue_end
+# CHECK-NEXT: 29 28 1 0 0 0 is_stmt
# CHECK-NEXT: 29 28 1 0 0 0 is_stmt
# CHECK-NEXT: 29 28 1 0 0 0 is_stmt end_sequence
+
--- |
source_filename = "t.ll"
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
diff --git a/llvm/test/DebugInfo/X86/dbg-prolog-end-backup-loc.ll b/llvm/test/DebugInfo/X86/dbg-prolog-end-backup-loc.ll
new file mode 100644
index 00000000000000..2425df608275c5
--- /dev/null
+++ b/llvm/test/DebugInfo/X86/dbg-prolog-end-backup-loc.ll
@@ -0,0 +1,86 @@
+; RUN: llc %s -o - | FileCheck %s
+
+;; This test has had source-locations removed from the prologue, to simulate
+;; heavily-optimised scenarios where a lot of debug-info gets dropped. Check
+;; that we can pick a "worst-case" prologue_end position, of the first
+;; instruction that does any meaningful computation (the add). It's better to
+;; put the prologue_end flag here rather than deeper into the loop, past the
+;; early-exit check.
+;;
+;; Generated from this code at -O2 -g in clang, with source locations then
+;; deleted.
+;;
+;; int glob = 0;
+;; int foo(int arg, int sum) {
+;; arg += sum;
+;; while (arg) {
+;; glob--;
+;; arg %= glob;
+;; }
+;; return 0;
+;; }
+
+; CHECK-LABEL: foo:
+;; Scope-line location:
+; CHECK: .loc 0 2 0
+;; Entry block:
+; CHECK: movl %edi, %edx
+; CHECK-NEXT: .loc 0 2 0 prologue_end
+; CHECK-NEXT: addl %esi, %edx
+; CHECK-NEXT: je .LBB0_4
+; CHECK-LABEL: # %bb.1:
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+@glob = dso_local local_unnamed_addr global i32 0, align 4, !dbg !0
+
+define dso_local noundef i32 @foo(i32 noundef %arg, i32 noundef %sum) local_unnamed_addr !dbg !9 {
+entry:
+ %add = add nsw i32 %sum, %arg
+ %tobool.not4 = icmp eq i32 %add, 0
+ br i1 %tobool.not4, label %while.end, label %while.body.preheader
+
+while.body.preheader:
+ %glob.promoted = load i32, ptr @glob, align 4
+ br label %while.body, !dbg !14
+
+while.body:
+ %arg.addr.06 = phi i32 [ %rem, %while.body ], [ %add, %while.body.preheader ]
+ %dec35 = phi i32 [ %dec, %while.body ], [ %glob.promoted, %while.body.preheader ]
+ %dec = add nsw i32 %dec35, -1, !dbg !15
+ %rem = srem i32 %arg.addr.06, %dec, !dbg !17
+ %tobool.not = icmp eq i32 %rem, 0, !dbg !14
+ br i1 %tobool.not, label %while.cond.while.end_crit_edge, label %while.body, !dbg !14
+
+while.cond.while.end_crit_edge:
+ store i32 %dec, ptr @glob, align 4, !dbg !15
+ br label %while.end, !dbg !14
+
+while.end:
+ ret i32 0, !dbg !18
+}
+
+!llvm.dbg.cu = !{!2}
+!llvm.module.flags = !{!6, !7}
+!llvm.ident = !{!8}
+
+!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
+!1 = distinct !DIGlobalVariable(name: "glob", scope: !2, file: !3, line: 1, type: !5, isLocal: false, isDefinition: true)
+!2 = distinct !DICompileUnit(language: DW_LANG_C11, file: !3, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, globals: !4, splitDebugInlining: false, nameTableKind: None)
+!3 = !DIFile(filename: "foo.c", directory: "")
+!4 = !{!0}
+!5 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!6 = !{i32 7, !"Dwarf Version", i32 5}
+!7 = !{i32 2, !"Debug Info Version", i32 3}
+!8 = !{!"clang"}
+!9 = distinct !DISubprogram(name: "foo", scope: !3, file: !3, line: 2, type: !10, scopeLine: 2, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !12)
+!10 = !DISubroutineType(types: !11)
+!11 = !{!5, !5, !5}
+!12 = !{}
+!13 = !DILocation(line: 3, column: 7, scope: !9)
+!14 = !DILocation(line: 4, column: 3, scope: !9)
+!15 = !DILocation(line: 5, column: 9, scope: !16)
+!16 = distinct !DILexicalBlock(scope: !9, file: !3, line: 4, column: 15)
+!17 = !DILocation(line: 6, column: 9, scope: !16)
+!18 = !DILocation(line: 8, column: 3, scope: !9)
diff --git a/llvm/test/DebugInfo/X86/dbg-prolog-end.ll b/llvm/test/DebugInfo/X86/dbg-prolog-end.ll
index 1703323fc7ee1d..3d29f8e266301f 100644
--- a/llvm/test/DebugInfo/X86/dbg-prolog-end.ll
+++ b/llvm/test/DebugInfo/X86/dbg-prolog-end.ll
@@ -36,6 +36,50 @@ entry:
ret i32 %call, !dbg !16
}
+;; int foo(int arg) {
+;; while (arg)
+;; arg--;
+;; return 0;
+;; }
+;;
+;; In this function, the entry block will fall through to while.cond, with no
+;; instructions having source-locations. The expectations at -O0 is that we'll
+;; put prologue_end on the first instruction of the loop, after %arg.addr is
+;; initialized.
+
+; CHECK: _bar:
+; CHECK-NEXT: Lfunc_begin2:
+; CHECK-NEXT: .loc 1 11 0 is_stmt 1
+; CHECK-NEXT: .cfi_startproc
+; CHECK-NEXT: ## %bb.0:
+; CHECK-NEXT: movl %edi, -4(%rsp)
+; CHECK-NEXT: LBB2_1:
+; CHECK-NEXT: ## =>This Inner Loop Header: Depth=1
+; CHECK-NEXT: Ltmp4:
+; CHECK-NEXT: .loc 1 12 3 prologue_end
+; CHECK-NEXT: cmpl $0, -4(%rsp)
+
+define dso_local i32 @bar(i32 noundef %arg) !dbg !30 {
+entry:
+ %arg.addr = alloca i32, align 4
+ store i32 %arg, ptr %arg.addr, align 4
+ br label %while.cond, !dbg !37
+
+while.cond: ; preds = %while.body, %entry
+ %0 = load i32, ptr %arg.addr, align 4, !dbg !38
+ %tobool = icmp ne i32 %0, 0, !dbg !37
+ br i1 %tobool, label %while.body, label %while.end, !dbg !37
+
+while.body: ; preds = %while.cond
+ %1 = load i32, ptr %arg.addr, align 4, !dbg !39
+ %dec = add nsw i32 %1, -1, !dbg !39
+ store i32 %dec, ptr %arg.addr, align 4, !dbg !39
+ br label %while.cond, !dbg !37
+
+while.end: ; preds = %while.cond
+ ret i32 0, !dbg !42
+}
+
!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!21}
!18 = !{!1, !6}
@@ -62,3 +106,10 @@ entry:
!20 = !{}
!21 = !{i32 1, !"Debug Info Version", i32 3}
!22 = !DILocation(line: 0, column: 0, scope: !17)
+!30 = distinct !DISubprogram(name: "bar", scope: !2, file: !2, line: 10, type: !3, scopeLine: 11, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, ...
[truncated]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the code, I've left some nits and one more meaningful question.
Behaviourally, this SGTM, but I think it would be great if the proposed behaviour also got a quick look from an LLDB-person (@adrian-prantl / @JDevlieghere ) and/or @dwblaikie, as I could imagine debuggers having special-casing for idiosyncratic compiler output for this kind of thing.
// Was there an empty-line location? Return that, and it'll have the | ||
// scope-line "flow" into it when it becomes the prologue_end position. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be slightly more clear, something like: "If we found a line-zero location, use that. The prologue_end entry always gets assigned the scope line."
Wait, but maybe I'm wrong - if IsEmptyPrologue
is true then wouldn't the line zero get used (I'm looking in emitInitialLocDirective
- non-null line-zero PrologEndLoc
and with IsEmptyPrologue
true)? Is this case handled correctly by your patch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Hoisting to outer discussion as github has lost this review comment in the diff-display),
(Taking a stab at usernames) CC @tedwoodward from the conference, I believe I mentioned this patch as a revising-the-placement-of-prologue_end thing that was coming along |
(Hoisting to here as github has lost the inner comments), The interaction with linezero prologue_ends is interesting -- I don't think there's anything useful that comes from putting prologue_end on line zero as it doesn't communicate anything further about the function: the frame is setup, but there's no particular line you've stopped at. I think the code selecting for it was part of the previous strategy of "Hang prologue_end onto another line-table entry". With the revision I've removed the prioritisation of line-zero locations over the "first meaningful instruction" location, which independently makes sense IMO. I've also added a filter to not set prologue_end on the first meaningful instruction if it's also line-zero: it won't communicate anything useful, and giving it a real line-number would be misleading. The outcome is then no prologue_end flag being set, which is the most accurate outcome IMO in these situations. In terms of test coverage this means |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have another inline question, just to clear up some potential confusion on my part. Otherwise LGTM & I'm happy to Accept once that's resolved.
; CHECK: Lfunc_begin0: | ||
; CHECK-NEXT: .file{{.+}} | ||
; CHECK-NEXT: .loc 1 1 0 ## test-small.c:1:0{{$}} | ||
; CHECK-NEXT: .cfi_startproc | ||
; CHECK-NEXT: ## %bb.{{[0-9]+}}: | ||
; CHECK-NEXT: .loc 1 0 1 prologue_end{{.*}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(thinking out loud)
I think this test change is fine. AFAICT the original purpose of this test is to check that prologue_end doesn't get added to the first instructions. More than checking that it gets added to the final instruction, or added at all. With that line of reasoning this test update doesn't invalidate the original purpose of the test.
cc @rastogishubham as the author of this test
// place prologue_end. | ||
if (PrologEndLoc) { | ||
const DebugLoc &DL = PrologEndLoc->getDebugLoc(); | ||
if (!DL || DL->getLine() != 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel slightly confused reading the comment and the current if
expression. I read the comment as saying "... Thus, only early-return with this location if it's a valid location", but !DL
is not a location we can use?
Should this be if (DL && DL->getLine() != 0)
instead of if (!DL || DL->getLine() != 0)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's not clear that the real purpose is simply filtering out line-zero-locations -- those with an empty DebugLoc get given the scope line due to an earlier patch. (It's unfortunate that we've got "empty" nullptr DebugLocs, but also "empty" source locations in the form of line-zero, but here we are).
Without this awkward logic we would end up with zero-length linetable entries for the scope line, before then having a prologue_end linetable entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, yep I see that now, thanks. This all feels a bit tangled up in an unfortunate way, but I've not got any immediate ideas (possibly there's some way of refactoring this, but maybe not), so LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// place prologue_end. | ||
if (PrologEndLoc) { | ||
const DebugLoc &DL = PrologEndLoc->getDebugLoc(); | ||
if (!DL || DL->getLine() != 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, yep I see that now, thanks. This all feels a bit tangled up in an unfortunate way, but I've not got any immediate ideas (possibly there's some way of refactoring this, but maybe not), so LGTM.
Conflicts: llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp llvm/test/DebugInfo/MIR/X86/empty-inline.mir
Pushed up a merge to master to resolve conflicts:
I believe these are all obvious fixes, will leave the patch here for a few hours to use the CI befor emerging. |
CI-checking believes there's something wrong on Linux-only; check-all on my Linux machine doesn't find anything wrong, and CI isn't indicating what test might be failing. Thus: I shall submit and see what happens >_> |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/162/builds/10296 Here is the relevant piece of the build log for the reference
|
Looks like this is breaking the
Could you please take a look? |
Same on Arm and AArch64 - https://lab.llvm.org/buildbot/#/builders/59/builds/8032. We expect to be stopped at line 0 The test program is https://github.com/llvm/llvm-project/blob/main/lldb/test/API/source-manager/artificial_location.c. Note the manual |
…inputs (llvm#107849)" This reverts commit bf483dd.
(Ah, I see the buildkite had another 8Mb of test outputs that it wasn't showing me, guh), Thanks for the notice; I feel like this is exactly the intended outcome of this change, i.e. ensuring that we produce a steppable location at the start of degenerate functions (including those with #line0 !). That being said, I can't replicate this perfectly locally, I'll work on that, more data in an hour or so. I can revert if that's the easiest; it's sort of my intention that that test will become redundant though, as the compiler Should (TM) be able to describe the behaviour that LLDB is emulating in the linetable most of the time. |
Ooooo, and there's a duplicate line-entry added as well, that's a good enough reason to revert too; sorry for the bother. |
For when I get back to this tomorrow -- is the intention of that test (artificial_location in TestSourceManager) specifically about stepping into prologues + the start of a function, or is it about stimulating LLDBs handling of line-zero locations? I can almost certainly cook up something that stops on line-zero somehow (currently thinking a crash). |
It's the latter (the test checks lldb's output when it stops on line zero), and it just so happens that I have a patch (#115876) up which modifies that test. I wasn't able to get it to do what I wanted using the approach in that test, so I implemented something hopefully more robust (it explicitly finds an instruction with line==0 and sets a breakpoint there). I think that should be compatible your changes here. |
This patch introduces an undefined behavior in clang:
|
In 39b2979 Pavel has kindly refined the implementation of a test in such a way that it doesn't trip up over this patch -- the test wishes to stimulate LLDBs presentation of line0 locations, rather than wanting to always step on line-zero on entry to artificial_location.c. As that's what was tripping up this change, reapply. Original commit message follows. [DWARF] Emit a worst-case prologue_end flag for pathological inputs (llvm#107849) prologue_end usually indicates where the end of the function-initialization lies, and is where debuggers usually choose to put the initial breakpoint for a function. Our current algorithm piggy-backs it on the first available source-location: which doesn't necessarily have anything to do with the start of the function. To avoid this in heavily-optimised code that lacks many useful source locations, pick a worst-case "if all else fails" prologue_end location, of the first instruction that appears to do meaningful computation. It'll be given the function-scope line number, which should run-on from the start of the function anyway. This means if your code is completely inverted by the optimiser, you can at least put a breakpoint at the _start_ like you expect, even if it's difficult to then step through. This patch also attempts to preserve some good behaviour we have without optimisations -- at O0, if the prologue immediately falls into a loop body without any computation happening, then prologue_end lands at the start of that loop. This is desirable; but does mean we need to do more work to detect and support those situations.
Rebased patch passes the online CI, and a ubsan stage2 build of clang completed without incident (assuming I've built it correctly), and didn't fire on the given reproducer. Possibly it's a case of mistaken commit identity, or composition with something else that landed at the time? I'll re-land as the reported issues seem clean to me -- @hokein please do let me know if this reoccurs. |
In 39b2979 Pavel has kindly refined the implementation of a test in such a way that it doesn't trip up over this patch -- the test wishes to stimulate LLDBs presentation of line0 locations, rather than wanting to always step on line-zero on entry to artificial_location.c. As that's what was tripping up this change, reapply. Original commit message follows. [DWARF] Emit a worst-case prologue_end flag for pathological inputs (#107849) prologue_end usually indicates where the end of the function-initialization lies, and is where debuggers usually choose to put the initial breakpoint for a function. Our current algorithm piggy-backs it on the first available source-location: which doesn't necessarily have anything to do with the start of the function. To avoid this in heavily-optimised code that lacks many useful source locations, pick a worst-case "if all else fails" prologue_end location, of the first instruction that appears to do meaningful computation. It'll be given the function-scope line number, which should run-on from the start of the function anyway. This means if your code is completely inverted by the optimiser, you can at least put a breakpoint at the _start_ like you expect, even if it's difficult to then step through. This patch also attempts to preserve some good behaviour we have without optimisations -- at O0, if the prologue immediately falls into a loop body without any computation happening, then prologue_end lands at the start of that loop. This is desirable; but does mean we need to do more work to detect and support those situations.
Unfortunately, usan-built clang didn't give a stacktrace. I see a normal assertion-enabled clang (built on bf483dd) also crashes, stacktrace:
|
I've managed to replicate it locally now (I must have been doing something wrong before), should be an easy fix once I've pinned it down, hopefully imminently. |
Looks like this is a proper problem when there's no body to a function (not something I'd considered possible) -- I'll add a filter to |
We see this crash downstream as well. Is a fix far away? |
Hi,
(Maybe this should be in an RFC if it's properly contentious,)
Problem: sometimes the entry blocks of functions contain no source-locations at all. Currently, prologue_end is attached to the first source-location found when iterating over the function, which don't have to be in the entry block or anywhere near it. It can even be the return instruction. IMO: this isn't beneficial and can be misleading, the spec says it's "where execution should be suspended for a breakpoint at the entry of a function". If the prologue_end point is past meaningful computation, or past the entry block, the developer has probably missed something useful.
These situations do pop up occasionally: I've got a few game code samples that have been heavily LTO'd, and a non-trivial number of functions end up with a) a lot of information loss after inlining and de-duplication, and b) a prologue_end set after various memory operations and conditional branches. Having to debug LTO'd binaries is already a trial, making it harder to set effective breakpoints doesn't help the developer.
Thus, I'd like to merge this patch which, if there aren't any source locations in the entry block, picks the first non-trivial instruction to apply the scope-line and prologue_end to as a "worst case" or "backup" entry point. See the added dbg-prolog-end-backup-loc.ll for an illustrated example.
However, nothing is ever that simple and I've found myself in the tar pit here, with numerous existing behaviours to support:
In terms of test effects:
A test that can't be fixed right now is CodeGen/Thumb2/pr52817.ll, where we put prologue_end half way through substantive computation. This is because it looks a lot like an unoptimized piece of code where prologue_end should live on the first source-location after falling-out of the entry block.
I fed this to the GDB test suite; there weren't any further regressions past the ~121 test failures I had before this patch.