[DWARF] Emit a worst-case prologue_end flag for pathological inputs #107849

Merged
merged 4 commits into from
Nov 12, 2024

Conversation

jmorse
Member

@jmorse jmorse commented Sep 9, 2024

Hi,

(Maybe this should be an RFC if it's properly contentious.)

Problem: sometimes the entry blocks of functions contain no source-locations at all. Currently, prologue_end is attached to the first source-location found when iterating over the function, which doesn't have to be in the entry block or anywhere near it. It can even be the return instruction. IMO this isn't beneficial and can be misleading: the spec says it's "where execution should be suspended for a breakpoint at the entry of a function". If the prologue_end point is past meaningful computation, or past the entry block, the developer has probably missed something useful.

These situations do pop up occasionally: I've got a few game code samples that have been heavily LTO'd, and a non-trivial number of functions end up with a) a lot of information loss after inlining and de-duplication, and b) a prologue_end set after various memory operations and conditional branches. Debugging LTO'd binaries is already a trial; making it harder to set effective breakpoints doesn't help the developer.

Thus, I'd like to merge this patch which, if there aren't any source locations in the entry block, picks the first non-trivial instruction to apply the scope-line and prologue_end to as a "worst case" or "backup" entry point. See the added dbg-prolog-end-backup-loc.ll for an illustrated example.

However, nothing is ever that simple and I've found myself in the tar pit here, with numerous existing behaviours to support:

  • In un-optimised code it's desirable for prologue_end to fall through into a non-entry block; see the additions to DebugInfo/X86/dbg-prolog-end.ll.
  • In some circumstances there are no non-frame-setup instructions in the entry block anyway, in which case I fall back to not emitting a prologue_end.
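
As a rough illustration of the selection order described above, here is a minimal standalone sketch. It is not the LLVM code: SketchInst, its fields and pickPrologueEnd are hypothetical names, and the "entry path" is assumed to be already flattened into one list of instructions.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-in for a MachineInstr (not an LLVM type).
struct SketchInst {
  bool FrameSetup;               // flagged as frame setup
  bool Trivial;                  // a copy or trivially rematerialisable
  bool InEntryBlock;             // false once the walk has fallen through
  std::optional<unsigned> Line;  // nullopt: no location; 0: line zero
};

// Returns the index of the instruction that would carry prologue_end,
// or nullopt when no prologue_end should be emitted.
std::optional<std::size_t>
pickPrologueEnd(const std::vector<SketchInst> &EntryPath) {
  std::optional<std::size_t> LineZero, NonTrivial;
  for (std::size_t I = 0; I < EntryPath.size(); ++I) {
    const SketchInst &MI = EntryPath[I];
    if (!MI.FrameSetup && MI.Line) {
      if (*MI.Line != 0)
        return I;     // first real source location wins outright
      LineZero = I;   // remember the most recent line-zero location
    }
    if (!MI.FrameSetup && !MI.Trivial && !NonTrivial)
      NonTrivial = I; // first instruction doing meaningful computation
  }
  if (LineZero)
    return LineZero;
  if (NonTrivial && EntryPath[*NonTrivial].InEntryBlock)
    return NonTrivial; // backup: it will be given the scope line
  return std::nullopt; // nothing suitable: emit no prologue_end
}
```

With a location-less copy followed by a location-less add, as in dbg-prolog-end-backup-loc.ll, this sketch picks the add (index 1).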

In terms of test effects:

  • pseudo_cmov_lower2.ll grows a prologue_end at the start of this otherwise location-less function.
  • empty-inline.mir shifts prologue_end upwards into the entry block, before control-flow happens, which is a desirable move IMO.
  • empty-line-info.ll grows prologue_ends on the scope-line linetable entries -- these are entries we add for functions with no source locations anyway.

A test that can't be fixed right now is CodeGen/Thumb2/pr52817.ll, where we put prologue_end halfway through substantive computation. This is because it looks a lot like an unoptimized piece of code, where prologue_end should live on the first source-location after falling out of the entry block.
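
The fall-through rule that makes this case hard can be sketched roughly as follows. This is a simplified model under stated assumptions, not the patch itself: SketchBlock and blocksExamined are hypothetical names, empty blocks are ignored, and the early exit on finding a real source location is left out, so the result is only an upper bound on how far the search looks.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical summary of a basic block in layout order (not an LLVM type).
struct SketchBlock {
  bool EndsInTerminator; // last instruction is a branch, return, etc.
  unsigned NumPreds;     // number of predecessor blocks
};

// Returns how many leading blocks the search would examine before it gives
// up and considers a backup location.
std::size_t blocksExamined(const std::vector<SketchBlock> &Layout) {
  std::size_t Cur = 0;
  while (Cur < Layout.size()) {
    const SketchBlock &BB = Layout[Cur];
    ++Cur; // this block has now been examined
    if (BB.EndsInTerminator)
      break; // real control flow: the prologue is over
    if (BB.NumPreds > 1)
      break; // already inside a loop; do not fall through further
  }
  return Cur;
}
```

An -O0 entry block that falls through into a loop header gives 2 (entry plus header), whereas an optimised entry block ending in a conditional branch gives 1.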

I fed this to the GDB test suite; there weren't any further regressions past the ~121 test failures I had before this patch.

prologue_end usually indicates where the end of the function-initialization
lies, and is where debuggers usually choose to put the initial breakpoint
for a function. Our current algorithm piggy-backs it on the first available
source-location, which doesn't necessarily have anything to do with the
start of the function.

To avoid this in heavily-optimised code that lacks many useful source
locations, pick a worst-case "if all else fails" prologue_end location, of
the first instruction that appears to do meaningful computation. It'll be
given the function-scope line number, which should run-on from the start of
the function anyway. This means if your code is completely inverted by the
optimiser, you can at least put a breakpoint at the _start_ like you
expect, even if it's difficult to then step through.
@llvmbot
Member

llvmbot commented Sep 9, 2024

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-debuginfo

Author: Jeremy Morse (jmorse)

Changes

Patch is 21.76 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/107849.diff

8 Files Affected:

  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (+114-19)
  • (modified) llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll (+2-2)
  • (added) llvm/test/DebugInfo/MIR/X86/dbg-prologue-backup-loc2.mir (+134)
  • (modified) llvm/test/DebugInfo/MIR/X86/empty-inline.mir (+3-2)
  • (added) llvm/test/DebugInfo/X86/dbg-prolog-end-backup-loc.ll (+86)
  • (modified) llvm/test/DebugInfo/X86/dbg-prolog-end.ll (+51)
  • (modified) llvm/test/DebugInfo/X86/empty-line-info.ll (+5-2)
  • (modified) llvm/test/DebugInfo/X86/loop-align-debug.ll (+1)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index 148b620c2b62b7..6830fd41b8ccce 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -2061,6 +2061,16 @@ void DwarfDebug::beginInstruction(const MachineInstr *MI) {
   unsigned LastAsmLine =
       Asm->OutStreamer->getContext().getCurrentDwarfLoc().getLine();
 
+  if (!DL && MI == PrologEndLoc) {
+    // In rare situations, we might want to place the end of the prologue
+    // somewhere that doesn't have a source location already. It should be in
+    // the entry block.
+    assert(MI->getParent() == &*MI->getMF()->begin());
+    recordSourceLine(SP->getScopeLine(), 0, SP,
+                     DWARF2_FLAG_PROLOGUE_END | DWARF2_FLAG_IS_STMT);
+    return;
+  }
+
   bool PrevInstInDiffBB = PrevInstBB && PrevInstBB != MI->getParent();
   if (DL == PrevInstLoc && !PrevInstInDiffBB) {
     // If we have an ongoing unspecified location, nothing to do here.
@@ -2131,36 +2141,121 @@ void DwarfDebug::beginInstruction(const MachineInstr *MI) {
     PrevInstLoc = DL;
 }
 
-static std::pair<const MachineInstr *, bool>
+std::pair<const MachineInstr *, bool>
 findPrologueEndLoc(const MachineFunction *MF) {
   // First known non-DBG_VALUE and non-frame setup location marks
   // the beginning of the function body.
-  const MachineInstr *LineZeroLoc = nullptr;
+  const auto &TII = *MF->getSubtarget().getInstrInfo();
+  const MachineInstr *LineZeroLoc = nullptr, *NonTrivialInst = nullptr;
   const Function &F = MF->getFunction();
 
   // Some instructions may be inserted into prologue after this function. Must
   // keep prologue for these cases.
   bool IsEmptyPrologue =
       !(F.hasPrologueData() || F.getMetadata(LLVMContext::MD_func_sanitize));
-  for (const auto &MBB : *MF) {
-    for (const auto &MI : MBB) {
-      if (!MI.isMetaInstruction()) {
-        if (!MI.getFlag(MachineInstr::FrameSetup) && MI.getDebugLoc()) {
-          // Scan forward to try to find a non-zero line number. The
-          // prologue_end marks the first breakpoint in the function after the
-          // frame setup, and a compiler-generated line 0 location is not a
-          // meaningful breakpoint. If none is found, return the first
-          // location after the frame setup.
-          if (MI.getDebugLoc().getLine())
-            return std::make_pair(&MI, IsEmptyPrologue);
-
-          LineZeroLoc = &MI;
-        }
-        IsEmptyPrologue = false;
-      }
+
+  // Helper lambda to examine each instruction and potentially return it
+  // as the prologue_end point.
+  auto ExamineInst = [&](const MachineInstr &MI)
+      -> std::optional<std::pair<const MachineInstr *, bool>> {
+    // Is this instruction trivial data shuffling or frame-setup?
+    bool isCopy = (TII.isCopyInstr(MI) ? true : false);
+    bool isTrivRemat = TII.isTriviallyReMaterializable(MI);
+    bool isFrameSetup = MI.getFlag(MachineInstr::FrameSetup);
+
+    if (!isFrameSetup && MI.getDebugLoc()) {
+      // Scan forward to try to find a non-zero line number. The
+      // prologue_end marks the first breakpoint in the function after the
+      // frame setup, and a compiler-generated line 0 location is not a
+      // meaningful breakpoint. If none is found, return the first
+      // location after the frame setup.
+      if (MI.getDebugLoc().getLine())
+        return std::make_pair(&MI, IsEmptyPrologue);
+
+      LineZeroLoc = &MI;
+    }
+
+    // Keep track of the first "non-trivial" instruction seen, i.e. anything
+    // that doesn't involve shuffling data around or is a frame-setup.
+    if (!isCopy && !isTrivRemat && !isFrameSetup && !NonTrivialInst)
+      NonTrivialInst = &MI;
+
+    IsEmptyPrologue = false;
+    return std::nullopt;
+  };
+
+  // Examine all the instructions at the start of the function. This doesn't
+  // necessarily mean just the entry block: unoptimised code can fall-through
+  // into an initial loop, and it makes sense to put the initial breakpoint on
+  // the first instruction of such a loop. However, if we pass branches, we're
+  // better off synthesising an early prologue_end.
+  auto CurBlock = MF->begin();
+  auto CurInst = CurBlock->begin();
+  while (true) {
+    // Skip empty blocks, in rare cases the entry can be empty.
+    if (CurInst == CurBlock->end()) {
+      ++CurBlock;
+      CurInst = CurBlock->begin();
+      continue;
+    }
+
+    // Check whether this non-meta instruction is a good position for prologue_end.
+    if (!CurInst->isMetaInstruction()) {
+      auto FoundInst = ExamineInst(*CurInst);
+      if (FoundInst)
+        return *FoundInst;
+    }
+
+    // Try to continue searching, but use a backup-location if substantive
+    // computation is happening.
+    auto NextInst = std::next(CurInst);
+    if (NextInst == CurInst->getParent()->end()) {
+      // We've reached the end of the block. Did we just look at a terminator?
+      if (CurInst->isTerminator())
+        // Some kind of "real" control flow is occurring. At the very least
+        // we would have to start exploring the CFG, a good signal that the
+        // prologue is over.
+        break;
+
+      // If we've already fallen through into a loop, don't fall through
+      // further, use a backup-location.
+      if (CurBlock->pred_size() > 1)
+        break;
+
+      // Fall-through from entry to the next block. This is common at -O0 when
+      // there's no initialisation in the function.
+      auto NextBBIter = std::next(CurInst->getParent()->getIterator());
+      // Bail if we're also at the end of the function.
+      if (NextBBIter == MF->end())
+        break;
+      CurBlock = NextBBIter;
+      CurInst = NextBBIter->begin();
+    } else {
+      // Continue examining the current block.
+      CurInst = NextInst;
     }
   }
-  return std::make_pair(LineZeroLoc, IsEmptyPrologue);
+
+  // We didn't find a "good" prologue_end, so consider backup locations.
+  // Was there an empty-line location? Return that, and it'll have the
+  // scope-line "flow" into it when it becomes the prologue_end position.
+  if (LineZeroLoc)
+    return std::make_pair(LineZeroLoc, IsEmptyPrologue);
+
+  // We couldn't find any source-location, suggesting all meaningful information
+  // got optimised away. Set the prologue_end to be the first non-trivial
+  // instruction, which will get the scope line number. This is better than
+  // nothing.
+  // Only do this in the entry block, as we'll be giving it the scope line for
+  // the function. Return IsEmptyPrologue==true if we've picked the first
+  // instruction.
+  if (NonTrivialInst && NonTrivialInst->getParent() == &*MF->begin()) {
+    IsEmptyPrologue = NonTrivialInst == &*MF->begin()->begin();
+    return std::make_pair(NonTrivialInst, IsEmptyPrologue);
+  }
+
+  // If the entry path is empty, just don't have a prologue_end at all.
+  return std::make_pair(nullptr, IsEmptyPrologue);
 }
 
 /// Register a source line with debug info. Returns the  unique label that was
diff --git a/llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll b/llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll
index 19253d67c14945..63640747d2aa2c 100644
--- a/llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll
+++ b/llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll
@@ -203,9 +203,9 @@ declare void @llvm.dbg.value(metadata, metadata, metadata)
 ; locations in the function.
 define double @foo1_g(float %p1, double %p2, double %p3) nounwind !dbg !4 {
 ; CHECK-LABEL: foo1_g:
-; CHECK:       .file   1 "." "test.c"
-; CHECK-NEXT:  .loc    1 3 0
 ; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    .file   1 "." "test.c"
+; CHECK-NEXT:    .loc 1 3 0 prologue_end
 ; CHECK-NEXT:    xorps %xmm3, %xmm3
 ; CHECK-NEXT:    ucomiss %xmm3, %xmm0
 ; CHECK-NEXT:    movsd {{.*#+}} xmm0 = [1.25E+0,0.0E+0]
diff --git a/llvm/test/DebugInfo/MIR/X86/dbg-prologue-backup-loc2.mir b/llvm/test/DebugInfo/MIR/X86/dbg-prologue-backup-loc2.mir
new file mode 100644
index 00000000000000..378d579cee62e1
--- /dev/null
+++ b/llvm/test/DebugInfo/MIR/X86/dbg-prologue-backup-loc2.mir
@@ -0,0 +1,134 @@
+# RUN: llc %s -start-before=livedebugvalues -o - | \
+# RUN:     FileCheck %s --implicit-check-not=prologue_end
+#
+## When picking a "backup" location of the first non-trivial instruction in
+## a function, don't select a location outside of the entry block. We have to
+## give it the function's scope-line, and installing that outside of the entry
+## block is liable to be misleading.
+##
+## Produced from the C below with "clang -O2 -g -mllvm
+## -stop-before=livedebugvalues", then modified to unrotate and shift early
+## insts into the loop block. This means the MIR is meaningless, we only test
+## whether the scope-line will leak into the loop block or not.
+##
+## int glob = 0;
+## int foo(int arg, int sum) {
+##   arg += sum;
+##   while (arg) {
+##     glob--;
+##     arg %= glob;
+##   }
+##   return 0;
+## }
+#
+# CHECK-LABEL: foo:
+# CHECK:        .loc    0 2 0
+# CHECK:        # %bb.0:
+# CHECK-NEXT:   movl    %edi, %edx
+# CHECK-NEXT:   .loc    0 0 0 is_stmt 0
+# CHECK-NEXT:   .Ltmp0:
+# CHECK-NEXT:   .p2align        4, 0x90
+# CHECK-NEXT:   .LBB0_1:
+# CHECK-LABEL:  addl    %esi, %edx
+
+ 
+--- |
+  ; ModuleID = 'out2.ll'
+  source_filename = "foo.c"
+  target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+  target triple = "x86_64-unknown-linux-gnu"
+  
+  @glob = dso_local local_unnamed_addr global i32 0, align 4, !dbg !0
+  
+  define dso_local noundef i32 @foo(i32 noundef %arg, i32 noundef %sum) local_unnamed_addr !dbg !9 {
+  entry:
+    %add = add nsw i32 %sum, %arg
+    br label %while.body.preheader
+  
+  while.body.preheader:                             ; preds = %entry
+    %glob.promoted = load i32, ptr @glob, align 4
+    br label %while.body, !dbg !13
+  
+  while.body:                                       ; preds = %while.body, %while.body.preheader
+    %arg.addr.06 = phi i32 [ %rem, %while.body ], [ %add, %while.body.preheader ]
+    %dec35 = phi i32 [ %dec, %while.body ], [ %glob.promoted, %while.body.preheader ]
+    %dec = add nsw i32 %dec35, -1, !dbg !14
+    %0 = add i32 %dec35, -1, !dbg !16
+    %rem = srem i32 %arg.addr.06, %0, !dbg !16
+    %tobool.not = icmp eq i32 %rem, 0, !dbg !13
+    br i1 %tobool.not, label %while.cond.while.end_crit_edge, label %while.body, !dbg !13
+  
+  while.cond.while.end_crit_edge:                   ; preds = %while.body
+    store i32 %dec, ptr @glob, align 4, !dbg !14
+    br label %while.end, !dbg !13
+  
+  while.end:                                        ; preds = %while.cond.while.end_crit_edge
+    ret i32 0, !dbg !17
+  }
+  
+  !llvm.dbg.cu = !{!2}
+  !llvm.module.flags = !{!6, !7}
+  !llvm.ident = !{!8}
+  
+  !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
+  !1 = distinct !DIGlobalVariable(name: "glob", scope: !2, file: !3, line: 1, type: !5, isLocal: false, isDefinition: true)
+  !2 = distinct !DICompileUnit(language: DW_LANG_C11, file: !3, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, globals: !4, splitDebugInlining: false, nameTableKind: None)
+  !3 = !DIFile(filename: "foo.c", directory: "")
+  !4 = !{!0}
+  !5 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+  !6 = !{i32 7, !"Dwarf Version", i32 5}
+  !7 = !{i32 2, !"Debug Info Version", i32 3}
+  !8 = !{!"clang"}
+  !9 = distinct !DISubprogram(name: "foo", scope: !3, file: !3, line: 2, type: !10, scopeLine: 2, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !12)
+  !10 = !DISubroutineType(types: !11)
+  !11 = !{!5, !5, !5}
+  !12 = !{}
+  !13 = !DILocation(line: 4, column: 3, scope: !9)
+  !14 = !DILocation(line: 5, column: 9, scope: !15)
+  !15 = distinct !DILexicalBlock(scope: !9, file: !3, line: 4, column: 15)
+  !16 = !DILocation(line: 6, column: 9, scope: !15)
+  !17 = !DILocation(line: 8, column: 3, scope: !9)
+
+...
+---
+name:            foo
+alignment:       16
+tracksRegLiveness: true
+debugInstrRef:   true
+tracksDebugUserValues: true
+liveins:
+  - { reg: '$edi' }
+  - { reg: '$esi' }
+frameInfo:
+  maxAlignment:    1
+  maxCallFrameSize: 0
+  isCalleeSavedInfoValid: true
+machineFunctionInfo:
+  amxProgModel:    None
+body:             |
+  bb.0.entry:
+    liveins: $edi, $esi
+  
+    $edx = MOV32rr $edi
+  
+  bb.1.while.body (align 16):
+    successors: %bb.2(0x04000000), %bb.1(0x7c000000)
+    liveins: $ecx, $edx, $esi
+
+    renamable $edx = nsw ADD32rr killed renamable $edx, renamable $esi, implicit-def dead $eflags
+    renamable $ecx = MOV32rm $rip, 1, $noreg, @glob, $noreg :: (dereferenceable load (s32) from @glob)
+    renamable $ecx = DEC32r killed renamable $ecx, implicit-def dead $eflags
+    $eax = MOV32rr killed $edx
+    CDQ implicit-def $eax, implicit-def $edx, implicit $eax
+    IDIV32r renamable $ecx, implicit-def dead $eax, implicit-def $edx, implicit-def dead $eflags, implicit $eax, implicit $edx
+    TEST32rr renamable $edx, renamable $edx, implicit-def $eflags
+    JCC_1 %bb.1, 5, implicit killed $eflags
+  
+  bb.2.while.cond.while.end_crit_edge:
+    liveins: $ecx, $esi
+  
+    MOV32mr $rip, 1, $noreg, @glob, $noreg, killed renamable $ecx, debug-location !14 :: (store (s32) into @glob)
+    $eax = XOR32rr undef $eax, undef $eax, implicit-def dead $eflags, debug-location !17
+    RET64 $eax, debug-location !17
+
+...
diff --git a/llvm/test/DebugInfo/MIR/X86/empty-inline.mir b/llvm/test/DebugInfo/MIR/X86/empty-inline.mir
index 58775e8cd852fb..c489339b4c8aa6 100644
--- a/llvm/test/DebugInfo/MIR/X86/empty-inline.mir
+++ b/llvm/test/DebugInfo/MIR/X86/empty-inline.mir
@@ -13,11 +13,12 @@
 #
 # CHECK: Address            Line   Column File   ISA Discriminator OpIndex Flags
 # CHECK-NEXT:                ---
-# CHECK-NEXT:                 25      0      1   0             0         0 is_stmt
+# CHECK-NEXT:                 25      0      1   0             0         0 is_stmt prologue_end
 # CHECK-NEXT:                  0      0      1   0             0         0
-# CHECK-NEXT:                 29     28      1   0             0         0 is_stmt prologue_end
+# CHECK-NEXT:                 29     28      1   0             0         0 is_stmt
 # CHECK-NEXT:                 29     28      1   0             0         0 is_stmt
 # CHECK-NEXT:                 29     28      1   0             0         0 is_stmt end_sequence
+
 --- |
   source_filename = "t.ll"
   target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
diff --git a/llvm/test/DebugInfo/X86/dbg-prolog-end-backup-loc.ll b/llvm/test/DebugInfo/X86/dbg-prolog-end-backup-loc.ll
new file mode 100644
index 00000000000000..2425df608275c5
--- /dev/null
+++ b/llvm/test/DebugInfo/X86/dbg-prolog-end-backup-loc.ll
@@ -0,0 +1,86 @@
+; RUN: llc %s -o - | FileCheck %s
+
+;; This test has had source-locations removed from the prologue, to simulate
+;; heavily-optimised scenarios where a lot of debug-info gets dropped. Check
+;; that we can pick a "worst-case" prologue_end position, of the first
+;; instruction that does any meaningful computation (the add). It's better to
+;; put the prologue_end flag here rather than deeper into the loop, past the
+;; early-exit check.
+;;
+;; Generated from this code at -O2 -g in clang, with source locations then
+;; deleted.
+;;
+;; int glob = 0;
+;; int foo(int arg, int sum) {
+;;   arg += sum;
+;;   while (arg) {
+;;     glob--;
+;;     arg %= glob;
+;;   }
+;;   return 0;
+;; }
+
+; CHECK-LABEL: foo:
+;; Scope-line location:
+; CHECK:       .loc    0 2 0
+;; Entry block:
+; CHECK:        movl    %edi, %edx
+; CHECK-NEXT:   .loc    0 2 0 prologue_end
+; CHECK-NEXT:   addl    %esi, %edx
+; CHECK-NEXT:   je      .LBB0_4
+; CHECK-LABEL: # %bb.1:
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+@glob = dso_local local_unnamed_addr global i32 0, align 4, !dbg !0
+
+define dso_local noundef i32 @foo(i32 noundef %arg, i32 noundef %sum) local_unnamed_addr !dbg !9 {
+entry:
+  %add = add nsw i32 %sum, %arg
+  %tobool.not4 = icmp eq i32 %add, 0
+  br i1 %tobool.not4, label %while.end, label %while.body.preheader
+
+while.body.preheader:
+  %glob.promoted = load i32, ptr @glob, align 4
+  br label %while.body, !dbg !14
+
+while.body:
+  %arg.addr.06 = phi i32 [ %rem, %while.body ], [ %add, %while.body.preheader ]
+  %dec35 = phi i32 [ %dec, %while.body ], [ %glob.promoted, %while.body.preheader ]
+  %dec = add nsw i32 %dec35, -1, !dbg !15
+  %rem = srem i32 %arg.addr.06, %dec, !dbg !17
+  %tobool.not = icmp eq i32 %rem, 0, !dbg !14
+  br i1 %tobool.not, label %while.cond.while.end_crit_edge, label %while.body, !dbg !14
+
+while.cond.while.end_crit_edge:
+  store i32 %dec, ptr @glob, align 4, !dbg !15
+  br label %while.end, !dbg !14
+
+while.end:
+  ret i32 0, !dbg !18
+}
+
+!llvm.dbg.cu = !{!2}
+!llvm.module.flags = !{!6, !7}
+!llvm.ident = !{!8}
+
+!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
+!1 = distinct !DIGlobalVariable(name: "glob", scope: !2, file: !3, line: 1, type: !5, isLocal: false, isDefinition: true)
+!2 = distinct !DICompileUnit(language: DW_LANG_C11, file: !3, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, globals: !4, splitDebugInlining: false, nameTableKind: None)
+!3 = !DIFile(filename: "foo.c", directory: "")
+!4 = !{!0}
+!5 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!6 = !{i32 7, !"Dwarf Version", i32 5}
+!7 = !{i32 2, !"Debug Info Version", i32 3}
+!8 = !{!"clang"}
+!9 = distinct !DISubprogram(name: "foo", scope: !3, file: !3, line: 2, type: !10, scopeLine: 2, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !12)
+!10 = !DISubroutineType(types: !11)
+!11 = !{!5, !5, !5}
+!12 = !{}
+!13 = !DILocation(line: 3, column: 7, scope: !9)
+!14 = !DILocation(line: 4, column: 3, scope: !9)
+!15 = !DILocation(line: 5, column: 9, scope: !16)
+!16 = distinct !DILexicalBlock(scope: !9, file: !3, line: 4, column: 15)
+!17 = !DILocation(line: 6, column: 9, scope: !16)
+!18 = !DILocation(line: 8, column: 3, scope: !9)
diff --git a/llvm/test/DebugInfo/X86/dbg-prolog-end.ll b/llvm/test/DebugInfo/X86/dbg-prolog-end.ll
index 1703323fc7ee1d..3d29f8e266301f 100644
--- a/llvm/test/DebugInfo/X86/dbg-prolog-end.ll
+++ b/llvm/test/DebugInfo/X86/dbg-prolog-end.ll
@@ -36,6 +36,50 @@ entry:
   ret i32 %call, !dbg !16
 }
 
+;; int foo(int arg) {
+;;   while (arg)
+;;    arg--;
+;;  return 0;
+;; }
+;;
+;; In this function, the entry block will fall through to while.cond, with no
+;; instructions having source-locations. The expectation at -O0 is that we'll
+;; put prologue_end on the first instruction of the loop, after %arg.addr is
+;; initialized.
+
+; CHECK:      _bar:
+; CHECK-NEXT: Lfunc_begin2:
+; CHECK-NEXT:     .loc    1 11 0 is_stmt 1
+; CHECK-NEXT:     .cfi_startproc
+; CHECK-NEXT: ## %bb.0:
+; CHECK-NEXT:     movl    %edi, -4(%rsp)
+; CHECK-NEXT: LBB2_1:
+; CHECK-NEXT:                  ## =>This Inner Loop Header: Depth=1
+; CHECK-NEXT: Ltmp4:
+; CHECK-NEXT:     .loc    1 12 3 prologue_end
+; CHECK-NEXT:     cmpl    $0, -4(%rsp)
+
+define dso_local i32 @bar(i32 noundef %arg) !dbg !30 {
+entry:
+  %arg.addr = alloca i32, align 4
+  store i32 %arg, ptr %arg.addr, align 4
+  br label %while.cond, !dbg !37
+
+while.cond:                                       ; preds = %while.body, %entry
+  %0 = load i32, ptr %arg.addr, align 4, !dbg !38
+  %tobool = icmp ne i32 %0, 0, !dbg !37
+  br i1 %tobool, label %while.body, label %while.end, !dbg !37
+
+while.body:                                       ; preds = %while.cond
+  %1 = load i32, ptr %arg.addr, align 4, !dbg !39
+  %dec = add nsw i32 %1, -1, !dbg !39
+  store i32 %dec, ptr %arg.addr, align 4, !dbg !39
+  br label %while.cond, !dbg !37
+
+while.end:                                        ; preds = %while.cond
+  ret i32 0, !dbg !42
+}
+
 !llvm.dbg.cu = !{!0}
 !llvm.module.flags = !{!21}
 !18 = !{!1, !6}
@@ -62,3 +106,10 @@ entry:
 !20 = !{}
 !21 = !{i32 1, !"Debug Info Version", i32 3}
 !22 = !DILocation(line: 0, column: 0, scope: !17)
+!30 = distinct !DISubprogram(name: "bar", scope: !2, file: !2, line: 10, type: !3, scopeLine: 11, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, ...
[truncated]

Contributor

@OCHyams OCHyams left a comment

Looking at the code, I've left some nits and one more meaningful question.

Behaviourally, this SGTM, but I think it would be great if the proposed behaviour also got a quick look from an LLDB-person (@adrian-prantl / @JDevlieghere ) and/or @dwblaikie, as I could imagine debuggers having special-casing for idiosyncratic compiler output for this kind of thing.

Comment on lines 2240 to 2241
// Was there an empty-line location? Return that, and it'll have the
// scope-line "flow" into it when it becomes the prologue_end position.
Contributor

I think this could be slightly more clear, something like: "If we found a line-zero location, use that. The prologue_end entry always gets assigned the scope line."

Wait, but maybe I'm wrong - if IsEmptyPrologue is true then wouldn't the line zero get used (I'm looking in emitInitialLocDirective - non-null line-zero PrologEndLoc and with IsEmptyPrologue true)? Is this case handled correctly by your patch?

Member Author

(Hoisting to outer discussion as GitHub has lost this review comment in the diff-display.)

@jmorse
Member Author

jmorse commented Nov 11, 2024

(Taking a stab at usernames) CC @tedwoodward from the conference, I believe I mentioned this patch as a revising-the-placement-of-prologue_end thing that was coming along

@jmorse
Member Author

jmorse commented Nov 11, 2024

(Hoisting to here as GitHub has lost the inner comments.)

The interaction with line-zero prologue_ends is interesting -- I don't think anything useful comes from putting prologue_end on line zero, as it doesn't communicate anything further about the function: the frame is set up, but there's no particular line you've stopped at. I think the code selecting for it was part of the previous strategy of "Hang prologue_end onto another line-table entry".

With the revision I've removed the prioritisation of line-zero locations over the "first meaningful instruction" location, which independently makes sense IMO. I've also added a filter to not set prologue_end on the first meaningful instruction if it's also line-zero: it won't communicate anything useful, and giving it a real line-number would be misleading. The outcome is then no prologue_end flag being set, which is the most accurate outcome IMO in these situations.

In terms of test coverage this means CodeGen/X86/no-non-zero-debug-loc-prologue.ll no longer generates a prologue_end, which I've added an implicit-check-not to test for.
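
One possible reading of the revised backup rule, as a standalone sketch. This is an interpretation of the comment above rather than the merged code: BackupInst and pickBackupRevised are hypothetical names, and only the entry block is modelled.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-in for a MachineInstr (not an LLVM type).
struct BackupInst {
  bool FrameSetup;               // flagged as frame setup
  bool Trivial;                  // a copy or trivially rematerialisable
  std::optional<unsigned> Line;  // nullopt: no location; 0: line zero
};

// Backup choice when no real source location was found: the first meaningful
// instruction in the entry block, unless that instruction is on line zero.
std::optional<std::size_t>
pickBackupRevised(const std::vector<BackupInst> &EntryBlock) {
  for (std::size_t I = 0; I < EntryBlock.size(); ++I) {
    const BackupInst &MI = EntryBlock[I];
    if (MI.FrameSetup || MI.Trivial)
      continue; // not meaningful computation, keep looking
    if (MI.Line && *MI.Line == 0)
      return std::nullopt; // line zero: emit no prologue_end at all
    return I;              // typically location-less: gets the scope line
  }
  return std::nullopt; // nothing meaningful in the entry block
}
```

The point of the filter is that a line-zero instruction is neither promoted to the scope line nor flagged, so the function ends up with no prologue_end.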

Contributor

@OCHyams OCHyams left a comment

I have another inline question, just to clear up some potential confusion on my part. Otherwise LGTM & I'm happy to Accept once that's resolved.

; CHECK: Lfunc_begin0:
; CHECK-NEXT: .file{{.+}}
; CHECK-NEXT: .loc 1 1 0 ## test-small.c:1:0{{$}}
; CHECK-NEXT: .cfi_startproc
; CHECK-NEXT: ## %bb.{{[0-9]+}}:
; CHECK-NEXT: .loc 1 0 1 prologue_end{{.*}}
Contributor

(thinking out loud)
I think this test change is fine. AFAICT the original purpose of this test is to check that prologue_end doesn't get added to the first instructions, more than checking that it gets added to the final instruction, or added at all. By that line of reasoning, this test update doesn't invalidate the original purpose of the test.

cc @rastogishubham as the author of this test

// place prologue_end.
if (PrologEndLoc) {
  const DebugLoc &DL = PrologEndLoc->getDebugLoc();
  if (!DL || DL->getLine() != 0)
Contributor

I feel slightly confused reading the comment and the current if expression. I read the comment as saying "... Thus, only early-return with this location if it's a valid location", but !DL is not a location we can use?

Should this be if (DL && DL->getLine() != 0) instead of if (!DL || DL->getLine() != 0)?

Member Author

I guess it's not clear that the real purpose is simply filtering out line-zero-locations -- those with an empty DebugLoc get given the scope line due to an earlier patch. (It's unfortunate that we've got "empty" nullptr DebugLocs, but also "empty" source locations in the form of line-zero, but here we are).

Without this awkward logic we would end up with zero-length linetable entries for the scope line, before then having a prologue_end linetable entry.
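
To make the three cases concrete, here is a minimal sketch, not the code from DwarfDebug.cpp. It models a DebugLoc as a `std::optional<unsigned>` line number, which is an assumption made purely for illustration; `Loc`, `Decision`, `classifyPrologueEnd` and `takesEarlyReturn` are hypothetical names. It only aims to show how a condition shaped like `!DL || DL->getLine() != 0` treats an empty location differently from a line-zero one.

```cpp
#include <optional>

// Hypothetical stand-in for llvm::DebugLoc: std::nullopt models a null
// DebugLoc, and a contained value models DL->getLine().
using Loc = std::optional<unsigned>;

enum class Decision {
  UseScopeLine, // empty DebugLoc: the instruction is given the scope line
  UseOwnLine,   // real, non-zero line: prologue_end goes on that line
  Skip          // line zero: do not early-return with this location
};

// Spells out the three cases hidden inside `!DL || DL->getLine() != 0`.
Decision classifyPrologueEnd(const Loc &DL) {
  if (!DL)
    return Decision::UseScopeLine;
  if (*DL != 0)
    return Decision::UseOwnLine;
  return Decision::Skip;
}

// True exactly when a condition of the reviewed shape would be satisfied.
bool takesEarlyReturn(const Loc &DL) { return !DL || *DL != 0; }
```

Under these assumptions only the line-zero case is filtered out, which matches the explanation above: an empty DebugLoc is still usable because it later receives the scope line.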

Contributor

Oh right, yep I see that now, thanks. This all feels a bit tangled up in an unfortunate way, but I've not got any immediate ideas (possibly there's some way of refactoring this, but maybe not), so LGTM.

Contributor

@OCHyams OCHyams left a comment

LGTM

Conflicts:
	llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
	llvm/test/DebugInfo/MIR/X86/empty-inline.mir
@jmorse
Member Author

jmorse commented Nov 12, 2024

Pushed up a merge to master to resolve conflicts:

  • In DwarfDebug.cpp some extra basic-block-sections logic has landed. I've placed the logic for "put prologue_end here" in beginInstruction before the basic-block-sections logic, as I believe that's guarding logic that considers what the previous linetable entry is.
  • empty-inline.mir has received some updates in the meantime; the diff from 'main' is now that prologue_end shifts to the first instruction with the scope-line, instead of appearing after some loads and branching. (Which is the whole point of this patch)

I believe these are all obvious fixes; I'll leave the patch here for a few hours to use the CI before merging.

@jmorse
Member Author

jmorse commented Nov 12, 2024

CI-checking believes there's something wrong on Linux-only; check-all on my Linux machine doesn't find anything wrong, and CI isn't indicating what test might be failing. Thus: I shall submit and see what happens >_>

@jmorse jmorse merged commit bf483dd into llvm:main Nov 12, 2024
6 of 8 checks passed
@llvm-ci
Collaborator

llvm-ci commented Nov 12, 2024

LLVM Buildbot has detected a new failure on builder lldb-x86_64-debian running on lldb-x86_64-debian while building llvm at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/162/builds/10296

Here is the relevant piece of the build log for the reference
Step 6 (test) failure: build (failure)
...
PASS: lldb-api :: lang/cpp/const_static_integral_member/TestConstStaticIntegralMember.py (53 of 2703)
PASS: lldb-api :: tools/lldb-dap/disconnect/TestDAP_disconnect.py (54 of 2703)
PASS: lldb-api :: python_api/frame/TestFrames.py (55 of 2703)
PASS: lldb-api :: functionalities/thread/num_threads/TestNumThreads.py (56 of 2703)
PASS: lldb-api :: functionalities/stop-on-sharedlibrary-load/TestStopOnSharedlibraryEvents.py (57 of 2703)
PASS: lldb-api :: commands/process/launch/TestProcessLaunch.py (58 of 2703)
PASS: lldb-api :: python_api/watchpoint/watchlocation/TestSetWatchlocation.py (59 of 2703)
PASS: lldb-api :: functionalities/step-avoids-no-debug/TestStepNoDebug.py (60 of 2703)
PASS: lldb-api :: tools/lldb-server/TestGdbRemoteExitCode.py (61 of 2703)
PASS: lldb-api :: python_api/find_in_memory/TestFindRangesInMemory.py (62 of 2703)
FAIL: lldb-api :: source-manager/TestSourceManager.py (63 of 2703)
******************** TEST 'lldb-api :: source-manager/TestSourceManager.py' FAILED ********************
Script:
--
/usr/bin/python3 /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/./lib --env LLVM_INCLUDE_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/include --env LLVM_TOOLS_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/./bin --arch x86_64 --build-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex --lldb-module-cache-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/lldb --compiler /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/clang --dsymutil /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/dsymutil --make /usr/bin/make --llvm-tools-dir /home/worker/2.0.1/lldb-x86_64-debian/build/./bin --lldb-obj-root /home/worker/2.0.1/lldb-x86_64-debian/build/tools/lldb --lldb-libs-dir /home/worker/2.0.1/lldb-x86_64-debian/build/./lib -t /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/source-manager -p TestSourceManager.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 20.0.0git (https://github.com/llvm/llvm-project.git revision bf483ddb42065405e345393e022dc72357ec5a3a)
  clang revision bf483ddb42065405e345393e022dc72357ec5a3a
  llvm revision bf483ddb42065405e345393e022dc72357ec5a3a
Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc']
original content: #include <stdio.h>

int main(int argc, char const *argv[]) {
    printf("Hello world.\n"); // Set break point at this line.
    return 0;
}

new content: #include <stdio.h>

int main(int argc, char const *argv[]) {
    printf("Hello lldb.\n"); // Set break point at this line.
    return 0;
}

os.path.getmtime() after writing new content: 1731424476.6795385
original content: #include <stdio.h>

int main(int argc, char const *argv[]) {
    printf("Hello world.\n"); // Set break point at this line.
    return 0;
}

new content: #include <stdio.h>

int main(int argc, char const *argv[]) {

@Michael137
Member

Looks like this is breaking the source-manager/TestSourceManager.py LLDB test on macOS: https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/as-lldb-cmake/15127/execution/node/97/log/

======================================================================
FAIL: test_artificial_source_location (TestSourceManager.SourceManagerTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/source-manager/TestSourceManager.py", line 325, in test_artificial_source_location
    self.expect(
  File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2475, in expect
    self.fail(log_msg)
AssertionError: Ran command:
"process status"

Got output:
Process 99092 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x000000010402bf8c a.out`main at artificial_location.c:3
   1   	int foo() { return 42; }
   2   	
-> 3   	int main() {
   4   	#line 0
   5   	  return foo();
   6   	}

Expecting sub string: "stop reason = breakpoint" (was found)
Expecting sub string: "artificial_location.c:0" (was not found)
Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
----------------------------------------------------------------------

Could you please take a look?

@DavidSpickett
Collaborator

DavidSpickett commented Nov 12, 2024

Same on Arm and AArch64 - https://lab.llvm.org/buildbot/#/builders/59/builds/8032.

We expect to be stopped at line 0 artificial_location.c:0 but are instead at line 3 artificial_location.c:3. Which could be the intended change of this patch, not sure at a glance.

The test program is https://github.com/llvm/llvm-project/blob/main/lldb/test/API/source-manager/artificial_location.c. Note the manual #line 0. Added by @medismailben in edf410e.

Michael137 added a commit to Michael137/llvm-project that referenced this pull request Nov 12, 2024
@jmorse
Member Author

jmorse commented Nov 12, 2024

(Ah, I see the buildkite had another 8Mb of test outputs that it wasn't showing me, guh),

Thanks for the notice; I feel like this is exactly the intended outcome of this change, i.e. ensuring that we produce a steppable location at the start of degenerate functions (including those with #line 0!). That being said, I can't replicate this perfectly locally, I'll work on that, more data in an hour or so.

I can revert if that's the easiest; it's sort of my intention that that test will become redundant though, as the compiler Should (TM) be able to describe the behaviour that LLDB is emulating in the linetable most of the time.

@jmorse
Member Author

jmorse commented Nov 12, 2024

Ooooo, and there's a duplicate line-entry added as well, that's a good enough reason to revert too; sorry for the bother.

jmorse added a commit that referenced this pull request Nov 12, 2024
…inputs (#107849)"

This reverts commit bf483dd.

See PR, there's a test testing for this behaviour (possibly adaptable), and
a duplicate line entry too
@jmorse
Member Author

jmorse commented Nov 12, 2024

For when I get back to this tomorrow -- is the intention of that test (artificial_location in TestSourceManager) specifically about stepping into prologues + the start of a function, or is it about stimulating LLDB's handling of line-zero locations? I can almost certainly cook up something that stops on line-zero somehow (currently thinking a crash).

@labath
Collaborator

labath commented Nov 13, 2024

It's the latter (the test checks lldb's output when it stops on line zero), and it just so happens that I have a patch (#115876) up which modifies that test. I wasn't able to get it to do what I wanted using the approach in that test, so I implemented something hopefully more robust (it explicitly finds an instruction with line==0 and sets a breakpoint there). I think that should be compatible with your changes here.
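
As a rough illustration of "find an instruction with line==0", the sketch below scans a simplified line table for the first row attributed to line zero. It is not the test's implementation and does not use the LLDB API; `LineEntry` and `findLineZeroAddress` are hypothetical names, and the table layout is an assumption.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified line-table row; not an LLDB or LLVM type.
struct LineEntry {
  std::uint64_t Address;
  unsigned Line;
};

// Return the address of the first row attributed to line zero, if any.
std::optional<std::uint64_t>
findLineZeroAddress(const std::vector<LineEntry> &Table) {
  for (const LineEntry &E : Table)
    if (E.Line == 0)
      return E.Address;
  return std::nullopt;
}
```

Choosing the breakpoint address this way does not depend on where prologue_end lands, which is presumably why such a test would be unaffected by this patch.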

@hokein
Collaborator

hokein commented Nov 13, 2024

This patch introduces undefined behavior in clang:

$ cat /t/t.ii
int a() {}
$ ./bin/clang -O2 -g -c /t/t.ii 
/t/t.ii:1:10: warning: non-void function does not return a value [-Wreturn-type]
    1 | int a() {}
      |          ^
/workspace/llvm-project/llvm/include/llvm/ADT/ilist_iterator.h:169:12: runtime error: reference binding to null pointer of type 'const llvm::MachineInstr'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /workspace/llvm-project/llvm/include/llvm/ADT/ilist_iterator.h:169:12 in 

jmorse added a commit to jmorse/llvm-project that referenced this pull request Nov 13, 2024
In 39b2979 Pavel has kindly refined the implementation of a test in such
a way that it doesn't trip up over this patch -- the test wishes to
stimulate LLDBs presentation of line0 locations, rather than wanting to
always step on line-zero on entry to artificial_location.c. As that's what
was tripping up this change, reapply.

Original commit message follows.

[DWARF] Emit a worst-case prologue_end flag for pathological inputs (llvm#107849)

prologue_end usually indicates where the end of the function-initialization
lies, and is where debuggers usually choose to put the initial breakpoint
for a function. Our current algorithm piggy-backs it on the first available
source-location: which doesn't necessarily have anything to do with the
start of the function.

To avoid this in heavily-optimised code that lacks many useful source
locations, pick a worst-case "if all else fails" prologue_end location, of
the first instruction that appears to do meaningful computation. It'll be
given the function-scope line number, which should run-on from the start of
the function anyway. This means if your code is completely inverted by the
optimiser, you can at least put a breakpoint at the _start_ like you
expect, even if it's difficult to then step through.

This patch also attempts to preserve some good behaviour we have without
optimisations -- at O0, if the prologue immediately falls into a loop body
without any computation happening, then prologue_end lands at the start of
that loop. This is desirable; but does mean we need to do more work to
detect and support those situations.
@jmorse
Member Author

jmorse commented Nov 13, 2024

@labath much appreciated, I've got a rebased reapplication in #116084 where I'll use the CI to check this passes.

@hokein would there be any further context than that -- presumably UBSan can produce a stack trace? There are a lot of references to MachineInstr here, and not a lot to go on.

@jmorse
Member Author

jmorse commented Nov 14, 2024

Rebased patch passes the online CI, and a ubsan stage2 build of clang completed without incident (assuming I've built it correctly), and didn't fire on the given reproducer. Possibly it's a case of mistaken commit identity, or composition with something else that landed at the time?

I'll re-land as the reported issues seem clean to me -- @hokein please do let me know if this reoccurs.

jmorse added a commit that referenced this pull request Nov 14, 2024
@hokein
Collaborator

hokein commented Nov 14, 2024

> @hokein would there be any further context than that -- presumably UBSan can produce a stack trace? There are a lot of references to MachineInstr here, and not a lot to go on.

Unfortunately, the UBSan-built clang didn't give a stack trace. I see a normal assertion-enabled clang (built on bf483dd) also crashes; stack trace:

$ ./bin/clang -O2 -g -c /t/t.ii
/t/t.ii:1:10: warning: non-void function does not return a value [-Wreturn-type]
    1 | int a() {}
      |          ^
clang: /workspace/llvm-project/llvm/include/llvm/ADT/ilist_iterator.h:168: reference llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::MachineBasicBlock, true, false, void, false, void>, false, true>::operator*() const [OptionsT = llvm::ilist_detail::node_options<llvm::MachineBasicBlock, true, false, void, false, void>, IsReverse = false, IsConst = true]: Assertion `!NodePtr->isKnownSentinel()' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: ./bin/clang -O2 -g -c /t/t.ii
1.      <eof> parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module '/t/t.ii'.
4.      Running pass 'X86 Assembly Printer' on function '@_Z1av'
 #0 0x000055848fd8e988 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /workspace/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x000055848fd8ca90 llvm::sys::RunSignalHandlers() /workspace/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x000055848fd08326 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) /workspace/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:73:5
 #3 0x000055848fd08326 CrashRecoverySignalHandler(int) /workspace/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:390:51
 #4 0x00007f09fac591a0 (/lib/x86_64-linux-gnu/libc.so.6+0x3d1a0)
 #5 0x00007f09faca70ec __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #6 0x00007f09fac59102 gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #7 0x00007f09fac424f2 abort ./stdlib/abort.c:81:7
 #8 0x00007f09fac42415 _nl_load_domain ./intl/loadmsgcat.c:1177:9
 #9 0x00007f09fac51d32 (/lib/x86_64-linux-gnu/libc.so.6+0x35d32)
#10 0x0000558490a8ec98 findPrologueEndLoc(llvm::MachineFunction const*) /workspace/llvm-project/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:0:32
#11 0x0000558490a8ec98 llvm::DwarfDebug::emitInitialLocDirective(llvm::MachineFunction const&, unsigned int) /workspace/llvm-project/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2279:53
#12 0x0000558490a8f0c2 llvm::DwarfDebug::beginFunctionImpl(llvm::MachineFunction const*) /workspace/llvm-project/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2327:16
#13 0x0000558490a61b0e llvm::AsmPrinter::emitFunctionHeader() /workspace/llvm-project/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:1057:37
#14 0x0000558490a645cd llvm::AsmPrinter::emitFunctionBody() /workspace/llvm-project/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:1768:3
#15 0x000055848ebddaf1 llvm::X86AsmPrinter::runOnMachineFunction(llvm::MachineFunction&) /workspace/llvm-project/llvm/lib/Target/X86/X86AsmPrinter.cpp:91:3
#16 0x000055848f3788f6 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /workspace/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:94:13
#17 0x000055848f817ae6 llvm::FPPassManager::runOnFunction(llvm::Function&) /workspace/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1406:27
#18 0x000055848f81e432 llvm::FPPassManager::runOnModule(llvm::Module&) /workspace/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1452:13
#19 0x000055848f818255 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /workspace/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1521:27
#20 0x000055848f818255 llvm::legacy::PassManagerImpl::run(llvm::Module&) /workspace/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:539:44
#21 0x000055848ff7ea6e llvm::TimeTraceScope::~TimeTraceScope() /workspace/llvm-project/llvm/include/llvm/Support/TimeProfiler.h:206:9
#22 0x000055848ff7ea6e (anonymous namespace)::EmitAssemblyHelper::RunCodegenPipeline(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>&, std::unique_ptr<llvm::ToolOutputFile, std::default_delete<llvm::ToolOutputFile>>&) /workspace/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1202:3
#23 0x000055848ff7ea6e (anonymous namespace)::EmitAssemblyHelper::EmitAssembly(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) /workspace/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1224:3
#24 0x000055848ff7ea6e clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) /workspace/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1387:13
#25 0x00005584904f8c00 std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>::~unique_ptr() /usr/bin/../lib/gcc/x86_64-linux-gnu/14/../../../../include/c++/14/bits/unique_ptr.h:397:6
#26 0x00005584904f8c00 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) /workspace/llvm-project/clang/lib/CodeGen/CodeGenAction.cpp:354:3

@jmorse
Member Author

jmorse commented Nov 14, 2024

I've managed to replicate it locally now (I must have been doing something wrong before), should be an easy fix once I've pinned it down, hopefully imminently.

@jmorse
Member Author

jmorse commented Nov 14, 2024

Looks like this is a proper problem when there's no body to a function (not something I'd considered possible) -- I'll add a filter to emitInitialLocDirective once it's tested.
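
For readers following along, the failure mode looks like dereferencing the begin iterator of an empty block list. The sketch below shows the general shape of such a guard on simplified containers; it is an assumption-laden illustration, not the actual fix, and `Instr`, `Block`, `Function` and `firstEntryLine` are hypothetical names rather than LLVM types.

```cpp
#include <optional>
#include <vector>

// Hypothetical, simplified models; not the LLVM MachineFunction types.
struct Instr {
  unsigned Line;
};
using Block = std::vector<Instr>;
using Function = std::vector<Block>;

// Return the line of the first instruction in the entry block, or
// std::nullopt when there is nothing to inspect.
std::optional<unsigned> firstEntryLine(const Function &F) {
  // Guard: calling front() on an empty container is undefined behaviour,
  // which is the kind of failure the sentinel assertion reported.
  if (F.empty() || F.front().empty())
    return std::nullopt;
  return F.front().front().Line;
}
```

Returning an empty optional forces the caller to handle the "function with no body" case explicitly instead of assuming an entry block exists.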

@mikaelholmen
Collaborator

We see this crash downstream as well. Is a fix far away?

@jmorse
Member Author

jmorse commented Nov 14, 2024

Sorry for the bother; fixed in 251958f. @OCHyams could you do some post-commit review on that?

@mikaelholmen
Collaborator

> Sorry for the bother; fixed in 251958f. @OCHyams could you do some post-commit review on that?

Thank you! I've verified that the fix solves the problems we saw.

@OCHyams
Contributor

OCHyams commented Nov 14, 2024

> Sorry for the bother; fixed in 251958f. @OCHyams could you do some post-commit review on that?

LGTM (commented there too)
