Skip to content

[DebugInfo] Fully implement DWARF issue 180201.1 #149226

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 21, 2025

Conversation

slinder1
Copy link
Contributor

Finish making LLVM's implementation of DW_LNCT_LLVM_source conform to the final accepted version of DW_LNCT_source from
https://dwarfstd.org/issues/180201.1.html

This is effectively a continuation of a few commits which have moved in this direction already, including:

  • c9cb4fc [DebugInfo] Store optional DIFile::Source as pointer
  • 87e22bd Allow for mixing source/no-source DIFiles in one CU

This patch:

  • Teaches LLParser that there is a distinction between an empty and an absent "source:" field on DIFile.
  • Makes printing the "source:" field in AsmWriter conditional on it being present, instead of being conditional on it being non-empty.
  • Teaches MC to map an empty-but-present source field to "\n" (which is ambiguous, making the source strings "" and "\n" indistinguishable, but that's what the DWARF issue specifies).

Add a test for round-tripping an empty source field through assembler/bitcode.

Extend the test for the actual DWARF generation so it covers all of the cases (absent, present-but-empty, present-and-ambiguously-single-newline, present).

Finish making LLVM's implementation of `DW_LNCT_LLVM_source` conform to the
final accepted version of `DW_LNCT_source` from
https://dwarfstd.org/issues/180201.1.html

This is effectively a continuation of a few commits which have moved in this
direction already, including:

c9cb4fc [DebugInfo] Store optional DIFile::Source as pointer
87e22bd Allow for mixing source/no-source DIFiles in one CU

This patch:

* Teaches LLParser that there is a distinction between an empty and an absent
  "source:" field on DIFile.
* Makes printing the "source:" field in AsmWriter conditional on it
  being present, instead of being conditional on it being non-empty.
* Teaches MC to map an empty-but-present source field to "\n" (which is
  ambiguous, making the source strings "" and "\n" indistinguishable,
  but that's what the DWARF issue specifies).

Add a test for round-tripping an empty source field through assembler/bitcode.

Extend the test for the actual DWARF generation so it covers all of the cases
(absent, present-but-empty, present-and-ambiguously-single-newline, present).
@llvmbot llvmbot added debuginfo mc Machine (object) code llvm:ir labels Jul 16, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 16, 2025

@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-llvm-ir

Author: Scott Linder (slinder1)

Changes

Finish making LLVM's implementation of DW_LNCT_LLVM_source conform to the final accepted version of DW_LNCT_source from
https://dwarfstd.org/issues/180201.1.html

This is effectively a continuation of a few commits which have moved in this direction already, including:

  • c9cb4fc [DebugInfo] Store optional DIFile::Source as pointer
  • 87e22bd Allow for mixing source/no-source DIFiles in one CU

This patch:

  • Teaches LLParser that there is a distinction between an empty and an absent "source:" field on DIFile.
  • Makes printing the "source:" field in AsmWriter conditional on it being present, instead of being conditional on it being non-empty.
  • Teaches MC to map an empty-but-present source field to "\n" (which is ambiguous, making the source strings "" and "\n" indistinguishable, but that's what the DWARF issue specifies).

Add a test for round-tripping an empty source field through assembler/bitcode.

Extend the test for the actual DWARF generation so it covers all of the cases (absent, present-but-empty, present-and-ambiguously-single-newline, present).


Full diff: https://github.com/llvm/llvm-project/pull/149226.diff

5 Files Affected:

  • (modified) llvm/lib/AsmParser/LLParser.cpp (+21-8)
  • (modified) llvm/lib/IR/AsmWriter.cpp (+3-2)
  • (modified) llvm/lib/MC/MCDwarf.cpp (+9-2)
  • (added) llvm/test/Assembler/difile-empty-source.ll (+12)
  • (modified) llvm/test/DebugInfo/Generic/mixed-source.ll (+44-14)
diff --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index b7f6950f679ef..ebdac354179f2 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -4783,9 +4783,13 @@ struct MDField : public MDFieldImpl<Metadata *> {
 };
 
 struct MDStringField : public MDFieldImpl<MDString *> {
-  bool AllowEmpty;
-  MDStringField(bool AllowEmpty = true)
-      : ImplTy(nullptr), AllowEmpty(AllowEmpty) {}
+  enum class EmptyIs {
+    Null,  //< Allow empty input string, map to nullptr
+    Empty, //< Allow empty input string, map to an empty MDString
+    Error, //< Disallow empty string, map to an error
+  } EmptyIs;
+  MDStringField(enum EmptyIs EmptyIs = EmptyIs::Null)
+      : ImplTy(nullptr), EmptyIs(EmptyIs) {}
 };
 
 struct MDFieldList : public MDFieldImpl<SmallVector<Metadata *, 4>> {
@@ -5257,10 +5261,19 @@ bool LLParser::parseMDField(LocTy Loc, StringRef Name, MDStringField &Result) {
   if (parseStringConstant(S))
     return true;
 
-  if (!Result.AllowEmpty && S.empty())
-    return error(ValueLoc, "'" + Name + "' cannot be empty");
+  if (S.empty()) {
+    switch (Result.EmptyIs) {
+    case MDStringField::EmptyIs::Null:
+      Result.assign(nullptr);
+      return false;
+    case MDStringField::EmptyIs::Empty:
+      break;
+    case MDStringField::EmptyIs::Error:
+      return error(ValueLoc, "'" + Name + "' cannot be empty");
+    }
+  }
 
-  Result.assign(S.empty() ? nullptr : MDString::get(Context, S));
+  Result.assign(MDString::get(Context, S));
   return false;
 }
 
@@ -5778,7 +5791,7 @@ bool LLParser::parseDIFile(MDNode *&Result, bool IsDistinct) {
   REQUIRED(directory, MDStringField, );                                        \
   OPTIONAL(checksumkind, ChecksumKindField, (DIFile::CSK_MD5));                \
   OPTIONAL(checksum, MDStringField, );                                         \
-  OPTIONAL(source, MDStringField, );
+  OPTIONAL(source, MDStringField, (MDStringField::EmptyIs::Empty));
   PARSE_MD_FIELDS();
 #undef VISIT_MD_FIELDS
 
@@ -6062,7 +6075,7 @@ bool LLParser::parseDITemplateValueParameter(MDNode *&Result, bool IsDistinct) {
 ///                         declaration: !4, align: 8)
 bool LLParser::parseDIGlobalVariable(MDNode *&Result, bool IsDistinct) {
 #define VISIT_MD_FIELDS(OPTIONAL, REQUIRED)                                    \
-  OPTIONAL(name, MDStringField, (/* AllowEmpty */ false));                     \
+  OPTIONAL(name, MDStringField, (MDStringField::EmptyIs::Error));              \
   OPTIONAL(scope, MDField, );                                                  \
   OPTIONAL(linkageName, MDStringField, );                                      \
   OPTIONAL(file, MDField, );                                                   \
diff --git a/llvm/lib/IR/AsmWriter.cpp b/llvm/lib/IR/AsmWriter.cpp
index 145ef10f28f35..18d67f0c3a006 100644
--- a/llvm/lib/IR/AsmWriter.cpp
+++ b/llvm/lib/IR/AsmWriter.cpp
@@ -2398,8 +2398,9 @@ static void writeDIFile(raw_ostream &Out, const DIFile *N, AsmWriterContext &) {
   // Print all values for checksum together, or not at all.
   if (N->getChecksum())
     Printer.printChecksum(*N->getChecksum());
-  Printer.printString("source", N->getSource().value_or(StringRef()),
-                      /* ShouldSkipEmpty */ true);
+  if (N->getSource())
+    Printer.printString("source", *N->getSource(),
+                        /* ShouldSkipEmpty */ false);
   Out << ")";
 }
 
diff --git a/llvm/lib/MC/MCDwarf.cpp b/llvm/lib/MC/MCDwarf.cpp
index b1dced7e2cda3..e7c0d37e8f99b 100644
--- a/llvm/lib/MC/MCDwarf.cpp
+++ b/llvm/lib/MC/MCDwarf.cpp
@@ -447,10 +447,17 @@ static void emitOneV5FileEntry(MCStreamer *MCOS, const MCDwarfFile &DwarfFile,
         StringRef(reinterpret_cast<const char *>(Cksum.data()), Cksum.size()));
   }
   if (HasAnySource) {
+    // From https://dwarfstd.org/issues/180201.1.html
+    // * The value is an empty null-terminated string if no source is available
+    StringRef Source = DwarfFile.Source.value_or(StringRef());
+    // * If the source is available but is an empty file then the value is a
+    // null-terminated single "\n".
+    if (DwarfFile.Source && DwarfFile.Source->empty())
+      Source = "\n";
     if (LineStr)
-      LineStr->emitRef(MCOS, DwarfFile.Source.value_or(StringRef()));
+      LineStr->emitRef(MCOS, Source);
     else {
-      MCOS->emitBytes(DwarfFile.Source.value_or(StringRef())); // Source and...
+      MCOS->emitBytes(Source);             // Source and...
       MCOS->emitBytes(StringRef("\0", 1)); // its null terminator.
     }
   }
diff --git a/llvm/test/Assembler/difile-empty-source.ll b/llvm/test/Assembler/difile-empty-source.ll
new file mode 100644
index 0000000000000..11587d8f1afdc
--- /dev/null
+++ b/llvm/test/Assembler/difile-empty-source.ll
@@ -0,0 +1,12 @@
+; RUN: llvm-as < %s | llvm-dis | llvm-as | llvm-dis | FileCheck %s
+; RUN: verify-uselistorder
+
+; CHECK: !DIFile({{.*}}, source: "")
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, emissionKind: FullDebug)
+!1 = !DIFile(filename: "-", directory: "/", checksumkind: CSK_MD5, checksum: "d41d8cd98f00b204e9800998ecf8427e", source: "")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
diff --git a/llvm/test/DebugInfo/Generic/mixed-source.ll b/llvm/test/DebugInfo/Generic/mixed-source.ll
index d5586f8fc5389..ee3598f1fdd13 100644
--- a/llvm/test/DebugInfo/Generic/mixed-source.ll
+++ b/llvm/test/DebugInfo/Generic/mixed-source.ll
@@ -5,36 +5,66 @@
 
 ; CHECK: include_directories[  0] = "dir"
 ; CHECK-NEXT: file_names[  0]:
+; CHECK-NEXT:            name: "main.c"
+; CHECK-NEXT:       dir_index: 0
+; CHECK-NOT:           source:
+; CHECK-NEXT: file_names[  1]:
 ; CHECK-NEXT:            name: "foo.c"
 ; CHECK-NEXT:       dir_index: 0
 ; CHECK-NEXT:          source: "void foo() { }\n"
-; CHECK-NEXT: file_names[  1]:
-; CHECK-NEXT:            name: "bar.h"
+; CHECK-NEXT: file_names[  2]:
+; CHECK-NEXT:            name: "newline.h"
+; CHECK-NEXT:       dir_index: 0
+; CHECK-NEXT:          source: "\n"
+; CHECK-NEXT: file_names[  3]:
+; CHECK-NEXT:            name: "empty.h"
+; CHECK-NEXT:       dir_index: 0
+; CHECK-NEXT:          source: "\n"
+; CHECK-NEXT: file_names[  4]:
+; CHECK-NEXT:            name: "absent.h"
 ; CHECK-NEXT:       dir_index: 0
 ; CHECK-NOT:           source:
 
 ; Test that DIFiles mixing source and no-source within a DICompileUnit works.
 
-define dso_local void @foo() !dbg !5 {
+define dso_local void @foo() !dbg !6 {
   ret void, !dbg !7
 }
 
-define dso_local void @bar() !dbg !6 {
-  ret void, !dbg !8
+define dso_local void @newline() !dbg !9 {
+  ret void, !dbg !10
 }
 
-!llvm.dbg.cu = !{!4}
+define dso_local void @empty() !dbg !12 {
+  ret void, !dbg !13
+}
+
+define dso_local void @absent() !dbg !15 {
+  ret void, !dbg !16
+}
+
+!llvm.dbg.cu = !{!2}
 !llvm.module.flags = !{!0, !1}
 
 !0 = !{i32 2, !"Dwarf Version", i32 5}
 !1 = !{i32 2, !"Debug Info Version", i32 3}
 
-!2 = !DIFile(filename: "foo.c", directory: "dir", source: "void foo() { }\0A")
-!3 = !DIFile(filename: "bar.h", directory: "dir")
+!2 = distinct !DICompileUnit(language: DW_LANG_C99, emissionKind: FullDebug, file: !4)
+!3 = !DISubroutineType(types: !{})
+!4 = !DIFile(filename: "main.c", directory: "dir")
+
+!5 = !DIFile(filename: "foo.c", directory: "dir", source: "void foo() { }\0A")
+!6 = distinct !DISubprogram(name: "foo", file: !5, line: 1, type: !3, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !2)
+!7 = !DILocation(line: 1, scope: !6)
+
+!8 = !DIFile(filename: "newline.h", directory: "dir", source: "\0A")
+!9 = distinct !DISubprogram(name: "newline", file: !8, line: 1, type: !3, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !2)
+!10 = !DILocation(line: 1, scope: !9)
+
+!11 = !DIFile(filename: "empty.h", directory: "dir", source: "")
+!12 = distinct !DISubprogram(name: "empty", file: !11, line: 1, type: !3, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !2)
+!13 = !DILocation(line: 1, scope: !12)
 
-!4 = distinct !DICompileUnit(language: DW_LANG_C99, emissionKind: FullDebug, file: !2)
-!5 = distinct !DISubprogram(name: "foo", file: !2, line: 1, type: !9, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !4)
-!6 = distinct !DISubprogram(name: "bar", file: !3, line: 1, type: !9, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !4)
-!7 = !DILocation(line: 1, scope: !5)
-!8 = !DILocation(line: 1, scope: !6)
-!9 = !DISubroutineType(types: !{})
+!14 = !DIFile(filename: "absent.h", directory: "dir")
+!15 = distinct !DISubprogram(name: "absent", file: !14, line: 1, type: !3, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !2)
+!16 = !DILocation(line: 1, scope: !15)

@llvmbot
Copy link
Member

llvmbot commented Jul 16, 2025

@llvm/pr-subscribers-mc

Author: Scott Linder (slinder1)

Changes

Finish making LLVM's implementation of DW_LNCT_LLVM_source conform to the final accepted version of DW_LNCT_source from
https://dwarfstd.org/issues/180201.1.html

This is effectively a continuation of a few commits which have moved in this direction already, including:

  • c9cb4fc [DebugInfo] Store optional DIFile::Source as pointer
  • 87e22bd Allow for mixing source/no-source DIFiles in one CU

This patch:

  • Teaches LLParser that there is a distinction between an empty and an absent "source:" field on DIFile.
  • Makes printing the "source:" field in AsmWriter conditional on it being present, instead of being conditional on it being non-empty.
  • Teaches MC to map an empty-but-present source field to "\n" (which is ambiguous, making the source strings "" and "\n" indistinguishable, but that's what the DWARF issue specifies).

Add a test for round-tripping an empty source field through assembler/bitcode.

Extend the test for the actual DWARF generation so it covers all of the cases (absent, present-but-empty, present-and-ambiguously-single-newline, present).


Full diff: https://github.com/llvm/llvm-project/pull/149226.diff

5 Files Affected:

  • (modified) llvm/lib/AsmParser/LLParser.cpp (+21-8)
  • (modified) llvm/lib/IR/AsmWriter.cpp (+3-2)
  • (modified) llvm/lib/MC/MCDwarf.cpp (+9-2)
  • (added) llvm/test/Assembler/difile-empty-source.ll (+12)
  • (modified) llvm/test/DebugInfo/Generic/mixed-source.ll (+44-14)
diff --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index b7f6950f679ef..ebdac354179f2 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -4783,9 +4783,13 @@ struct MDField : public MDFieldImpl<Metadata *> {
 };
 
 struct MDStringField : public MDFieldImpl<MDString *> {
-  bool AllowEmpty;
-  MDStringField(bool AllowEmpty = true)
-      : ImplTy(nullptr), AllowEmpty(AllowEmpty) {}
+  enum class EmptyIs {
+    Null,  //< Allow empty input string, map to nullptr
+    Empty, //< Allow empty input string, map to an empty MDString
+    Error, //< Disallow empty string, map to an error
+  } EmptyIs;
+  MDStringField(enum EmptyIs EmptyIs = EmptyIs::Null)
+      : ImplTy(nullptr), EmptyIs(EmptyIs) {}
 };
 
 struct MDFieldList : public MDFieldImpl<SmallVector<Metadata *, 4>> {
@@ -5257,10 +5261,19 @@ bool LLParser::parseMDField(LocTy Loc, StringRef Name, MDStringField &Result) {
   if (parseStringConstant(S))
     return true;
 
-  if (!Result.AllowEmpty && S.empty())
-    return error(ValueLoc, "'" + Name + "' cannot be empty");
+  if (S.empty()) {
+    switch (Result.EmptyIs) {
+    case MDStringField::EmptyIs::Null:
+      Result.assign(nullptr);
+      return false;
+    case MDStringField::EmptyIs::Empty:
+      break;
+    case MDStringField::EmptyIs::Error:
+      return error(ValueLoc, "'" + Name + "' cannot be empty");
+    }
+  }
 
-  Result.assign(S.empty() ? nullptr : MDString::get(Context, S));
+  Result.assign(MDString::get(Context, S));
   return false;
 }
 
@@ -5778,7 +5791,7 @@ bool LLParser::parseDIFile(MDNode *&Result, bool IsDistinct) {
   REQUIRED(directory, MDStringField, );                                        \
   OPTIONAL(checksumkind, ChecksumKindField, (DIFile::CSK_MD5));                \
   OPTIONAL(checksum, MDStringField, );                                         \
-  OPTIONAL(source, MDStringField, );
+  OPTIONAL(source, MDStringField, (MDStringField::EmptyIs::Empty));
   PARSE_MD_FIELDS();
 #undef VISIT_MD_FIELDS
 
@@ -6062,7 +6075,7 @@ bool LLParser::parseDITemplateValueParameter(MDNode *&Result, bool IsDistinct) {
 ///                         declaration: !4, align: 8)
 bool LLParser::parseDIGlobalVariable(MDNode *&Result, bool IsDistinct) {
 #define VISIT_MD_FIELDS(OPTIONAL, REQUIRED)                                    \
-  OPTIONAL(name, MDStringField, (/* AllowEmpty */ false));                     \
+  OPTIONAL(name, MDStringField, (MDStringField::EmptyIs::Error));              \
   OPTIONAL(scope, MDField, );                                                  \
   OPTIONAL(linkageName, MDStringField, );                                      \
   OPTIONAL(file, MDField, );                                                   \
diff --git a/llvm/lib/IR/AsmWriter.cpp b/llvm/lib/IR/AsmWriter.cpp
index 145ef10f28f35..18d67f0c3a006 100644
--- a/llvm/lib/IR/AsmWriter.cpp
+++ b/llvm/lib/IR/AsmWriter.cpp
@@ -2398,8 +2398,9 @@ static void writeDIFile(raw_ostream &Out, const DIFile *N, AsmWriterContext &) {
   // Print all values for checksum together, or not at all.
   if (N->getChecksum())
     Printer.printChecksum(*N->getChecksum());
-  Printer.printString("source", N->getSource().value_or(StringRef()),
-                      /* ShouldSkipEmpty */ true);
+  if (N->getSource())
+    Printer.printString("source", *N->getSource(),
+                        /* ShouldSkipEmpty */ false);
   Out << ")";
 }
 
diff --git a/llvm/lib/MC/MCDwarf.cpp b/llvm/lib/MC/MCDwarf.cpp
index b1dced7e2cda3..e7c0d37e8f99b 100644
--- a/llvm/lib/MC/MCDwarf.cpp
+++ b/llvm/lib/MC/MCDwarf.cpp
@@ -447,10 +447,17 @@ static void emitOneV5FileEntry(MCStreamer *MCOS, const MCDwarfFile &DwarfFile,
         StringRef(reinterpret_cast<const char *>(Cksum.data()), Cksum.size()));
   }
   if (HasAnySource) {
+    // From https://dwarfstd.org/issues/180201.1.html
+    // * The value is an empty null-terminated string if no source is available
+    StringRef Source = DwarfFile.Source.value_or(StringRef());
+    // * If the source is available but is an empty file then the value is a
+    // null-terminated single "\n".
+    if (DwarfFile.Source && DwarfFile.Source->empty())
+      Source = "\n";
     if (LineStr)
-      LineStr->emitRef(MCOS, DwarfFile.Source.value_or(StringRef()));
+      LineStr->emitRef(MCOS, Source);
     else {
-      MCOS->emitBytes(DwarfFile.Source.value_or(StringRef())); // Source and...
+      MCOS->emitBytes(Source);             // Source and...
       MCOS->emitBytes(StringRef("\0", 1)); // its null terminator.
     }
   }
diff --git a/llvm/test/Assembler/difile-empty-source.ll b/llvm/test/Assembler/difile-empty-source.ll
new file mode 100644
index 0000000000000..11587d8f1afdc
--- /dev/null
+++ b/llvm/test/Assembler/difile-empty-source.ll
@@ -0,0 +1,12 @@
+; RUN: llvm-as < %s | llvm-dis | llvm-as | llvm-dis | FileCheck %s
+; RUN: verify-uselistorder
+
+; CHECK: !DIFile({{.*}}, source: "")
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, emissionKind: FullDebug)
+!1 = !DIFile(filename: "-", directory: "/", checksumkind: CSK_MD5, checksum: "d41d8cd98f00b204e9800998ecf8427e", source: "")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
diff --git a/llvm/test/DebugInfo/Generic/mixed-source.ll b/llvm/test/DebugInfo/Generic/mixed-source.ll
index d5586f8fc5389..ee3598f1fdd13 100644
--- a/llvm/test/DebugInfo/Generic/mixed-source.ll
+++ b/llvm/test/DebugInfo/Generic/mixed-source.ll
@@ -5,36 +5,66 @@
 
 ; CHECK: include_directories[  0] = "dir"
 ; CHECK-NEXT: file_names[  0]:
+; CHECK-NEXT:            name: "main.c"
+; CHECK-NEXT:       dir_index: 0
+; CHECK-NOT:           source:
+; CHECK-NEXT: file_names[  1]:
 ; CHECK-NEXT:            name: "foo.c"
 ; CHECK-NEXT:       dir_index: 0
 ; CHECK-NEXT:          source: "void foo() { }\n"
-; CHECK-NEXT: file_names[  1]:
-; CHECK-NEXT:            name: "bar.h"
+; CHECK-NEXT: file_names[  2]:
+; CHECK-NEXT:            name: "newline.h"
+; CHECK-NEXT:       dir_index: 0
+; CHECK-NEXT:          source: "\n"
+; CHECK-NEXT: file_names[  3]:
+; CHECK-NEXT:            name: "empty.h"
+; CHECK-NEXT:       dir_index: 0
+; CHECK-NEXT:          source: "\n"
+; CHECK-NEXT: file_names[  4]:
+; CHECK-NEXT:            name: "absent.h"
 ; CHECK-NEXT:       dir_index: 0
 ; CHECK-NOT:           source:
 
 ; Test that DIFiles mixing source and no-source within a DICompileUnit works.
 
-define dso_local void @foo() !dbg !5 {
+define dso_local void @foo() !dbg !6 {
   ret void, !dbg !7
 }
 
-define dso_local void @bar() !dbg !6 {
-  ret void, !dbg !8
+define dso_local void @newline() !dbg !9 {
+  ret void, !dbg !10
 }
 
-!llvm.dbg.cu = !{!4}
+define dso_local void @empty() !dbg !12 {
+  ret void, !dbg !13
+}
+
+define dso_local void @absent() !dbg !15 {
+  ret void, !dbg !16
+}
+
+!llvm.dbg.cu = !{!2}
 !llvm.module.flags = !{!0, !1}
 
 !0 = !{i32 2, !"Dwarf Version", i32 5}
 !1 = !{i32 2, !"Debug Info Version", i32 3}
 
-!2 = !DIFile(filename: "foo.c", directory: "dir", source: "void foo() { }\0A")
-!3 = !DIFile(filename: "bar.h", directory: "dir")
+!2 = distinct !DICompileUnit(language: DW_LANG_C99, emissionKind: FullDebug, file: !4)
+!3 = !DISubroutineType(types: !{})
+!4 = !DIFile(filename: "main.c", directory: "dir")
+
+!5 = !DIFile(filename: "foo.c", directory: "dir", source: "void foo() { }\0A")
+!6 = distinct !DISubprogram(name: "foo", file: !5, line: 1, type: !3, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !2)
+!7 = !DILocation(line: 1, scope: !6)
+
+!8 = !DIFile(filename: "newline.h", directory: "dir", source: "\0A")
+!9 = distinct !DISubprogram(name: "newline", file: !8, line: 1, type: !3, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !2)
+!10 = !DILocation(line: 1, scope: !9)
+
+!11 = !DIFile(filename: "empty.h", directory: "dir", source: "")
+!12 = distinct !DISubprogram(name: "empty", file: !11, line: 1, type: !3, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !2)
+!13 = !DILocation(line: 1, scope: !12)
 
-!4 = distinct !DICompileUnit(language: DW_LANG_C99, emissionKind: FullDebug, file: !2)
-!5 = distinct !DISubprogram(name: "foo", file: !2, line: 1, type: !9, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !4)
-!6 = distinct !DISubprogram(name: "bar", file: !3, line: 1, type: !9, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !4)
-!7 = !DILocation(line: 1, scope: !5)
-!8 = !DILocation(line: 1, scope: !6)
-!9 = !DISubroutineType(types: !{})
+!14 = !DIFile(filename: "absent.h", directory: "dir")
+!15 = distinct !DISubprogram(name: "absent", file: !14, line: 1, type: !3, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !2)
+!16 = !DILocation(line: 1, scope: !15)

@hahnjo hahnjo removed their request for review July 21, 2025 07:20
@hahnjo
Copy link
Member

hahnjo commented Jul 21, 2025

Not really knowledgeable for reviewing DWARF changes

@jmorse
Copy link
Member

jmorse commented Jul 21, 2025

This broadly LGTM; it'd be great if someone with more context looks at it too.

(The empty-string versus "\n" ambiguity is awkward; but it's also explicit in the spec, so presumably there's a reason for doing it).

@pogo59
Copy link
Collaborator

pogo59 commented Jul 21, 2025

(The empty-string versus "\n" ambiguity is awkward; but it's also explicit in the spec, so presumably there's a reason for doing it).

I remember the topic, although not the precise reasoning; had something to do with how a toolchain might generate source on the fly. Some languages mandate that source ends with a newline, or something along those lines.

Copy link
Collaborator

@pogo59 pogo59 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@slinder1
Copy link
Contributor Author

I agree the ambiguity isn't ideal, but the cases where I could imagine it being an issue seem unlikely to come up in practice.

Thinking about this just now, if one includes a checksum alongside the embedded source it seems like it is actually enough to disambiguate it. The consumer can compare against the "null" hash digest, e.g. printf '' | md5sum - # d41d8cd98f00b204e9800998ecf8427e

@slinder1 slinder1 merged commit 354944d into llvm:main Jul 21, 2025
13 checks passed
@slinder1 slinder1 deleted the difile-source-empty branch July 21, 2025 22:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
debuginfo llvm:ir mc Machine (object) code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants