@aeubanks aeubanks commented Aug 7, 2024

Useful with other infrastructure that consumes LLVM statistics, to get an idea of the distribution of section sizes.

The breakdown into section types is subject to change; this is just an initial attempt at gathering some stats.

Example stats compiling X86ISelLowering.cpp (-g1):

        "elf-object-writer.AllocROBytes": 308268,
        "elf-object-writer.AllocRWBytes": 6240,
        "elf-object-writer.AllocTextBytes": 1659203,
        "elf-object-writer.DebugBytes": 3180386,
        "elf-object-writer.OtherBytes": 5862,
        "elf-object-writer.RelocationBytes": 2623440,
        "elf-object-writer.StrtabBytes": 228599,
        "elf-object-writer.SymtabBytes": 120336,
        "elf-object-writer.UnwindBytes": 85216,

@aeubanks aeubanks requested review from rnk and zmodem August 7, 2024 20:33
@llvmbot llvmbot added backend:X86 llvm:mc Machine (object) code labels Aug 7, 2024

llvmbot commented Aug 7, 2024

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-mc

Author: Arthur Eubanks (aeubanks)

Changes

Useful with other infrastructure that consumes LLVM statistics, to get an idea of the distribution of section sizes.

The breakdown into section types is subject to change; this is just an initial attempt at gathering some stats.


Full diff: https://github.com/llvm/llvm-project/pull/102363.diff

2 Files Affected:

  • (modified) llvm/lib/MC/ELFObjectWriter.cpp (+48-2)
  • (added) llvm/test/CodeGen/X86/section-stats.ll (+7)
diff --git a/llvm/lib/MC/ELFObjectWriter.cpp b/llvm/lib/MC/ELFObjectWriter.cpp
index c40a074137ab2..4d871e4c4bf5e 100644
--- a/llvm/lib/MC/ELFObjectWriter.cpp
+++ b/llvm/lib/MC/ELFObjectWriter.cpp
@@ -14,6 +14,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/Statistic.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/Twine.h"
@@ -62,10 +63,23 @@
 
 using namespace llvm;
 
-#undef  DEBUG_TYPE
-#define DEBUG_TYPE "reloc-info"
+#define DEBUG_TYPE "elf-object-writer"
 
 namespace {
+namespace stats {
+
+STATISTIC(AllocTextBytes, "Total size of SHF_ALLOC text sections");
+STATISTIC(AllocROBytes, "Total size of SHF_ALLOC readonly sections");
+STATISTIC(AllocRWBytes, "Total size of SHF_ALLOC read-write sections");
+STATISTIC(StrtabBytes, "Total size of SHT_STRTAB sections");
+STATISTIC(SymtabBytes, "Total size of SHT_SYMTAB sections");
+STATISTIC(RelocationBytes, "Total size of relocation sections");
+STATISTIC(DynsymBytes, "Total size of SHT_DYNSYM sections");
+STATISTIC(DebugBytes, "Total size of debug info sections");
+STATISTIC(UnwindBytes, "Total size of unwind sections");
+STATISTIC(OtherBytes, "Total size of uncategorized sections");
+
+} // namespace stats
 
 struct ELFWriter;
 
@@ -951,6 +965,38 @@ void ELFWriter::writeSectionHeader(const MCAssembler &Asm) {
     else
       Size = Offsets.second - Offsets.first;
 
+    auto SectionHasFlag = [&](uint64_t Flag) -> bool {
+      return Section->getFlags() & Flag;
+    };
+    auto SectionIsType = [&](uint64_t Type) -> bool {
+      return Section->getType() == Type;
+    };
+
+    if (SectionIsType(ELF::SHT_STRTAB)) {
+      stats::StrtabBytes += Size;
+    } else if (SectionIsType(ELF::SHT_SYMTAB)) {
+      stats::SymtabBytes += Size;
+    } else if (SectionIsType(ELF::SHT_DYNSYM)) {
+      stats::DynsymBytes += Size;
+    } else if (SectionIsType(ELF::SHT_REL) || SectionIsType(ELF::SHT_RELA) ||
+               SectionIsType(ELF::SHT_RELR) || SectionIsType(ELF::SHT_CREL)) {
+      stats::RelocationBytes += Size;
+    } else if (SectionIsType(ELF::SHT_X86_64_UNWIND)) {
+      stats::UnwindBytes += Size;
+    } else if (Section->getName().starts_with(".debug")) {
+      stats::DebugBytes += Size;
+    } else if (SectionHasFlag(ELF::SHF_ALLOC)) {
+      if (SectionHasFlag(ELF::SHF_EXECINSTR)) {
+        stats::AllocTextBytes += Size;
+      } else if (SectionHasFlag(ELF::SHF_WRITE)) {
+        stats::AllocRWBytes += Size;
+      } else {
+        stats::AllocROBytes += Size;
+      }
+    } else {
+      stats::OtherBytes += Size;
+    }
+
     writeSection(GroupSymbolIndex, Offsets.first, Size, *Section);
   }
 }
diff --git a/llvm/test/CodeGen/X86/section-stats.ll b/llvm/test/CodeGen/X86/section-stats.ll
new file mode 100644
index 0000000000000..26d9cf2d60676
--- /dev/null
+++ b/llvm/test/CodeGen/X86/section-stats.ll
@@ -0,0 +1,7 @@
+; RUN: llc -o /dev/null -filetype=obj -stats %s 2>&1 | FileCheck %s
+
+; CHECK: {{[0-9+]}} elf-object-writer - Total size of SHF_ALLOC text sections
+
+define void @f() {
+    ret void
+}

@rnk rnk left a comment

Neat. Can you pick some LLVM object file of interest, say X86ISelLowering.cpp.o, and paste the stats output for that file, just to get a sense of what the data looks like?

I think this looks good, but I'd like a second reviewer opinion.

aeubanks commented Aug 8, 2024

added numbers in the description
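
To get a sense of what the data looks like, the figures added to the description can be turned into per-category shares. The following is only a minimal sketch: `exampleStats`, `totalBytes`, and `sharePercent` are hypothetical helper names that are not part of LLVM, and the numbers are copied from the X86ISelLowering.cpp (-g1) example in the description.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Figures copied from the PR description (X86ISelLowering.cpp, -g1).
// The "elf-object-writer." prefix is dropped for brevity.
inline std::map<std::string, uint64_t> exampleStats() {
  return {
      {"AllocROBytes", 308268},     {"AllocRWBytes", 6240},
      {"AllocTextBytes", 1659203},  {"DebugBytes", 3180386},
      {"OtherBytes", 5862},         {"RelocationBytes", 2623440},
      {"StrtabBytes", 228599},      {"SymtabBytes", 120336},
      {"UnwindBytes", 85216},
  };
}

// Hypothetical helper: sum of all reported categories, in bytes.
inline uint64_t totalBytes(const std::map<std::string, uint64_t> &Stats) {
  uint64_t Total = 0;
  for (const auto &Entry : Stats)
    Total += Entry.second;
  return Total;
}

// Hypothetical helper: share of one category, in percent of the sum of
// all reported categories. Returns 0 for unknown names or an empty map.
inline double sharePercent(const std::map<std::string, uint64_t> &Stats,
                           const std::string &Name) {
  const uint64_t Total = totalBytes(Stats);
  const auto It = Stats.find(Name);
  if (It == Stats.end() || Total == 0)
    return 0.0;
  return 100.0 * static_cast<double>(It->second) / static_cast<double>(Total);
}
```

With these figures, debug info and relocation sections together account for roughly 70% of the bytes counted across the reported categories.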

@zmodem zmodem left a comment

lgtm

      stats::UnwindBytes += Size;
    } else if (Section->getName().starts_with(".debug")) {
      stats::DebugBytes += Size;
    } else if (SectionHasFlag(ELF::SHF_ALLOC)) {

@MaskRay MaskRay Aug 8, 2024

Perhaps check SHF_ALLOC first, then .debug, then use a switch on getType()? Then, SectionIsType could be removed.

@aeubanks aeubanks Aug 8, 2024

Done. I moved the .eh_frame check first, since .eh_frame is also SHF_ALLOC.
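
The ordering discussed in this thread (the .eh_frame name check first, then SHF_ALLOC, then the .debug prefix, then a switch on the section type) could be sketched as follows. This is an illustration under stated assumptions, not the code that was merged: `Category` and `classifySection` are hypothetical names, the constants are local stand-ins carrying the values from the ELF specification rather than the `llvm::ELF` definitions, and SHT_CREL is left out for brevity.

```cpp
#include <cstdint>
#include <string_view>

// Local stand-ins for the ELF constants (values from the ELF specification).
constexpr uint32_t kShtSymtab = 2, kShtStrtab = 3, kShtRela = 4, kShtRel = 9,
                   kShtDynsym = 11;
constexpr uint64_t kShfWrite = 0x1, kShfAlloc = 0x2, kShfExecInstr = 0x4;

// Hypothetical enum mirroring the statistics declared in the patch.
enum class Category {
  Unwind, AllocText, AllocRW, AllocRO, Debug,
  Strtab, Symtab, Dynsym, Relocation, Other
};

// Hypothetical helper: pick the statistic a section would be counted under.
inline Category classifySection(std::string_view Name, uint32_t Type,
                                uint64_t Flags) {
  // .eh_frame is itself SHF_ALLOC, so it has to be matched before the
  // generic SHF_ALLOC branch; a name comparison also works on non-x86-64
  // targets.
  if (Name == ".eh_frame")
    return Category::Unwind;

  if (Flags & kShfAlloc) {
    if (Flags & kShfExecInstr)
      return Category::AllocText;
    if (Flags & kShfWrite)
      return Category::AllocRW;
    return Category::AllocRO;
  }

  // Prefix match, equivalent to starts_with(".debug").
  if (Name.substr(0, 6) == ".debug")
    return Category::Debug;

  // Remaining non-alloc sections are distinguished by type alone.
  switch (Type) {
  case kShtStrtab:
    return Category::Strtab;
  case kShtSymtab:
    return Category::Symtab;
  case kShtDynsym:
    return Category::Dynsym;
  case kShtRel:
  case kShtRela:
    return Category::Relocation;
  default:
    return Category::Other;
  }
}
```

Checking the flags before the type keeps the switch limited to non-alloc sections, which is why the separate `SectionIsType` lambda is no longer needed in this shape.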

Member

== ".eh_frame"

The design makes it monolithic.

    } else if (SectionIsType(ELF::SHT_DYNSYM)) {
      stats::DynsymBytes += Size;
    } else if (SectionIsType(ELF::SHT_REL) || SectionIsType(ELF::SHT_RELA) ||
               SectionIsType(ELF::SHT_RELR) || SectionIsType(ELF::SHT_CREL)) {
Member

The object writer doesn't generate SHT_RELR.

Contributor Author

done

    } else if (SectionIsType(ELF::SHT_REL) || SectionIsType(ELF::SHT_RELA) ||
               SectionIsType(ELF::SHT_RELR) || SectionIsType(ELF::SHT_CREL)) {
      stats::RelocationBytes += Size;
    } else if (SectionIsType(ELF::SHT_X86_64_UNWIND)) {
Member

Non-x86-64 architectures don't use SHT_X86_64_UNWIND. Perhaps just do a name comparison with .eh_frame.

Contributor Author

done

; REQUIRES: asserts
; RUN: llc -o /dev/null -filetype=obj -stats %s 2>&1 | FileCheck %s

; CHECK: {{[0-9+]}} elf-object-writer - Total size of SHF_ALLOC text sections
Member

[[#]]. I think it's fine to test the exact number to ensure that we don't regress.

You can add a variable and test SHF_WRITE as well.

Contributor Author

done

@aeubanks aeubanks merged commit 1baa6f7 into llvm:main Aug 8, 2024
4 of 6 checks passed
@aeubanks aeubanks deleted the section-stats branch August 8, 2024 21:14
aeubanks added a commit that referenced this pull request Sep 23, 2024

Followup to #102363. This makes the `elf-object-writer.*Bytes` stats sum
up to `assembler.ObjectBytes`.
Labels
backend:X86 llvm:mc Machine (object) code
5 participants