Skip to content

Re-apply "[NFCI][LTO][lld] Optimize away symbol copies within LTO global resolution in ELF" #107792

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Sep 9, 2024

Conversation

mingmingl-llvm
Copy link
Contributor

@mingmingl-llvm mingmingl-llvm commented Sep 9, 2024

Fix the use-after-free bug and re-apply #106193

  • Without the fix, the string referenced by objSym.Name could be destroyed even if string saver keeps a copy of the referenced string. This caused use-after-free.
  • The fix (latest commit) updates objSym.Name to reference (via StringRef) the string saver's copy.

Test:

  1. For lld/test/ELF/lto/asmundef.ll, its test failure is reproducible with -DLLVM_USE_SANITIZER=Address and gone with the fix.
  2. Run all tests by following https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild#try-local-changes.

Original commit message

StringMap<T> creates a copy of the string for entry insertions and intentionally keep copies since the implementation optimizes string memory usage. On the other hand, linker keeps copies of symbol names [1] in lld::elf::parseFiles [2] before invoking compileBitcodeFiles [3].

This change proposes to optimize away string copies inside LTO::GlobalResolutions, which will make LTO indexing more memory efficient for ELF. There are similar opportunities for other (COFF, wasm, MachO) formats.

The optimization takes place for lld (ELF) only. For the rest of use cases (gold plugin, llvm-lto2, etc), LTO owns a string saver to keep copies and use global resolution key for de-duplication.

Together with @kazutakahirata's work to make ComputeCrossModuleImport more memory efficient, we see a ~20% peak memory usage reduction in a binary where peak memory usage needs to go down. Thanks to the optimization in 329ba52, the max (as opposed to the sum) of ComputeCrossModuleImport or GlobalResolution shows up in peak memory usage.

  • Regarding correctness, the set of resolved per-module symbols is a subset of llvm::lto::InputFile::Symbols. And bitcode symbol parsing saves symbol name when iterating obj->symbols in BitcodeFile::parse already. This change updates BitcodeFile::parseLazy to keep copies of per-module undefined symbols.
  • Presumably the undefined symbols in a LTO unit (copied in this patch in linker unique saver) is a small set compared with the set of symbols in global-resolution (copied before this patch), making this a worthwhile trade-off. Benchmarking this change alone shows measurable memory savings across various benchmarks.

[1] ELF

sym = symtab.insert(saver().save(objSym.getName()));

[2]
parseFiles(files, armCmseImpLib);

[3]
compileBitcodeFiles<ELFT>(skipLinkedOutput);

@mingmingl-llvm mingmingl-llvm marked this pull request as ready for review September 9, 2024 06:18
for (auto [i, irSym] : llvm::enumerate(obj->symbols())) {
// Keep copies of per-module undefined symbols for LTO::GlobalResolutions
// usage.
irSym.Name = unique_saver().save(irSym.getName());
if (!irSym.isUndefined()) {
// Symbols can be duplicated in bitcode files because of '#include' and
// linkonce_odr. Use unique_saver to save symbol names for de-duplication.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unique_saver is now moved out, so should this comment be moved up too and merged with the "Keep copies of per-module undefined symbols" in some way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Contributor

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with a couple of comment fixes

@@ -138,7 +138,7 @@ class InputFile {

/// The purpose of this class is to only expose the symbol information that an
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/class/struct/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@@ -1797,18 +1799,18 @@ void BitcodeFile::parse() {
void BitcodeFile::parseLazy() {
numSymbols = obj->symbols().size();
symbols = std::make_unique<Symbol *[]>(numSymbols);
for (auto [i, irSym] : llvm::enumerate(obj->symbols()))
for (auto [i, irSym] : llvm::enumerate(obj->symbols())) {
// Keep copies of per-module undefined symbols for LTO::GlobalResolutions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe update the comment a bit since it only talks about undefined symbols but now applies to both. Probably pull up the other comment below on line 1806 and combine them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@mingmingl-llvm mingmingl-llvm merged commit 09b231c into llvm:main Sep 9, 2024
5 of 8 checks passed
mingmingl-llvm added a commit that referenced this pull request Sep 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants