Commit d0f9fad

Tom Yang committed
parallelize POSIX dyld RefreshModules
This diff parallelizes `DynamicLoaderPOSIXDYLD::RefreshModules`, which speeds up module loading on Linux. The major benefit is faster symbol table indexing and parsing, which is the biggest bottleneck for targets that dynamically link many shared libraries.

The speedup is only noticeable when **preloading** symbols, that is, when `target.preload-symbols` is `true`, which is the default at Meta. The symbol preload option tells the debugger to fully load all symbol tables when modules are loaded, as opposed to loading them lazily when symbols are requested.

I initially discovered this bottleneck with the Linux `perf` tool. About 93% of samples were in `RefreshModules`, mainly in `LoadModuleAtAddress` and `PreloadSymbols`.

`LoadModuleAtAddress` appears independent and parallelizable at first. The main issue is that `DynamicLoaderPOSIXDYLD` maintains a map of loaded modules to their link addresses in `m_loaded_modules`. Concurrently modifying and reading this map isn't thread-safe, so this diff also adds accessor methods that protect the map in the multithreaded context. Luckily, the critical section for modifying or reading the map isn't costly, so the contention doesn't appear to hurt performance.

I tested with some larger projects with up to 15000 modules and found significant performance improvements. Typically, I saw 2-3X launch speed increases, where "launch speed" measures starting the binary and reaching `main`.

I manually ran `ninja check-lldb` several times and compared with the baseline. At this point, we're not seeing any new failures or new unresolved tests.
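
The accessor pattern described in the message can be sketched in isolation. This is a minimal illustration, not the LLDB code: `LoadedModuleTable` is a hypothetical class, and it is keyed by a module name string instead of `lldb::ModuleWP` so that it compiles on its own. Writers take an exclusive `std::unique_lock`, while readers take a `std::shared_lock` and can run concurrently with each other.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <string>

// Hypothetical stand-in for the loader's module map. The real map is keyed
// by lldb::ModuleWP; a string key is an assumption made for this sketch.
class LoadedModuleTable {
public:
  // Writers take an exclusive lock, so only one thread mutates the map.
  void Set(const std::string &name, std::uint64_t link_map_addr) {
    std::unique_lock<std::shared_mutex> lock(m_mutex);
    m_map[name] = link_map_addr;
  }

  void Erase(const std::string &name) {
    std::unique_lock<std::shared_mutex> lock(m_mutex);
    m_map.erase(name);
  }

  // Readers take a shared lock, so lookups do not block each other.
  std::optional<std::uint64_t> Get(const std::string &name) const {
    std::shared_lock<std::shared_mutex> lock(m_mutex);
    auto it = m_map.find(name);
    if (it != m_map.end())
      return it->second;
    return std::nullopt;
  }

private:
  mutable std::shared_mutex m_mutex;
  std::map<std::string, std::uint64_t> m_map;
};

// Small single-threaded usage check for the sketch above.
inline void SharedLockExample() {
  LoadedModuleTable table;
  table.Set("libfoo.so", 0x1000);
  assert(table.Get("libfoo.so").value() == 0x1000);
  table.Erase("libfoo.so");
  assert(!table.Get("libfoo.so").has_value());
}
```

Returning `std::optional` by value means callers never hold an iterator into the map after the lock is released.
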
1 parent fbf25b1 commit d0f9fad

2 files changed: +101 -37 lines changed

lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp

Lines changed: 88 additions & 33 deletions
@@ -10,6 +10,7 @@
 #include "DynamicLoaderPOSIXDYLD.h"
 
 #include "lldb/Breakpoint/BreakpointLocation.h"
+#include "lldb/Core/Debugger.h"
 #include "lldb/Core/Module.h"
 #include "lldb/Core/ModuleSpec.h"
 #include "lldb/Core/PluginManager.h"
@@ -25,6 +26,7 @@
 #include "lldb/Utility/LLDBLog.h"
 #include "lldb/Utility/Log.h"
 #include "lldb/Utility/ProcessInfo.h"
+#include "llvm/Support/ThreadPool.h"
 
 #include <memory>
 #include <optional>
@@ -231,16 +233,37 @@ void DynamicLoaderPOSIXDYLD::DidLaunch() {
 
 Status DynamicLoaderPOSIXDYLD::CanLoadImage() { return Status(); }
 
+void DynamicLoaderPOSIXDYLD::SetLoadedModule(const ModuleSP &module_sp,
+                                             addr_t link_map_addr) {
+  std::unique_lock<std::shared_mutex> lock(m_loaded_modules_rw_mutex);
+  m_loaded_modules[module_sp] = link_map_addr;
+}
+
+void DynamicLoaderPOSIXDYLD::UnloadModule(const ModuleSP &module_sp) {
+  std::unique_lock<std::shared_mutex> lock(m_loaded_modules_rw_mutex);
+  m_loaded_modules.erase(module_sp);
+}
+
+std::optional<lldb::addr_t>
+DynamicLoaderPOSIXDYLD::GetLoadedModuleLinkAddr(const ModuleSP &module_sp) {
+  std::shared_lock<std::shared_mutex> lock(m_loaded_modules_rw_mutex);
+  auto it = m_loaded_modules.find(module_sp);
+  if (it != m_loaded_modules.end())
+    return it->second;
+  return std::nullopt;
+}
+
 void DynamicLoaderPOSIXDYLD::UpdateLoadedSections(ModuleSP module,
                                                   addr_t link_map_addr,
                                                   addr_t base_addr,
                                                   bool base_addr_is_offset) {
-  m_loaded_modules[module] = link_map_addr;
+  SetLoadedModule(module, link_map_addr);
+
   UpdateLoadedSectionsCommon(module, base_addr, base_addr_is_offset);
 }
 
 void DynamicLoaderPOSIXDYLD::UnloadSections(const ModuleSP module) {
-  m_loaded_modules.erase(module);
+  UnloadModule(module);
 
   UnloadSectionsCommon(module);
 }
@@ -448,7 +471,7 @@ void DynamicLoaderPOSIXDYLD::RefreshModules() {
   // The rendezvous class doesn't enumerate the main module, so track that
   // ourselves here.
   ModuleSP executable = GetTargetExecutable();
-  m_loaded_modules[executable] = m_rendezvous.GetLinkMapAddress();
+  SetLoadedModule(executable, m_rendezvous.GetLinkMapAddress());
 
   DYLDRendezvous::iterator I;
   DYLDRendezvous::iterator E;
@@ -470,34 +493,66 @@ void DynamicLoaderPOSIXDYLD::RefreshModules() {
       E = m_rendezvous.end();
       m_initial_modules_added = true;
     }
-    for (; I != E; ++I) {
-      // Don't load a duplicate copy of ld.so if we have already loaded it
-      // earlier in LoadInterpreterModule. If we instead loaded then unloaded it
-      // later, the section information for ld.so would be removed. That
-      // information is required for placing breakpoints on Arm/Thumb systems.
-      if ((m_interpreter_module.lock() != nullptr) &&
-          (I->base_addr == m_interpreter_base))
-        continue;
-
-      ModuleSP module_sp =
-          LoadModuleAtAddress(I->file_spec, I->link_addr, I->base_addr, true);
-      if (!module_sp.get())
-        continue;
-
-      if (module_sp->GetObjectFile()->GetBaseAddress().GetLoadAddress(
-              &m_process->GetTarget()) == m_interpreter_base) {
-        ModuleSP interpreter_sp = m_interpreter_module.lock();
-        if (m_interpreter_module.lock() == nullptr) {
-          m_interpreter_module = module_sp;
-        } else if (module_sp == interpreter_sp) {
-          // Module already loaded.
-          continue;
-        }
-      }
 
-      loaded_modules.AppendIfNeeded(module_sp);
-      new_modules.Append(module_sp);
+    std::mutex interpreter_module_mutex;
+    // We should be able to take SOEntry as reference since the data
+    // exists for the duration of this call in `m_rendezvous`.
+    auto load_module_fn =
+        [this, &loaded_modules, &new_modules,
+         &interpreter_module_mutex](const DYLDRendezvous::SOEntry &so_entry) {
+          // Don't load a duplicate copy of ld.so if we have already loaded it
+          // earlier in LoadInterpreterModule. If we instead loaded then
+          // unloaded it later, the section information for ld.so would be
+          // removed. That information is required for placing breakpoints on
+          // Arm/Thumb systems.
+          {
+            // `m_interpreter_module` may be modified by another thread at the
+            // same time, so we guard the access here.
+            std::lock_guard<std::mutex> lock(interpreter_module_mutex);
+            if ((m_interpreter_module.lock() != nullptr) &&
+                (so_entry.base_addr == m_interpreter_base))
+              return;
+          }
+
+          ModuleSP module_sp = LoadModuleAtAddress(
+              so_entry.file_spec, so_entry.link_addr, so_entry.base_addr, true);
+          if (!module_sp.get())
+            return;
+
+          {
+            // `m_interpreter_module` may be modified by another thread at the
+            // same time, so we guard the access here.
+            std::lock_guard<std::mutex> lock(interpreter_module_mutex);
+            // Set the interpreter module, if this is the interpreter.
+            if (module_sp->GetObjectFile()->GetBaseAddress().GetLoadAddress(
+                    &m_process->GetTarget()) == m_interpreter_base) {
+              ModuleSP interpreter_sp = m_interpreter_module.lock();
+              if (m_interpreter_module.lock() == nullptr) {
+                m_interpreter_module = module_sp;
+              } else if (module_sp == interpreter_sp) {
+                // Module already loaded.
+                return;
+              }
+            }
+          }
+
+          loaded_modules.AppendIfNeeded(module_sp);
+          new_modules.Append(module_sp);
+        };
+
+    // Loading modules in parallel tends to be faster, but is still unstable.
+    // Once it's stable, we can remove this setting and remove the serial
+    // approach.
+    if (GetGlobalPluginProperties().GetParallelModuleLoad()) {
+      llvm::ThreadPoolTaskGroup task_group(Debugger::GetThreadPool());
+      for (; I != E; ++I)
+        task_group.async(load_module_fn, *I);
+      task_group.wait();
+    } else {
+      for (; I != E; ++I)
+        load_module_fn(*I);
     }
+
     m_process->GetTarget().ModulesDidLoad(new_modules);
   }

@@ -683,7 +738,7 @@ void DynamicLoaderPOSIXDYLD::LoadAllCurrentModules() {
   // The rendezvous class doesn't enumerate the main module, so track that
   // ourselves here.
   ModuleSP executable = GetTargetExecutable();
-  m_loaded_modules[executable] = m_rendezvous.GetLinkMapAddress();
+  SetLoadedModule(executable, m_rendezvous.GetLinkMapAddress());
 
   std::vector<FileSpec> module_names;
   for (I = m_rendezvous.begin(), E = m_rendezvous.end(); I != E; ++I)
@@ -775,15 +830,15 @@ DynamicLoaderPOSIXDYLD::GetThreadLocalData(const lldb::ModuleSP module_sp,
                                            const lldb::ThreadSP thread,
                                            lldb::addr_t tls_file_addr) {
   Log *log = GetLog(LLDBLog::DynamicLoader);
-  auto it = m_loaded_modules.find(module_sp);
-  if (it == m_loaded_modules.end()) {
+  std::optional<addr_t> link_map_addr_opt = GetLoadedModuleLinkAddr(module_sp);
+  if (!link_map_addr_opt.has_value()) {
     LLDB_LOGF(
         log, "GetThreadLocalData error: module(%s) not found in loaded modules",
         module_sp->GetObjectName().AsCString());
     return LLDB_INVALID_ADDRESS;
   }
 
-  addr_t link_map = it->second;
+  addr_t link_map = link_map_addr_opt.value();
   if (link_map == LLDB_INVALID_ADDRESS || link_map == 0) {
     LLDB_LOGF(log,
               "GetThreadLocalData error: invalid link map address=0x%" PRIx64,

lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.h

Lines changed: 13 additions & 4 deletions
@@ -95,10 +95,6 @@ class DynamicLoaderPOSIXDYLD : public lldb_private::DynamicLoader {
   /// Contains the pointer to the interpret module, if loaded.
   std::weak_ptr<lldb_private::Module> m_interpreter_module;
 
-  /// Loaded module list. (link map for each module)
-  std::map<lldb::ModuleWP, lldb::addr_t, std::owner_less<lldb::ModuleWP>>
-      m_loaded_modules;
-
   /// Returns true if the process is for a core file.
   bool IsCoreFile() const;
 
@@ -182,6 +178,19 @@ class DynamicLoaderPOSIXDYLD : public lldb_private::DynamicLoader {
   DynamicLoaderPOSIXDYLD(const DynamicLoaderPOSIXDYLD &) = delete;
   const DynamicLoaderPOSIXDYLD &
   operator=(const DynamicLoaderPOSIXDYLD &) = delete;
+
+  /// Loaded module list. (link map for each module)
+  /// This may be accessed in a multi-threaded context. Use the accessor methods
+  /// to access `m_loaded_modules` safely.
+  std::map<lldb::ModuleWP, lldb::addr_t, std::owner_less<lldb::ModuleWP>>
+      m_loaded_modules;
+  std::shared_mutex m_loaded_modules_rw_mutex;
+
+  void SetLoadedModule(const lldb::ModuleSP &module_sp,
+                       lldb::addr_t link_map_addr);
+  void UnloadModule(const lldb::ModuleSP &module_sp);
+  std::optional<lldb::addr_t>
+  GetLoadedModuleLinkAddr(const lldb::ModuleSP &module_sp);
 };
 
 #endif // LLDB_SOURCE_PLUGINS_DYNAMICLOADER_POSIX_DYLD_DYNAMICLOADERPOSIXDYLD_H
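
The map declared in this header is keyed by a weak pointer and ordered with `std::owner_less`, which is why the accessors can take a `ModuleSP` and still find the entry. A small sketch of that behavior, assuming a hypothetical empty `Module` struct and local `ModuleSP`/`ModuleWP` aliases in place of the LLDB types:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>

// Hypothetical stand-ins for the LLDB types; only the shape matters here.
struct Module {};
using ModuleSP = std::shared_ptr<Module>;
using ModuleWP = std::weak_ptr<Module>;

// std::owner_less orders weak pointers by their control block, so keys stay
// comparable even after the module they refer to has been destroyed.
using LinkMapTable =
    std::map<ModuleWP, std::uint64_t, std::owner_less<ModuleWP>>;

// A shared_ptr converts implicitly to the weak_ptr key type for lookups.
inline bool Contains(const LinkMapTable &table, const ModuleSP &module_sp) {
  return table.find(module_sp) != table.end();
}

inline void OwnerLessExample() {
  LinkMapTable table;
  ModuleSP module_sp = std::make_shared<Module>();
  table[module_sp] = 0x10;
  assert(Contains(table, module_sp));
  // The weak key does not add a strong reference to the module.
  assert(module_sp.use_count() == 1);
}
```

Because the key holds no strong reference, the map by itself does not keep a module alive.
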
