A crash related to ElfParser::loadSymbolTable #191

Closed
yanglong1010 opened this issue Mar 6, 2025 · 4 comments
@yanglong1010
Contributor

Hi,

I encountered a crash today, and after some investigation I think I have found the cause.

I ran java-profiler using the command below (any Java code can reproduce the crash). Running with the Datadog Java agent should trigger it too, though I have not tested that.

/usr/lib/jvm/java-8-openjdk-amd64/bin/java -agentpath:/root/java-profiler/ddprof-lib/build/lib/main/release/linux/x64/libjavaProfiler.so=start,cpu=10ms,file=/tmp/ap.jfr -cp java Demo

openjdk version "1.8.0_442"
OpenJDK Runtime Environment (build 1.8.0_442-8u442-b06~us1-0ubuntu1~24.04-b06)
OpenJDK 64-Bit Server VM (build 25.442-b06, mixed mode)
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007ffff7c4527e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007ffff7c288ff in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007ffff6df6f0b in os::abort(bool) [clone .cold] () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#6  0x00007ffff7757a6d in VMError::report_and_die() () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#7  0x00007ffff759d0fd in JVM_handle_linux_signal () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#8  0x00007ffff759024c in signalHandler(int, siginfo_t*, void*) () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#9  <signal handler called>
#10 0x0000000000000000 in ?? ()
#11 0x00007ffff6b64e67 in J9Ext::GetOSThreadID (thread=0x7ffff027b930) at /root/java-profiler/ddprof-lib/src/main/cpp/j9Ext.h:97
#12 VMThread::nativeThreadId (jni=jni@entry=0x7ffff026e260, thread=thread@entry=0x7ffff027b930) at /root/java-profiler/ddprof-lib/src/main/cpp/vmStructs.cpp:713
#13 0x00007ffff6b3a811 in Profiler::updateThreadName (this=this@entry=0x7ffff0005180, jvmti=jvmti@entry=0x7ffff0019930, jni=jni@entry=0x7ffff026e260, thread=thread@entry=0x7ffff027b930, self=self@entry=true) at /root/java-profiler/ddprof-lib/src/main/cpp/profiler.cpp:935
#14 0x00007ffff6b3a92e in Profiler::onThreadStart (this=0x7ffff0005180, jvmti=0x7ffff0019930, jni=0x7ffff026e260, thread=0x7ffff027b930) at /root/java-profiler/ddprof-lib/src/main/cpp/profiler.cpp:111
#15 0x00007ffff73e522e in JvmtiExport::post_thread_start(JavaThread*) () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#16 0x00007ffff732c0d8 in JNI_CreateJavaVM () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#17 0x00007ffff7f8b45a in JavaMain () from /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/../lib/amd64/jli/libjli.so
#18 0x00007ffff7f8f961 in call_continuation () from /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/../lib/amd64/jli/libjli.so
#19 0x00007ffff7c9caa4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#20 0x00007ffff7d29c3c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

According to the stack trace, something must be wrong with the VMStructs parsing: J9Ext should not be called on OpenJDK 8.

int VMThread::nativeThreadId(JNIEnv *jni, jthread thread) {
  if (_has_native_thread_id) {
    VMThread *vm_thread = fromJavaThread(jni, thread);
    return vm_thread != NULL ? vm_thread->osThreadId() : -1;
  }
  return J9Ext::GetOSThreadID(thread);
}

Debugging with a debug build of java-profiler, I found that the symbol gHotSpotVMStructs cannot be found, so VMStructs::initOffsets returns early at vmStructs.cpp:164.

void VMStructs::initOffsets() {
  uintptr_t entry = readSymbol("gHotSpotVMStructs");
  uintptr_t stride = readSymbol("gHotSpotVMStructEntryArrayStride");
  uintptr_t type_offset = readSymbol("gHotSpotVMStructEntryTypeNameOffset");
  uintptr_t field_offset = readSymbol("gHotSpotVMStructEntryFieldNameOffset");
  uintptr_t offset_offset = readSymbol("gHotSpotVMStructEntryOffsetOffset");
  uintptr_t address_offset = readSymbol("gHotSpotVMStructEntryAddressOffset");
  if (entry == 0 || stride == 0) {
    return;
  }
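
To make the chain from the missing symbol to the J9 call explicit, here is a minimal model of it. This is a sketch, not the profiler's code: `SymbolMap`, `VMStructsModel`, `ThreadIdPath`, and `nativeThreadIdPath` are hypothetical names, `readSymbol` is reduced to a map lookup, and the model assumes the native-thread-id flag is only set after `initOffsets` gets past the early return. Frame #10 at address 0x0 in the first backtrace is consistent with the J9 fallback jumping through a function pointer that is null on HotSpot, but that part is my reading of the trace.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the symbols the profiler managed to load.
using SymbolMap = std::map<std::string, uintptr_t>;

// Simplified lookup: returns 0 when the symbol was never added.
uintptr_t readSymbol(const SymbolMap& symbols, const std::string& name) {
  auto it = symbols.find(name);
  return it == symbols.end() ? 0 : it->second;
}

struct VMStructsModel {
  bool has_native_thread_id = false;

  void initOffsets(const SymbolMap& symbols) {
    uintptr_t entry = readSymbol(symbols, "gHotSpotVMStructs");
    uintptr_t stride = readSymbol(symbols, "gHotSpotVMStructEntryArrayStride");
    if (entry == 0 || stride == 0) {
      return;  // early return: the flag below is never set
    }
    // Assumption: the real code derives this from the VMStructs table.
    has_native_thread_id = true;
  }
};

enum class ThreadIdPath { HotSpot, J9Fallback };

// Mirrors the branch in VMThread::nativeThreadId quoted above.
ThreadIdPath nativeThreadIdPath(const VMStructsModel& vm) {
  return vm.has_native_thread_id ? ThreadIdPath::HotSpot
                                 : ThreadIdPath::J9Fallback;
}
```

With an empty `SymbolMap`, `nativeThreadIdPath` returns `ThreadIdPath::J9Fallback`, which matches the frame order in the crash backtrace.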

After further debugging, I found that some symbols are skipped at symbols_linux.cpp:357.

if (_length == 0 || (sym->st_name < _length && sym->st_value < _length)) {

void ElfParser::loadSymbolTable(const char *symbols, size_t total_size,
                                size_t ent_size, const char *strings) {
  for (const char *symbols_end = symbols + total_size; symbols < symbols_end;
       symbols += ent_size) {
    ElfSymbol *sym = (ElfSymbol *)symbols;
    if (sym->st_name != 0 && sym->st_value != 0) {
      // sanity check the offsets not to exceed the file size
      if (_length == 0 || (sym->st_name < _length && sym->st_value < _length)) {
        // Skip special AArch64 mapping symbols: $x and $d
        if (sym->st_size != 0 || sym->st_info != 0 ||
            strings[sym->st_name] != '$') {
          _cc->add(_base + sym->st_value, (int)sym->st_size,
                   strings + sym->st_name);
        }
      }
    }
  }
}

In my case, all symbols are stripped from libjvm.so and stored in a separate debug file, which can be installed via apt-get install openjdk-8-dbg.
However, symbols_linux.cpp:357 compares the virtual address offset (sym->st_value, 0xdd82b8 = 14516920) with the debug file size (2675232). Since 14516920 is greater than 2675232, the symbol gHotSpotVMStructs is skipped.

I think the virtual address offset (sym->st_value) should not be compared with the debug file size (_length).
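
The effect of the check can be reproduced in isolation with the numbers reported in this comment. The sketch below is only an illustration: `SymFields`, `passesCurrentCheck`, and `passesNameOnlyCheck` are hypothetical names, and the name-only variant is shown for contrast, not as the fix proposed here (the proposal is to remove the check).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Minimal stand-in for the two ELF symbol fields the check looks at.
struct SymFields {
  uint32_t st_name;   // offset into the string table
  uint64_t st_value;  // virtual address offset of the symbol
};

// The condition as quoted from symbols_linux.cpp:357.
bool passesCurrentCheck(const SymFields& sym, size_t length) {
  return length == 0 || (sym.st_name < length && sym.st_value < length);
}

// Hypothetical variant that bounds only the string-table offset.
bool passesNameOnlyCheck(const SymFields& sym, size_t length) {
  return length == 0 || sym.st_name < length;
}

// gHotSpotVMStructs as seen in gdb: st_name = 1685507, st_value = 0xdd82b8.
const SymFields kVMStructsSym{1685507, 14516920};
// Size of the separate .debug file, i.e. the value of _length.
const size_t kDebugFileSize = 2675232;
```

Here `passesCurrentCheck(kVMStructsSym, kDebugFileSize)` is false only because of the st_value term; the st_name term alone would let the symbol through.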

file /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=734da9c23b83138419928d48e1cc65c9b75facc3, stripped

ls -rtl /usr/lib/debug/.build-id/73/4da9c23b83138419928d48e1cc65c9b75facc3.debug
-rw-r--r-- 1 root root 2675232 Jan 26 15:38 /usr/lib/debug/.build-id/73/4da9c23b83138419928d48e1cc65c9b75facc3.debug

readelf -s -W /usr/lib/debug/.build-id/73/4da9c23b83138419928d48e1cc65c9b75facc3.debug|grep gHotSpotVMStructs
 41073: 0000000000dd82b8     8 OBJECT  GLOBAL DEFAULT   25 gHotSpotVMStructs

(gdb) p _length
$1 = 2675232

(gdb) p sym->st_name
$3 = 1685507

(gdb) p sym->st_value
$4 = 14516920

(gdb) p/x sym->st_value
$5 = 0xdd82b8

#0  ElfParser::loadSymbolTable (this=0x7ffff7bfd550, symbols=0x7ffff4af0d20 "\003\270\031", total_size=986328, ent_size=24, strings=0x7ffff4af0f60 "") at /root/java-profiler/ddprof-lib/src/main/cpp/symbols_linux.cpp:366
#1  0x00007ffff7a9be3d in ElfParser::loadSymbols (this=0x7ffff7bfd550, use_debug=false) at /root/java-profiler/ddprof-lib/src/main/cpp/symbols_linux.cpp:258
#2  0x00007ffff7a9b4f8 in ElfParser::parseFile (cc=0x7ffff00d7250, base=0x7ffff6c00000 "\177ELF\002\001\001", file_name=0x7ffff7bfd5f0 "/usr/lib/debug/.build-id/73/4da9c23b83138419928d48e1cc65c9b75facc3.debug", use_debug=false)
    at /root/java-profiler/ddprof-lib/src/main/cpp/symbols_linux.cpp:85
#3  0x00007ffff7a9c125 in ElfParser::loadSymbolsUsingBuildId (this=0x7ffff7bfe6b0) at /root/java-profiler/ddprof-lib/src/main/cpp/symbols_linux.cpp:304
#4  0x00007ffff7a9be65 in ElfParser::loadSymbols (this=0x7ffff7bfe6b0, use_debug=true) at /root/java-profiler/ddprof-lib/src/main/cpp/symbols_linux.cpp:263
#5  0x00007ffff7a9b4f8 in ElfParser::parseFile (cc=0x7ffff00d7250, base=0x7ffff6c00000 "\177ELF\002\001\001", file_name=0x7ffff0028349 "/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so", use_debug=true)
    at /root/java-profiler/ddprof-lib/src/main/cpp/symbols_linux.cpp:85
#6  0x00007ffff7a9cee4 in parseLibrariesCallback (info=0x7ffff7bfe830, size=64, data=0x7ffff7af9e20 <Libraries::instance()::instance>) at /root/java-profiler/ddprof-lib/src/main/cpp/symbols_linux.cpp:508
#7  0x00007ffff7d84002 in __GI___dl_iterate_phdr (callback=0x7ffff7a9cb6b <parseLibrariesCallback(dl_phdr_info*, size_t, void*)>, data=0x7ffff7af9e20 <Libraries::instance()::instance>) at ./elf/dl-iteratephdr.c:74
#8  0x00007ffff7a9d172 in Symbols::parseLibraries (array=0x7ffff7af9e20 <Libraries::instance()::instance>, kernel_symbols=false) at /root/java-profiler/ddprof-lib/src/main/cpp/symbols_linux.cpp:550
#9  0x00007ffff7abdaa5 in Libraries::updateSymbols (this=0x7ffff7af9e20 <Libraries::instance()::instance>, kernel_symbols=false) at /root/java-profiler/ddprof-lib/src/main/cpp/libraries.cpp:36
#10 0x00007ffff7a990cc in VM::initShared (vm=0x7ffff79d16c0 <main_vm>) at /root/java-profiler/ddprof-lib/src/main/cpp/vmEntry.cpp:205
#11 0x00007ffff7a99593 in VM::initProfilerBridge (vm=0x7ffff79d16c0 <main_vm>, attach=false) at /root/java-profiler/ddprof-lib/src/main/cpp/vmEntry.cpp:302
#12 0x00007ffff7a9a282 in Agent_OnLoad (vm=0x7ffff79d16c0 <main_vm>, options=0x7ffff0003f40 "start,cpu=10ms,file=/tmp/ap.jfr", reserved=0x0) at /root/java-profiler/ddprof-lib/src/main/cpp/vmEntry.cpp:551
#13 0x00007ffff76f6674 in Threads::create_vm_init_agents() () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#14 0x00007ffff76f92e2 in Threads::create_vm(JavaVMInitArgs*, bool*) () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#15 0x00007ffff732c010 in JNI_CreateJavaVM () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#16 0x00007ffff7f8b45a in JavaMain () from /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/../lib/amd64/jli/libjli.so
#17 0x00007ffff7f8f961 in call_continuation () from /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/../lib/amd64/jli/libjli.so
#18 0x00007ffff7c9caa4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#19 0x00007ffff7d29c3c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

The test does not crash on Java 11, Java 17, or Java 21, but I think that is just a coincidence: the virtual address offset happens to be smaller than the debug file size on those JDKs.

I don't know why this check was added. If there is no real case that needs it, the simple fix is to remove the check, and I can submit a PR.

Thanks.

@jbachorik
Collaborator

Hi @yanglong1010 - great report! Thanks!

I see the change was done as a part of #101, probably as a way to placate asan?
For my part, it should be ok to revert the check (the mainline async-profiler seems to be working fine without it) but /cc @r1viollet

@yanglong1010
Contributor Author

Additional description:

My environment is

cat /etc/os-release

PRETTY_NAME="Ubuntu 24.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.1 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo

The libjvm.so of the default JDK 8 installation (apt install openjdk-8-jdk) has no .symtab (i.e. it is stripped).
If the separate debug info package is not installed, the crash does not happen, because java-profiler/async-profiler reads symbols like gHotSpotVMStructs from the DT_SYMTAB of libjvm.so.

if (!_cc->hasDebugSymbols()) {
  loadSymbolTable(symtab, syment * nsyms, syment, strtab);
}

This crash happens if the separate debug info package is installed (apt install openjdk-8-dbg).
After the debug file is parsed (and important symbols such as gHotSpotVMStructs are skipped), the library is marked as containing debug symbols, and the subsequent dynamic symbol resolution is skipped as well.

_cc->setDebugSymbols(true);

if (!_cc->hasDebugSymbols()) {
  loadSymbolTable(symtab, syment * nsyms, syment, strtab);
}
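
The interaction between the two steps can be modeled in a few lines. This is a rough sketch under stated assumptions: `CodeCacheModel`, `Sym`, and `resolvesVMStructs` are hypothetical, the loader is reduced to the st_value comparison, and a length of 0 stands for the dynamic-symbol path, which is assumed here not to be size-filtered.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

struct Sym {
  std::string name;
  uint64_t value;  // plays the role of st_value
};

// Hypothetical, heavily simplified stand-in for the per-library symbol cache.
class CodeCacheModel {
  std::set<std::string> _names;
  bool _debug_symbols = false;

 public:
  void add(const std::string& name) { _names.insert(name); }
  bool has(const std::string& name) const { return _names.count(name) != 0; }
  void setDebugSymbols(bool v) { _debug_symbols = v; }
  bool hasDebugSymbols() const { return _debug_symbols; }
};

// Keeps only symbols whose value is below `length`; 0 disables the filter.
void loadSymbolTable(CodeCacheModel& cc, const std::vector<Sym>& table,
                     size_t length) {
  for (const Sym& s : table) {
    if (length == 0 || s.value < length) {
      cc.add(s.name);
    }
  }
}

// Replays the order described above: debug file first, dynamic symbols second.
bool resolvesVMStructs(bool debug_file_installed) {
  const std::vector<Sym> table = {Sym{"gHotSpotVMStructs", 14516920}};
  CodeCacheModel cc;
  if (debug_file_installed) {
    loadSymbolTable(cc, table, 2675232);  // symbol is filtered out here
    cc.setDebugSymbols(true);             // library is still marked as done
  }
  if (!cc.hasDebugSymbols()) {
    loadSymbolTable(cc, table, 0);  // fallback that would have found it
  }
  return cc.has("gHotSpotVMStructs");
}
```

In this model `resolvesVMStructs(false)` succeeds through the fallback, while `resolvesVMStructs(true)` fails because the flag suppresses the only path that would still add the symbol.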

If I use another OpenJDK 8 build such as the one below, which has .symtab in libjvm.so, it does not crash, but that is a coincidence too: the virtual address offset happens to be smaller than the debug file size, just as with the JDK 11, JDK 17, and JDK 21 builds described above.

openjdk version "1.8.0_372"
OpenJDK Runtime Environment (build 1.8.0_372-b07)
OpenJDK 64-Bit Server VM (build 25.372-b07, mixed mode)

@r1viollet
Collaborator

Thanks for the careful report. Agreed on the revert.

@yanglong1010
Contributor Author

Thank you for your confirmation and review.
