-
-
Notifications
You must be signed in to change notification settings - Fork 33.5k
Description
- Version: all versions from 12.5.0 affected, also occurs on 12.7.0
- Platform: tested on macOS (18.7.0 Darwin xnu-4903.270.47~4/RELEASE_X86_64) and Linux
- Subsystem: something low-level
This is one of the weirdest bugs I've ever debugged.
Under certain circumstances, Node process (starting from version 12.5.0, previous versions not affected), will not exit (even if we call process.exit()
at the end of the file), and will start to consume memory until it fills the whole available memory, using 100% CPU while at it.
Here's a reproduction:
https://gist.github.com/niieani/a527e9b2e6caf28e3021a50b938e4a91
Now, what's really funny, there are a few things that make this reproduction not trigger the bug:
- making the strings in the
reproduction.json
shorter makes the process exit normally - removing various lines from the reproduction (comments show where) makes the process exit normally
- replacing
path.resolve
with a call to a different function, e.g. byconsole.log
(note that this originally usedjest-haste-map/build/lib/fast_path
and would still trigger the bug)
Standard memory or CPU profiling doesn't show anything (literally, there are no function calls shown in the profiler). Dumping the heap snapshot results in a 36MB file, even when the memory consumption of the process is 10 GB.
The backstory is that in our case jest
would not end AFTER all the tests have completed, and instead would start consuming huge amounts of memory (on CI, multiple processes from separate PRs would deplete the 128GB of RAM on the machine) so I started going down the rabbit hole to try and pinpoint what is the cause of the issue, which lead me to this minimal reproduction scenario.
I've done a core dump, so here's what I got from lldb
from a slightly larger (earlier version) of my reproduction:
core dump (lldb)
(lldb) bt all
* thread #1, stop reason = signal SIGSTOP
* frame #0: 0x00007fff5bddd86a libsystem_kernel.dylib`__psynch_cvwait + 10
frame #1: 0x00007fff5be9c56e libsystem_pthread.dylib`_pthread_cond_wait + 722
frame #2: 0x0000000100682ae6 node`uv_cond_wait + 9
frame #3: 0x00000001000cbd19 node`node::TaskQueue<v8::Task>::BlockingDrain() + 39
frame #4: 0x00000001000ccf6b node`node::NodePlatform::DrainTasks(v8::Isolate*) + 47
frame #5: 0x00000001000af530 node`node::NodeMainInstance::Run() + 394
frame #6: 0x000000010005a28a node`node::Start(int, char**) + 245
frame #7: 0x00007fff5bca53d5 libdyld.dylib`start + 1
frame #8: 0x00007fff5bca53d5 libdyld.dylib`start + 1
thread #2, stop reason = signal SIGSTOP
frame #0: 0x00007fff5bde078e libsystem_kernel.dylib`kevent + 10
frame #1: 0x00000001006872c2 node`uv__io_poll + 841
frame #2: 0x0000000100677c27 node`uv_run + 320
frame #3: 0x00000001000cd437 node`node::WorkerThreadsTaskRunner::DelayedTaskScheduler::Run() + 305
frame #4: 0x00007fff5be992eb libsystem_pthread.dylib`_pthread_body + 126
frame #5: 0x00007fff5be9c249 libsystem_pthread.dylib`_pthread_start + 66
frame #6: 0x00007fff5be9840d libsystem_pthread.dylib`thread_start + 13
thread #3, stop reason = signal SIGSTOP
frame #0: 0x0000000100b17bc7 node`v8::internal::compiler::TypeNarrowingReducer::Reduce(v8::internal::compiler::Node*) + 377
frame #1: 0x0000000100ad2fe8 node`v8::internal::compiler::(anonymous namespace)::SourcePositionWrapper::Reduce(v8::internal::compiler::Node*) + 60
frame #2: 0x0000000100a349b9 node`v8::internal::compiler::GraphReducer::Reduce(v8::internal::compiler::Node*) + 83
frame #3: 0x0000000100a347af node`v8::internal::compiler::GraphReducer::ReduceTop() + 293
frame #4: 0x0000000100a345bb node`v8::internal::compiler::GraphReducer::ReduceNode(v8::internal::compiler::Node*) + 83
frame #5: 0x0000000100ad4a85 node`v8::internal::compiler::LoadEliminationPhase::Run(v8::internal::compiler::PipelineData*, v8::internal::Zone*) + 569
frame #6: 0x0000000100acc567 node`void v8::internal::compiler::PipelineImpl::Run<v8::internal::compiler::LoadEliminationPhase>() + 121
frame #7: 0x0000000100aca405 node`v8::internal::compiler::PipelineImpl::OptimizeGraph(v8::internal::compiler::Linkage*) + 219
frame #8: 0x0000000100aca2bf node`v8::internal::compiler::PipelineCompilationJob::ExecuteJobImpl() + 381
frame #9: 0x000000010021a580 node`v8::internal::OptimizedCompilationJob::ExecuteJob() + 30
frame #10: 0x00000001002190fd node`v8::internal::OptimizingCompileDispatcher::CompileNext(v8::internal::OptimizedCompilationJob*) + 31
frame #11: 0x0000000100219d95 node`v8::internal::OptimizingCompileDispatcher::CompileTask::RunInternal() + 395
frame #12: 0x00000001000cbb30 node`node::(anonymous namespace)::PlatformWorkerThread(void*) + 301
frame #13: 0x00007fff5be992eb libsystem_pthread.dylib`_pthread_body + 126
frame #14: 0x00007fff5be9c249 libsystem_pthread.dylib`_pthread_start + 66
frame #15: 0x00007fff5be9840d libsystem_pthread.dylib`thread_start + 13
thread #4, stop reason = signal SIGSTOP
frame #0: 0x00007fff5bddd86a libsystem_kernel.dylib`__psynch_cvwait + 10
frame #1: 0x00007fff5be9c56e libsystem_pthread.dylib`_pthread_cond_wait + 722
frame #2: 0x0000000100682ae6 node`uv_cond_wait + 9
frame #3: 0x00000001000cd526 node`node::TaskQueue<v8::Task>::BlockingPop() + 62
frame #4: 0x00000001000cbb56 node`node::(anonymous namespace)::PlatformWorkerThread(void*) + 339
frame #5: 0x00007fff5be992eb libsystem_pthread.dylib`_pthread_body + 126
frame #6: 0x00007fff5be9c249 libsystem_pthread.dylib`_pthread_start + 66
frame #7: 0x00007fff5be9840d libsystem_pthread.dylib`thread_start + 13
thread #5, stop reason = signal SIGSTOP
frame #0: 0x00007fff5bddd86a libsystem_kernel.dylib`__psynch_cvwait + 10
frame #1: 0x00007fff5be9c56e libsystem_pthread.dylib`_pthread_cond_wait + 722
frame #2: 0x0000000100682ae6 node`uv_cond_wait + 9
frame #3: 0x00000001000cd526 node`node::TaskQueue<v8::Task>::BlockingPop() + 62
frame #4: 0x00000001000cbb56 node`node::(anonymous namespace)::PlatformWorkerThread(void*) + 339
frame #5: 0x00007fff5be992eb libsystem_pthread.dylib`_pthread_body + 126
frame #6: 0x00007fff5be9c249 libsystem_pthread.dylib`_pthread_start + 66
frame #7: 0x00007fff5be9840d libsystem_pthread.dylib`thread_start + 13
thread #6, stop reason = signal SIGSTOP
frame #0: 0x00007fff5bddd86a libsystem_kernel.dylib`__psynch_cvwait + 10
frame #1: 0x00007fff5be9c56e libsystem_pthread.dylib`_pthread_cond_wait + 722
frame #2: 0x0000000100682ae6 node`uv_cond_wait + 9
frame #3: 0x00000001000cd526 node`node::TaskQueue<v8::Task>::BlockingPop() + 62
frame #4: 0x00000001000cbb56 node`node::(anonymous namespace)::PlatformWorkerThread(void*) + 339
frame #5: 0x00007fff5be992eb libsystem_pthread.dylib`_pthread_body + 126
frame #6: 0x00007fff5be9c249 libsystem_pthread.dylib`_pthread_start + 66
frame #7: 0x00007fff5be9840d libsystem_pthread.dylib`thread_start + 13
thread #7, stop reason = signal SIGSTOP
frame #0: 0x00007fff5bdda266 libsystem_kernel.dylib`semaphore_wait_trap + 10
frame #1: 0x0000000100682fb5 node`uv_sem_wait + 16
frame #2: 0x000000010010e006 node`node::inspector::(anonymous namespace)::StartIoThreadMain(void*) + 18
frame #3: 0x00007fff5be992eb libsystem_pthread.dylib`_pthread_body + 126
frame #4: 0x00007fff5be9c249 libsystem_pthread.dylib`_pthread_start + 66
frame #5: 0x00007fff5be9840d libsystem_pthread.dylib`thread_start + 13
I couldn't get llnode
to work, so this is all I could get from the above.
The integrated report also wasn't very helpful, but here's the relevant part (report from an even earlier reproduction while running in jest
):
Node Report
{
"javascriptStack": {
"message": "No stack.",
"stack": [
"Unavailable."
]
},
"nativeStack": [
{
"pc": "0x0000000100147402",
"symbol": "report::TriggerNodeReport(v8::Isolate*, node::Environment*, char const*, char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, v8::Local<v8::String>) [/usr/local/bin/node]"
},
{
"pc": "0x000000010014df5c",
"symbol": "report::WriteReport(v8::FunctionCallbackInfo<v8::Value> const&) [/usr/local/bin/node]"
},
{
"pc": "0x00000001001cf90c",
"symbol": "v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) [/usr/local/bin/node]"
},
{
"pc": "0x00000001001cefd6",
"symbol": "v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) [/usr/local/bin/node]"
},
{
"pc": "0x00000001001ce7be",
"symbol": "v8::internal::Builtin_Impl_HandleApiCall(v8::internal::BuiltinArguments, v8::internal::Isolate*) [/usr/local/bin/node]"
},
{
"pc": "0x000000010071ccd9",
"symbol": "Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit [/usr/local/bin/node]"
}
],
"javascriptHeap": {
"totalMemory": 43360256,
"totalCommittedMemory": 41433760,
"usedMemory": 38031208,
"availableMemory": 2160634808,
"memoryLimit": 2197815296,
"heapSpaces": {
"read_only_space": {
"memorySize": 262144,
"committedMemory": 32272,
"capacity": 261832,
"used": 31960,
"available": 229872
},
"new_space": {
"memorySize": 1048576,
"committedMemory": 4520,
"capacity": 1047328,
"used": 3272,
"available": 1044056
},
"old_space": {
"memorySize": 35008512,
"committedMemory": 34752088,
"capacity": 34943712,
"used": 32316584,
"available": 2627128
},
"code_space": {
"memorySize": 1212416,
"committedMemory": 1072384,
"capacity": 1150976,
"used": 909120,
"available": 241856
},
"map_space": {
"memorySize": 2363392,
"committedMemory": 2107280,
"capacity": 2359128,
"used": 1369600,
"available": 989528
},
"large_object_space": {
"memorySize": 3416064,
"committedMemory": 3416064,
"capacity": 3397216,
"used": 3397216,
"available": 0
},
"code_large_object_space": {
"memorySize": 49152,
"committedMemory": 49152,
"capacity": 3456,
"used": 3456,
"available": 0
},
"new_large_object_space": {
"memorySize": 0,
"committedMemory": 0,
"capacity": 1047328,
"used": 0,
"available": 1047328
}
}
},
"resourceUsage": {
"userCpuSeconds": 17.6539,
"kernelCpuSeconds": 1.12431,
"cpuConsumptionPercent": 98.8324,
"maxRss": 1796776919040,
"pageFaults": {
"IORequired": 2,
"IONotRequired": 474408
},
"fsActivity": {
"reads": 0,
"writes": 0
}
},
"libuv": [
{
"type": "async",
"is_active": true,
"is_referenced": false,
"address": "0x0000000104102f70"
},
{
"type": "timer",
"is_active": true,
"is_referenced": true,
"address": "0x0000000105843328",
"repeat": 0,
"firesInMsFromNow": 616,
"expired": false
},
{
"type": "check",
"is_active": true,
"is_referenced": false,
"address": "0x00000001058433c0"
},
{
"type": "idle",
"is_active": false,
"is_referenced": true,
"address": "0x0000000105843438"
},
{
"type": "prepare",
"is_active": false,
"is_referenced": false,
"address": "0x00000001058434b0"
},
{
"type": "check",
"is_active": false,
"is_referenced": false,
"address": "0x0000000105843528"
},
{
"type": "async",
"is_active": true,
"is_referenced": false,
"address": "0x000000010422b840"
},
{
"type": "async",
"is_active": true,
"is_referenced": false,
"address": "0x0000000101128958"
},
{
"type": "signal",
"is_active": true,
"is_referenced": false,
"address": "0x0000000103f05970",
"signum": 31,
"signal": "SIGUSR2"
},
{
"type": "tty",
"is_active": false,
"is_referenced": true,
"address": "0x0000000104231848",
"width": 178,
"height": 35,
"fd": 20,
"writeQueueSize": 0,
"readable": true,
"writable": true
},
{
"type": "signal",
"is_active": true,
"is_referenced": false,
"address": "0x0000000104303240",
"signum": 28,
"signal": "SIGWINCH"
},
{
"type": "tty",
"is_active": false,
"is_referenced": true,
"address": "0x0000000104305568",
"width": 178,
"height": 35,
"fd": 22,
"writeQueueSize": 0,
"readable": true,
"writable": true
},
{
"type": "loop",
"is_active": true,
"address": "0x000000010112e690"
}
]
}