-
Notifications
You must be signed in to change notification settings - Fork 901
rcache/base: do not free memory with the vma lock held #3013
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit makes the vma tree garbage collection list a lifo. This way we can avoid having to hold any lock when releasing vmas. In theory this should finally fix the hold-and-wait deadlock detailed in open-mpi#1654. Signed-off-by: Nathan Hjelm <[email protected]>
@bosilca Please test. |
We are not yet there. Here is a bt to prove it: Thread 9 (Thread 0x7fffdb522700 (LWP 63715)):
Thread 6 (Thread 0x7fffd9d1f700 (LWP 63718)):
There seems to be one left instance of allocating the memory with the VMA lock help. |
@bosilca Ok, so now we have no free's while the lock is held. Now I need to figure out how to get to the point that there are no mallocs.... Will take some thought. A free list would reduce the probability but not eliminate it. And insert could end up allocating any number of vma structures... Should have a workaround soon. |
@hjelmn why not simply releasing the lock before mca_rcache_base_vma_new. Yes there will be extra operations to reacquire the lock after, but at least the second time you will not need to allocate any memory. |
@bosilca The problem is mca_rcache_base_vma_new inserts the vma into the rb tree. This in turn also allocates memory (&*$#@) and the red-black insert is not thread safe. |
I think I have a way to fix this. Working out the solution now. |
I think my fix will work. Will update this PR tomorrow morning. It will not be the cleanest solution but it will work until we can come up a better solution. |
I think the quickest way around this bug is to disable the madvise hooks. From what I can tell glibc's free does not call munmap while holding any locks but it does call madvise while holding a lock. |
Despite the lack of interest from the community this is a gigantic issue, that is now spreading across multiple version of OMPI. To be clear it is not about the thread-safety of OMPI, it is about jeopardizing any multi-threaded applications as soon as our memory interception is on. |
@bosilca What do you think of removing the madvise hook as a workaround until we can handle the deadlock in a better way? You can try it by modifying opal/mca/memory/patcher/memory_patcher_module.c and just #if 0 out where we install the madvise hook. |
Sounds reasonable. We will be running a test to see if this solves the issue. |
@jsquyres Yes. |
Changed the milestone to v2.1.0, because that's the next release / the one we're focusing on right now. Is this a mallopocalypse? Do we need to block v2.1.0 for it? If so, what's the timeframe for a fix? |
It is looking pretty bad indeed. We are testing @hjelmn solution to see if it fixes the problem. |
(BTW, Travis lies: if you click through, you can see that it finished successfully -- not sure why it says that it's still in progress) |
Per discussion on the webex today, we agreed that @hjelmn would both a) merge this PR and b) make another trivial PR that comments out the madvise hook (in anticipation of your testing being successful). Looks like your test finished before @hjelmn made the next PR, but either way -- it's all good. Thanks! |
@bosilca I will create the PRs now. |
This commit makes the vma tree garbage collection list a lifo. This
way we can avoid having to hold any lock when releasing vmas. In
theory this should finally fix the hold-and-wait deadlock detailed
in #1654.
Signed-off-by: Nathan Hjelm [email protected]