rcache/base: do not free memory with the vma lock held #3013

Merged: 1 commit into open-mpi:master on Mar 7, 2017

Conversation

@hjelmn (Member) commented Feb 22, 2017

This commit makes the vma tree garbage collection list a lifo. This
way we can avoid having to hold any lock when releasing vmas. In
theory this should finally fix the hold-and-wait deadlock detailed
in #1654.

Signed-off-by: Nathan Hjelm <[email protected]>
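As an illustration of the approach (a minimal sketch with hypothetical names, not the actual OMPI implementation): deletions performed while the tree lock is held only push the vma onto a lock-free LIFO, and the actual free() happens later with no lock held.

/* Minimal sketch, assuming hypothetical names (vma_item_t, vma_gc_push,
 * vma_gc_release); not the actual OMPI code. */
#include <stdatomic.h>
#include <stdlib.h>

typedef struct vma_item {
    struct vma_item *next;
    /* ... vma payload ... */
} vma_item_t;

static _Atomic(vma_item_t *) gc_head;

/* Safe to call with the vma tree lock held: no malloc, no free. */
static void vma_gc_push (vma_item_t *item)
{
    vma_item_t *old = atomic_load (&gc_head);
    do {
        item->next = old;
    } while (!atomic_compare_exchange_weak (&gc_head, &old, item));
}

/* Called only after all locks have been released: detach the whole
 * list with one atomic exchange, then free each element. */
static void vma_gc_release (void)
{
    vma_item_t *item = atomic_exchange (&gc_head, NULL);
    while (NULL != item) {
        vma_item_t *next = item->next;
        free (item);
        item = next;
    }
}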
@hjelmn (Member Author) commented Feb 22, 2017

@bosilca Please test.

@bosilca (Member) commented Feb 23, 2017

We are not there yet. Here is a backtrace to prove it:

Thread 9 (Thread 0x7fffdb522700 (LWP 63715)):

#0  0x00007ffff65aa20e in __lll_lock_wait_private () from /lib64/libc.so.6
#1  0x00007ffff652f43b in _L_lock_10288 () from /lib64/libc.so.6
#2  0x00007ffff652cc63 in malloc () from /lib64/libc.so.6
#3  0x00007ffff03e315d in mca_rcache_base_vma_tree_insert () from /opt/ompi/master/gcc/lib/libopen-pal.so.0
#4  0x00007ffff03e1f49 in mca_rcache_base_vma_insert () from /opt/ompi/master/gcc/lib/libopen-pal.so.0
#5  0x00007fffdfdfcfe2 in mca_rcache_grdma_register () from /opt/ompi/master/gcc/lib/openmpi/mca_rcache_grdma.so
#6  0x00007fffdef97289 in mca_btl_openib_register_mem () from /opt/ompi/master/gcc/lib/openmpi/mca_btl_openib.so
#7  0x00007fffde7548d4 in mca_pml_ob1_rdma_btls () from /opt/ompi/master/gcc/lib/openmpi/mca_pml_ob1.so
#8  0x00007fffde751e4c in mca_pml_ob1_isend () from /opt/ompi/master/gcc/lib/openmpi/mca_pml_ob1.so
#9  0x00007ffff72ff87f in PMPI_Isend () from /opt/ompi/master/gcc/lib/libmpi.so.0

Thread 6 (Thread 0x7fffd9d1f700 (LWP 63718)):

#0  0x00007ffff7bcd334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007ffff7bc85f3 in _L_lock_892 () from /lib64/libpthread.so.0
#2  0x00007ffff7bc84d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007ffff03e2591 in mca_rcache_base_vma_tree_iterate () from /opt/ompi/master/gcc/lib/libopen-pal.so.0
#4  0x00007ffff03e3975 in mca_rcache_base_mem_cb () from /opt/ompi/master/gcc/lib/libopen-pal.so.0
#5  0x00007ffff035495a in opal_mem_hooks_release_hook () from /opt/ompi/master/gcc/lib/libopen-pal.so.0
#6  0x00007ffff03dcf0d in intercept_madvise () from /opt/ompi/master/gcc/lib/libopen-pal.so.0
#7  0x00007ffff652ac4b in _int_free () from /lib64/libc.so.6

There seems to be one remaining instance of allocating memory with the vma lock held. Reading the two stacks together: Thread 9 holds the vma lock inside mca_rcache_base_vma_tree_insert() and waits on glibc's arena lock in malloc(), while Thread 6 holds the arena lock inside _int_free() and, via the intercepted madvise(), waits on the vma lock in mca_rcache_base_vma_tree_iterate(). Each thread holds the lock the other needs.
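A minimal standalone program reproducing that hold-and-wait cycle (a sketch with illustrative names; the two mutexes stand in for the vma lock and glibc's arena lock, not actual OMPI or glibc code):

#include <pthread.h>

static pthread_mutex_t vma_lock   = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t arena_lock = PTHREAD_MUTEX_INITIALIZER;

/* Like Thread 9: vma_tree_insert takes the vma lock, then malloc()
 * needs the arena lock. */
static void *register_path (void *arg)
{
    pthread_mutex_lock (&vma_lock);
    pthread_mutex_lock (&arena_lock);
    pthread_mutex_unlock (&arena_lock);
    pthread_mutex_unlock (&vma_lock);
    return arg;
}

/* Like Thread 6: free() already holds the arena lock when the
 * madvise intercept tries to take the vma lock. */
static void *release_path (void *arg)
{
    pthread_mutex_lock (&arena_lock);
    pthread_mutex_lock (&vma_lock);
    pthread_mutex_unlock (&vma_lock);
    pthread_mutex_unlock (&arena_lock);
    return arg;
}

int main (void)
{
    pthread_t a, b;
    pthread_create (&a, NULL, register_path, NULL);
    pthread_create (&b, NULL, release_path, NULL);
    pthread_join (a, NULL);   /* run in a loop: eventually deadlocks */
    pthread_join (b, NULL);
    return 0;
}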

@hjelmn (Member Author) commented Feb 23, 2017

@bosilca Ok, so now we have no frees while the lock is held. Now I need to figure out how to get to the point where there are no mallocs either... that will take some thought. A free list would reduce the probability but not eliminate it, and an insert could end up allocating any number of vma structures. Should have a workaround soon.

@bosilca (Member) commented Feb 23, 2017

@hjelmn why not simply release the lock before calling mca_rcache_base_vma_new? Yes, there will be extra operations to reacquire the lock afterward, but at least the second time around you will not need to allocate any memory.
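In sketch form the suggestion would look like this (all names hypothetical, and a flat list stands in for the real rb-tree; the next comment explains why the actual code is not this simple):

#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct vma { uintptr_t start, end; struct vma *next; } vma_t;

typedef struct {
    pthread_mutex_t lock;
    vma_t *head;                           /* stand-in for the real rb-tree */
} vma_tree_t;

static vma_t *tree_find (vma_tree_t *tree, uintptr_t start)
{
    for (vma_t *vma = tree->head ; NULL != vma ; vma = vma->next) {
        if (vma->start == start) {
            return vma;
        }
    }
    return NULL;
}

int vma_insert (vma_tree_t *tree, uintptr_t start, uintptr_t end)
{
    pthread_mutex_lock (&tree->lock);
    if (NULL != tree_find (tree, start)) { /* already present */
        pthread_mutex_unlock (&tree->lock);
        return 0;
    }
    pthread_mutex_unlock (&tree->lock);    /* drop the lock ... */

    vma_t *vma = malloc (sizeof (*vma));   /* ... so malloc cannot deadlock */
    if (NULL == vma) {
        return -1;
    }
    vma->start = start;
    vma->end = end;

    pthread_mutex_lock (&tree->lock);      /* reacquire and re-check: */
    if (NULL != tree_find (tree, start)) { /* another thread may have won */
        pthread_mutex_unlock (&tree->lock);
        free (vma);                        /* free with no lock held */
        return 0;
    }
    vma->next = tree->head;
    tree->head = vma;
    pthread_mutex_unlock (&tree->lock);
    return 0;
}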

@hjelmn (Member Author) commented Feb 23, 2017

@bosilca The problem is that mca_rcache_base_vma_new inserts the vma into the rb tree. This in turn also allocates memory (&*$#@), and the red-black insert is not thread-safe.

@hjelmn (Member Author) commented Feb 23, 2017

I think I have a way to fix this. Working out the solution now.

@hjelmn (Member Author) commented Feb 23, 2017

I think my fix will work. Will update this PR tomorrow morning. It will not be the cleanest solution, but it will work until we can come up with a better one.

@hjelmn (Member Author) commented Mar 6, 2017

I think the quickest way around this bug is to disable the madvise hooks. From what I can tell, glibc's free() does not call munmap() while holding any locks, but it does call madvise() while holding a lock.

@bosilca (Member) commented Mar 6, 2017

Despite the lack of interest from the community, this is a gigantic issue that is now spreading across multiple versions of OMPI. To be clear, it is not about the thread safety of OMPI itself; it is about jeopardizing any multi-threaded application as soon as our memory interception is on.

@hjelmn (Member Author) commented Mar 6, 2017

@bosilca What do you think of removing the madvise hook as a workaround until we can handle the deadlock in a better way? You can try it by modifying opal/mca/memory/patcher/memory_patcher_module.c and just #if 0 out where we install the madvise hook.
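Roughly along these lines (a hypothetical sketch: intercept_madvise appears in the backtrace above, but the patcher call, variable names, and surrounding code are assumptions and may differ from the real memory_patcher_module.c):

/* Hypothetical sketch -- the real code in
 * opal/mca/memory/patcher/memory_patcher_module.c may differ. */
#if 0   /* workaround: do not intercept madvise. glibc's free() can call
         * madvise() while holding an arena lock, which can deadlock
         * against the rcache vma lock (see the backtraces above). */
    rc = opal_patcher->patch_symbol ("madvise", (uintptr_t) intercept_madvise,
                                     (uintptr_t *) &original_madvise);
    if (OPAL_SUCCESS != rc) {
        return rc;
    }
#endif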

@bosilca (Member) commented Mar 6, 2017

Sounds reasonable. We will be running a test to see if this solves the issue.

@jsquyres (Member) commented Mar 6, 2017

@bosilca @hjelmn Is this an issue for v2.1.0?

@hjelmn (Member Author) commented Mar 6, 2017

@jsquyres Yes.

@jsquyres (Member) commented Mar 6, 2017

Changed the milestone to v2.1.0, because that's the next release / the one we're focusing on right now.

Is this a mallopocalypse? Do we need to block v2.1.0 for it? If so, what's the timeframe for a fix?

@bosilca (Member) commented Mar 7, 2017

It is looking pretty bad indeed. We are testing @hjelmn's solution to see if it fixes the problem.

@jsquyres (Member) commented Mar 7, 2017

@bosilca @hjelmn Got an ETA? We were hoping to release v2.1.0 today, but I guess that's not going to happen. ☹️

@jsquyres (Member) commented Mar 7, 2017

(BTW, Travis lies: if you click through, you can see that it finished successfully -- not sure why it says that it's still in progress)

@hjelmn merged commit 15ea9c5 into open-mpi:master on Mar 7, 2017
@bosilca (Member) commented Mar 7, 2017

@dgenet has tested with madvise commented out and it seems to work. @hjelmn do you want to propose a patch?

@jsquyres (Member) commented Mar 7, 2017

Per discussion on the webex today, we agreed that @hjelmn would both a) merge this PR and b) make another trivial PR that comments out the madvise hook (in anticipation of your testing being successful).

Looks like your test finished before @hjelmn made the next PR, but either way -- it's all good. Thanks!

@hjelmn (Member Author) commented Mar 7, 2017

@bosilca I will create the PRs now.
