-
Notifications
You must be signed in to change notification settings - Fork 900
Memory leak with MPI_Alloc_mem #4567
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@hjelmn : As on the OpenMPI users' list. |
Does this machine have a high-performance interconnect? If so what kind? I can't reproduce the issue in a Linux VM with 16.04 or on macOS 10.12. |
@hjelmn i can reproduce this on my CentOS 7 VM (no high speed interconnect) with here is what i found
we end up inserting the
but
the item cannot be retrieved from the i made this patch that fixes one issue diff --git a/opal/mca/mpool/base/mpool_base_alloc.c b/opal/mca/mpool/base/mpool_base_alloc.c
index 77165a3..1c4f87a 100644
--- a/opal/mca/mpool/base/mpool_base_alloc.c
+++ b/opal/mca/mpool/base/mpool_base_alloc.c
@@ -14,6 +14,8 @@
* Copyright (c) 2010-2017 IBM Corporation. All rights reserved.
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
* reserved.
+ * Copyright (c) 2017 Research Organization for Information Science
+ * and Technology (RIST). All rights reserved.
* $COPYRIGHT$
*
* Additional copyrights may follow
@@ -84,6 +86,7 @@ void *mca_mpool_base_alloc(size_t size, opal_info_t *info, const char *hints)
mca_mpool_base_tree_item_put (mpool_tree_item);
} else {
mpool_tree_item->mpool = mpool;
+ mpool_tree_item->key = mem;
mca_mpool_base_tree_insert (mpool_tree_item);
}
can you please review it before i commit and PR ? meanwhile, i will make sure there are no more memory leaks |
Thank you Gilles and Nathan. I haven't tested the fix in our full CFD code yet, but the minimal example now only shows a single memory allocation from MPI_Alloc_mem() when I apply the fix to both 2.1.2 and 3.0.0. |
Thanks @ggouaillardet. That fix looks familiar, has this come up before and not been actually patched? |
set the key of all mpool_tree_item objects, so they can be retrieved in mpool_base_free and then returned back to the mca_mpool_base_tree_item_free_list free list. Refs. open-mpi#4567 Thanks Philip Blakely for the bug report. Signed-off-by: Gilles Gouaillardet <[email protected]>
set the key of all mpool_tree_item objects, so they can be retrieved in mpool_base_free and then returned back to the mca_mpool_base_tree_item_free_list free list. Refs. open-mpi#4567 Thanks Philip Blakely for the bug report. Signed-off-by: Gilles Gouaillardet <[email protected]> (cherry picked from commit open-mpi/ompi@11e5f86)
set the key of all mpool_tree_item objects, so they can be retrieved in mpool_base_free and then returned back to the mca_mpool_base_tree_item_free_list free list. Refs. open-mpi#4567 Thanks Philip Blakely for the bug report. Signed-off-by: Gilles Gouaillardet <[email protected]> (cherry picked from commit open-mpi/ompi@11e5f86)
set the key of all mpool_tree_item objects, so they can be retrieved in mpool_base_free and then returned back to the mca_mpool_base_tree_item_free_list free list. Refs. open-mpi#4567 Thanks Philip Blakely for the bug report. Signed-off-by: Gilles Gouaillardet <[email protected]> (cherry picked from commit open-mpi/ompi@11e5f86)
set the key of all mpool_tree_item objects, so they can be retrieved in mpool_base_free and then returned back to the mca_mpool_base_tree_item_free_list free list. Refs. open-mpi#4567 Thanks Philip Blakely for the bug report. Signed-off-by: Gilles Gouaillardet <[email protected]> (cherry picked from commit open-mpi/ompi@11e5f86)
set the key of all mpool_tree_item objects, so they can be retrieved in mpool_base_free and then returned back to the mca_mpool_base_tree_item_free_list free list. Refs. #4567 Thanks Philip Blakely for the bug report. Signed-off-by: Gilles Gouaillardet <[email protected]> (cherry picked from commit 11e5f86)
set the key of all mpool_tree_item objects, so they can be retrieved in mpool_base_free and then returned back to the mca_mpool_base_tree_item_free_list free list. Refs. open-mpi#4567 Thanks Philip Blakely for the bug report. Signed-off-by: Gilles Gouaillardet <[email protected]>
Issue has been fixed on all active branches |
Background information
I have found what appears to be a memory leak within OpenMPI. This was seen in a CFD code we use in our research group, which uses the Boost MPI library. As part of its serialization of std::vectors, Boost makes use of MPI_Alloc_mem (balanced by MPI_Free_mem), and this appears to be the cause of a memory leak which eventually results in some of our simulations crashing due to lack of memory.
OpenMPI versions affected
v2.1.2 and v3.0.0 have the problem described below. v1.10.7 does not.
System and OpenMPI installation
Both the above versions were compiled from source
Platform: Ubuntu 16.04
Hardware: Intel 64-bit
Compiler: gcc-5.4.0 (default as on Ubuntu 16.04)
Configure flags:
./configure --prefix=/local/data/public/pmblakely/openmpi-3.0.0-install
Minimal example
The short program below exhibits the problem:
Compilation
/local/data/public/pmblakely/openmpi-3.0.0-install/bin/mpicc
./memory_leak_check_minimal.C -o ./memory_leak_check_minimal-3.0.0 -g
-O0
Testing
valgrind --tool=massif --threshold=0.1 --detailed-freq=1
./memory_leak_check_minimal-3.0.0
Then, ms_print --threshold=0.1 ./massif.out shows:
near the beginning, and
near the end of the test-run (the final mentions of
opal_free_list_grow_st show that the memory is eventually freed, but
probably by MPI_Finalize()).
I have tested the same program against MPICH 3.2.1 and it doesn't have the memory leak.
The text was updated successfully, but these errors were encountered: