Random SIGBUS error with xpmem on openmpi4.1.4 #11463
Comments
Hi, please let me know if I missed any info; I can collect it. --Arun
Can you please supply all the information that was asked for in the GitHub issue template for bug reporting? Thanks.
Also, I kinda doubt that it will change anything, but we did just release Open MPI v4.1.5. Can you test with that version just to be complete?
@arunedarath I built Open MPI 4.1.4 and ran 100 iterations like this:
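Something like the following loop (a sketch: the binary name send_recv_group is taken from the core-file names later in this thread, and the rank count and MCA settings are assumptions):

for i in $(seq 1 100); do
    # force pml/ob1 with the shared-memory BTL and xpmem single-copy
    mpirun -np 8 --mca pml ob1 --mca btl vader,self ./send_recv_group || break
done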
I didn't see the error you are seeing. From your
Correction: if you can, try the latest commit in the main xpmem repo (which is what I am using).
Thanks for the valuable comments; I will try with the latest xpmem. As this requires root permission (installing xpmem.ko), I must ask my system admin. Please give me 1-2 days and I will be back with all the required info. --Arun
I checked my kernel logs; they say 2.6.5. The admin might have mistakenly named it "/home/software/xpmem/2.3.0".

[arunchan@Milan004 ~]$ dmesg | grep -i xpmem
The crash seems to be in
I tried with Open MPI 4.1.5; I get the same error. I will update it according to the bug report template.

Background information
I want to run Open MPI (version 4.1.4) with xpmem, using pml/ob1 and btl/vader.

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
4.1.5

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
If you are building/installing from a git clone, please copy-n-paste the output from
Hi, I tried Open MPI v5.0.0rc10 and did not see the issue (ran 2000 iterations without a SIGBUS error), using:

for i in
There is a slight difference in the way I configured ompi5; I used the command below.

For ompi 4.1.x I used the command below.
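For reference, an xpmem-enabled build is configured roughly like this (a sketch: --with-xpmem is a real Open MPI configure option, the install prefix is hypothetical, and the xpmem path is taken from the earlier comment):

./configure --prefix=$HOME/ompi-4.1.x --with-xpmem=/home/software/xpmem/2.3.0
make -j all && make install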
That means my xpmem is perfectly fine and the problem is with the Open MPI 4.1.x version, isn't it? --Arun
I did a bisect between v5.0.0rc1 (good) and v4.1.5 (bad), and got the result below.
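A bisect of this shape can be set up like this (a sketch using standard git bisect commands; the test step, rerunning the reproducer loop, is an assumption):

git bisect start
git bisect bad v4.1.5
git bisect good v5.0.0rc1
# at each step: rebuild, run the reproducer loop, then mark the commit
git bisect good   # or: git bisect bad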
Hi @arunedarath, since you are doing tests, could you try testing 5.0.0rc10, but with commit 7d3f868 reverted? You can apply the patch below, which reverts this commit and resolves the resulting conflict. It's untested though -- please let me know if it doesn't work.

Patch:

diff --git a/opal/mca/btl/sm/btl_sm_component.c b/opal/mca/btl/sm/btl_sm_component.c
index 9d73e1e39f..f6fccf2a61 100644
--- a/opal/mca/btl/sm/btl_sm_component.c
+++ b/opal/mca/btl/sm/btl_sm_component.c
@@ -253,6 +253,11 @@ static int mca_btl_sm_component_close(void)
OBJ_DESTRUCT(&mca_btl_sm_component.pending_endpoints);
OBJ_DESTRUCT(&mca_btl_sm_component.pending_fragments);
+ if (mca_smsc_base_has_feature(MCA_SMSC_FEATURE_CAN_MAP)
+ && NULL != mca_btl_sm_component.my_segment) {
+ munmap(mca_btl_sm_component.my_segment, mca_btl_sm_component.segment_size);
+ }
+
mca_btl_sm_component.my_segment = NULL;
if (mca_btl_sm_component.mpool) {
@@ -270,9 +275,14 @@ static int mca_btl_base_sm_modex_send(void)
modex_size = sizeof(modex) - sizeof(modex.seg_ds);
+ if (!mca_smsc_base_has_feature(MCA_SMSC_FEATURE_CAN_MAP)) {
modex.seg_ds_size = opal_shmem_sizeof_shmem_ds(&mca_btl_sm_component.seg_ds);
memmove(&modex.seg_ds, &mca_btl_sm_component.seg_ds, modex.seg_ds_size);
modex_size += modex.seg_ds_size;
+ } else {
+ modex.segment_base = (uintptr_t) mca_btl_sm_component.my_segment;
+ modex.seg_ds_size = 0;
+ }
int rc;
OPAL_MODEX_SEND(rc, PMIX_LOCAL, &mca_btl_sm_component.super.btl_version, &modex, modex_size);
@@ -365,31 +375,43 @@ mca_btl_sm_component_init(int *num_btls, bool enable_progress_threads, bool enab
mca_btl_sm.super.btl_put = NULL;
}
- char *sm_file;
-
- // Note: Use the node_rank not the local_rank for the backing file.
- // This makes the file unique even when recovering from failures.
- rc = opal_asprintf(&sm_file, "%s" OPAL_PATH_SEP "sm_segment.%s.%u.%x.%d",
- mca_btl_sm_component.backing_directory, opal_process_info.nodename,
- geteuid(), OPAL_PROC_MY_NAME.jobid, opal_process_info.my_node_rank);
- if (0 > rc) {
- free(btls);
- return NULL;
- }
- opal_pmix_register_cleanup(sm_file, false, false, false);
-
- rc = opal_shmem_segment_create(&component->seg_ds, sm_file, component->segment_size);
- free(sm_file);
- if (OPAL_SUCCESS != rc) {
- BTL_VERBOSE(("Could not create shared memory segment"));
- free(btls);
- return NULL;
- }
+ if (!mca_smsc_base_has_feature(MCA_SMSC_FEATURE_CAN_MAP)) {
+ char *sm_file;
+
+ // Note: Use the node_rank not the local_rank for the backing file.
+ // This makes the file unique even when recovering from failures.
+ rc = opal_asprintf(&sm_file, "%s" OPAL_PATH_SEP "sm_segment.%s.%u.%x.%d",
+ mca_btl_sm_component.backing_directory, opal_process_info.nodename,
+ geteuid(), OPAL_PROC_MY_NAME.jobid, opal_process_info.my_node_rank);
+ if (0 > rc) {
+ free(btls);
+ return NULL;
+ }
+ opal_pmix_register_cleanup(sm_file, false, false, false);
+
+ rc = opal_shmem_segment_create(&component->seg_ds, sm_file, component->segment_size);
+ free(sm_file);
+ if (OPAL_SUCCESS != rc) {
+ BTL_VERBOSE(("Could not create shared memory segment"));
+ free(btls);
+ return NULL;
+ }
- component->my_segment = opal_shmem_segment_attach(&component->seg_ds);
- if (NULL == component->my_segment) {
- BTL_VERBOSE(("Could not attach to just created shared memory segment"));
- goto failed;
+ component->my_segment = opal_shmem_segment_attach(&component->seg_ds);
+ if (NULL == component->my_segment) {
+ BTL_VERBOSE(("Could not attach to just created shared memory segment"));
+ goto failed;
+ }
+ } else {
+ /* if the shared-memory single-copy component can map memory (XPMEM) an anonymous segment
+ * can be used instead */
+ component->my_segment = mmap(NULL, component->segment_size, PROT_READ | PROT_WRITE,
+ MAP_ANONYMOUS | MAP_SHARED, -1, 0);
+ if ((void *) -1 == component->my_segment) {
+ BTL_VERBOSE(("Could not create anonymous memory segment"));
+ free(btls);
+ return NULL;
+ }
}
/* initialize my fifo */
@@ -411,7 +433,11 @@ mca_btl_sm_component_init(int *num_btls, bool enable_progress_threads, bool enab
return btls;
failed:
- opal_shmem_unlink(&component->seg_ds);
+ if (mca_smsc_base_has_feature(MCA_SMSC_FEATURE_CAN_MAP)) {
+ munmap(component->my_segment, component->segment_size);
+ } else {
+ opal_shmem_unlink(&component->seg_ds);
+ }
if (btls) {
free(btls);
diff --git a/opal/mca/btl/sm/btl_sm_module.c b/opal/mca/btl/sm/btl_sm_module.c
index 7835742e4f..2ac02884f7 100644
--- a/opal/mca/btl/sm/btl_sm_module.c
+++ b/opal/mca/btl/sm/btl_sm_module.c
@@ -184,6 +184,12 @@ static int init_sm_endpoint(struct mca_btl_base_endpoint_t **ep_out, struct opal
mca_btl_sm.super.btl_put = NULL;
mca_btl_sm.super.btl_flags &= ~MCA_BTL_FLAGS_RDMA;
}
+ if (mca_smsc_base_has_feature(MCA_SMSC_FEATURE_CAN_MAP)) {
+ ep->smsc_map_context = MCA_SMSC_CALL(map_peer_region, ep->smsc_endpoint, /*flag=*/0,
+ (void *) (uintptr_t) modex->segment_base,
+ mca_btl_sm_component.segment_size,
+ (void **) &ep->segment_base);
+ } else {
/* store a copy of the segment information for detach */
ep->seg_ds = malloc(modex->seg_ds_size);
if (NULL == ep->seg_ds) {
@@ -196,6 +202,7 @@ static int init_sm_endpoint(struct mca_btl_base_endpoint_t **ep_out, struct opal
if (NULL == ep->segment_base) {
return OPAL_ERROR;
}
+ }
OBJ_CONSTRUCT(&ep->lock, opal_mutex_t);
@@ -345,8 +352,10 @@ static int sm_finalize(struct mca_btl_base_module_t *btl)
free(component->fbox_in_endpoints);
component->fbox_in_endpoints = NULL;
- opal_shmem_unlink(&mca_btl_sm_component.seg_ds);
- opal_shmem_segment_detach(&mca_btl_sm_component.seg_ds);
+ if (!mca_smsc_base_has_feature(MCA_SMSC_FEATURE_CAN_MAP)) {
+ opal_shmem_unlink(&mca_btl_sm_component.seg_ds);
+ opal_shmem_segment_detach(&mca_btl_sm_component.seg_ds);
+ }
return OPAL_SUCCESS;
}
@@ -511,18 +520,22 @@ static void mca_btl_sm_endpoint_destructor(mca_btl_sm_endpoint_t *ep)
OBJ_DESTRUCT(&ep->pending_frags);
OBJ_DESTRUCT(&ep->pending_frags_lock);
- if (ep->seg_ds) {
- opal_shmem_ds_t seg_ds;
-
- /* opal_shmem_segment_detach expects a opal_shmem_ds_t and will
- * stomp past the end of the seg_ds if it is too small (which
- * ep->seg_ds probably is) */
- memcpy(&seg_ds, ep->seg_ds, opal_shmem_sizeof_shmem_ds(ep->seg_ds));
- free(ep->seg_ds);
- ep->seg_ds = NULL;
-
- /* disconnect from the peer's segment */
- opal_shmem_segment_detach(&seg_ds);
+ if (!mca_smsc_base_has_feature(MCA_SMSC_FEATURE_CAN_MAP)) {
+ if (ep->seg_ds) {
+ opal_shmem_ds_t seg_ds;
+
+ /* opal_shmem_segment_detach expects a opal_shmem_ds_t and will
+ * stomp past the end of the seg_ds if it is too small (which
+ * ep->seg_ds probably is) */
+ memcpy(&seg_ds, ep->seg_ds, opal_shmem_sizeof_shmem_ds(ep->seg_ds));
+ free(ep->seg_ds);
+ ep->seg_ds = NULL;
+
+ /* disconnect from the peer's segment */
+ opal_shmem_segment_detach(&seg_ds);
+ }
+ } else if (NULL != ep->smsc_map_context) {
+ MCA_SMSC_CALL(unmap_peer_region, ep->smsc_map_context);
}
if (ep->fbox_out.fbox) {
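To try it, the patch can be applied on top of the tag and the tree rebuilt (a sketch; the patch filename is hypothetical):

git checkout v5.0.0rc10
git apply revert-7d3f868.patch   # hypothetical filename for the patch above
# then configure, make, and make install as usual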
Hi George, I checked out v5.0.0rc10, applied the patch, ran the application 2000 times, and did not see the issue. The suggested commit (7d3f868) came in on Feb 3, 2022, but for me v5.0.0rc1 [Sep 2021] works perfectly fine. That means these two issues are unrelated, right? The bisect log above shows this.

I have a fundamental question regarding compiling Open MPI 5 from the git repo. I followed the steps below to compile v5.0.0rc10:

a) git checkout v5.0.0rc10

Is checking out the tag v5.0.0rc10 in the main repo enough to select the corresponding commits from the submodules as well?

[arunchan@Milan014 ompi]$ git branch

Are the steps right, or do I need to manually set the HEAD commits in the submodules as well?
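For reference: checking out a tag records which submodule commits the tag expects, but it does not update the submodule working trees by itself. The usual sequence is (stock git commands):

git checkout v5.0.0rc10
git submodule update --init --recursive
git submodule status   # in-sync submodules are listed with a leading space, not '+' or '-'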
I see, thanks. So it initially doesn't seem to be the same issue as the one I linked earlier. The first bad commit does seem to be unrelated to btl/vader. But it's possible that one contributing factor is the exit/finalize pattern of the processes (which was also a factor in 7d3f868, if I understood the description correctly). For this reason we can't assume for sure that this is not the same issue as the one linked either, as rc10 might not contain the extra contributing factor (the finalize pattern).

But let's take it from the top. Can you do a debug build with --enable-debug? AFAIK you also need to do
Running it with Open MPI 4.1.5 configured with --enable-debug does not seem to reproduce the issue. I am putting it on a longer run; let's see. --Arun
No luck. It ran for 4 hours without any issues with Open MPI 4.1.5 configured with --enable-debug.
Hi, ompi 4.1.5 configured with "--enable-debug" is not showing this failure. How do you think we should proceed further? Are there other compile options or debug flags that could help find this issue's root cause? The decode of the core file without "--enable-debug" is below.

[arunchan@Milan015 ob1_xpmem]$ gdb send_recv_group core.send_recv_group.1350.0363f02815784d5bbb22a0a51db0e3f3.575823.1679237339000000
For help, type "help".
warning: Can't open file (null) during file-backed mapping note processing
warning: Can't open file (null) during file-backed mapping note processing
warning: Can't open file (null) during file-backed mapping note processing
warning: Can't open file (null) during file-backed mapping note processing
warning: Can't open file (null) during file-backed mapping note processing
warning: Can't open file (null) during file-backed mapping note processing
warning: Can't open file (null) during file-backed mapping note processing
warning: Section `.reg-xstate/575823' in core file too small.
warning: Section `.reg-xstate/575823' in core file too small.

--Arun
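Even without debug symbols the core can still yield a stack trace; the standard gdb commands for this are (sketch):

(gdb) bt                    # backtrace of the faulting thread
(gdb) thread apply all bt   # backtraces of every thread
(gdb) info registers        # register state at the SIGBUS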
@devreal Finally, the xpmem in the lab has been updated to the latest from https://github.com/hpc/xpmem, but there is no difference; the same SIGBUS error (crash) is seen with that as well. --Arun
Hi all, ping. Do you have any ideas to debug this further? (It's really annoying to see the SIGBUS error coming randomly while using xpmem :( ) --Arun
FYI @hppritcha, might this be the same underlying issue as in #9868?
Hi Folks,
I am running the MPI program below, and it fails randomly.
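A minimal send/recv test of the kind described might look like this (a sketch: the actual program is not shown above, and the message size and communication pattern here are assumptions):

#include <mpi.h>
#include <stdio.h>
#include <string.h>

/* Sketch of a reproducer: rank 0 sends a 1 MiB message to every other rank.
 * Messages this large take the single-copy (xpmem) path under pml/ob1. */
int main(int argc, char **argv)
{
    static char buf[1 << 20];
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (0 == rank) {
        memset(buf, 0xab, sizeof(buf));
        for (int peer = 1; peer < size; ++peer) {
            MPI_Send(buf, (int) sizeof(buf), MPI_BYTE, peer, 0, MPI_COMM_WORLD);
        }
    } else {
        MPI_Recv(buf, (int) sizeof(buf), MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    printf("rank %d done\n", rank);
    MPI_Finalize();
    return 0;
}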
The same program runs perfectly fine if I compile Open MPI without xpmem.

How can I solve this problem? [I want the xpmem support to test the performance of ob1]

ompi_info and the topology are attached.
--Arun
topology_ompi_info.txt