oshmem yoda spml call to add_procs seems wrong #2023
Actually, looking at Can we do something a little less unexpected with the |
This is really only a partial fix. Per open-mpi#2023, the (oshmem_proc_t) and (ompi_proc_t) types should be unified properly (e.g., oshmem_proc_t should inherit an ompi_proc_t, or something like that). The fact that this code works by casting between two unrelated types is fragile and susceptible to break in the future. Signed-off-by: Jeff Squyres <[email protected]>
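To make the "inherit" idea above concrete, here is a minimal sketch of struct embedding in C. The names `base_proc_t`, `derived_proc_t`, `base_rank` and `derived_as_base` are hypothetical stand-ins, not the real Open MPI types or functions. The point it illustrates: when the base struct is the first member of the derived struct, code can obtain a base pointer by naming the embedded member, with no cast between unrelated types.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real Open MPI definitions. */
typedef struct base_proc {
    int rank;
    void *endpoints[4];
} base_proc_t;

typedef struct derived_proc {
    base_proc_t super;      /* embedded base, deliberately the first member */
    int num_transports;
    char *transport_ids;
} derived_proc_t;

/* Code that only knows about the base type. */
int base_rank(const base_proc_t *p)
{
    return p->rank;
}

/* Upcast by taking the address of the embedded member: no cast needed,
 * and the compiler checks the types for us. */
base_proc_t *derived_as_base(derived_proc_t *d)
{
    return &d->super;
}
```

Because `super` sits at offset 0, the base pointer and the derived pointer refer to the same address, so a later change to either struct is caught by the type system instead of silently breaking a cast.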
I was typing when you posted your last comment... I can easily silence the warning.
struct ompi_proc_t {
    opal_proc_t super;
    /* endpoint data */
    void *proc_endpoints[OMPI_PROC_ENDPOINT_TAG_MAX];
    char padding[16]; /* for future extensions (OSHMEM uses this area also) */
};
and
struct oshmem_proc_t {
    opal_proc_t super;
    /* endpoint data */
    void *proc_endpoints[OMPI_PROC_ENDPOINT_TAG_MAX];
    /*
     * All transport channels are globally ordered.
     * pe(s) can talk to each other via subset of transports
     * these hold indexes of each transport into global array
     * proc -> id, where id can be btl id in yoda or mxm ptl id
     * in ikrit
     * spml is supposed to fill this during add_procs()
     **/
    int num_transports;
    char *transport_ids;
};
The only difference is that ompi has padding whereas oshmem has real data.
bottom line, if an
I have several options off the top of my head
please share your thoughts and let me know if this should be added to the agenda of the next weekly meeting. |
I added the cast in #2024. @ggouaillardet Maybe another option is to have a common / shared type that both |
btw, |
@ggouaillardet Good point; I don't know if there's any code that assumes that the sizeof the two types are the same. It would probably be best to ensure that they are the same (e.g., perhaps an |
The more I think about it, it probably is important to guarantee that the sizes of the two types are identical. Otherwise, there wouldn't be any padding at the end of |
that makes sense to me. |
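One way to pin down the layout assumptions discussed above is a set of compile-time checks. The following is only a sketch: the `sketch_*` types are simplified stand-ins for the structs quoted earlier in this thread, and `SKETCH_ENDPOINT_TAG_MAX` and the contents of `sketch_opal_proc_t` are assumed values chosen for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

#define SKETCH_ENDPOINT_TAG_MAX 4                 /* assumed value */

typedef struct { int jobid; int vpid; } sketch_opal_proc_t;   /* stand-in */

typedef struct {
    sketch_opal_proc_t super;
    void *proc_endpoints[SKETCH_ENDPOINT_TAG_MAX];
    char padding[16];          /* reserved area */
} sketch_ompi_proc_t;

typedef struct {
    sketch_opal_proc_t super;
    void *proc_endpoints[SKETCH_ENDPOINT_TAG_MAX];
    int num_transports;        /* lives where the padding is */
    char *transport_ids;
} sketch_oshmem_proc_t;

/* The shared prefix must sit at the same offsets in both types. */
static_assert(offsetof(sketch_ompi_proc_t, proc_endpoints) ==
              offsetof(sketch_oshmem_proc_t, proc_endpoints),
              "endpoint arrays must line up");

/* The OSHMEM-specific fields must start where the padding starts... */
static_assert(offsetof(sketch_ompi_proc_t, padding) ==
              offsetof(sketch_oshmem_proc_t, num_transports),
              "OSHMEM data must start at the padding");

/* ...and must not run past the end of the padded type. */
static_assert(sizeof(sketch_oshmem_proc_t) <= sizeof(sketch_ompi_proc_t),
              "OSHMEM data must fit in the padding");
```

The last check uses `<=` rather than `==` because, with these stand-in fields, the two sizes coincide on a typical 64-bit ABI but not on a 32-bit one; a project that wants strictly identical sizes could assert equality instead and size the padding accordingly.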
I'm not familiar enough with the OSHMEM code -- is it easy to make a PR for option 4? |
I will give it a try this week. |
@ggouaillardet Excellent; thanks. |
and use these macros to access oshmem related per proc data : - OSHMEM_PROC_NUM_TRANSPORTS(proc) - OSHMEM_PROC_TRANSPORT_IDS(proc) Fixes open-mpi#2023
store oshmem related per proc data in a oshmem_proc_data_t, that is written in the padding section of a ompi_proc_t this data can be accessed via the OSHMEM_PROC_DATA(proc) macro Fixes open-mpi#2023
store oshmem related per proc data in an oshmem_proc_data_t struct, that is stored in the padding section of an ompi_proc_t this data can be accessed via the OSHMEM_PROC_DATA(proc) macro Fixes open-mpi#2023
store oshmem related per proc data in an oshmem_proc_data_t struct, that is stored in the padding section of an ompi_proc_t this data can be accessed via the OSHMEM_PROC_DATA(proc) macro Fixes open-mpi/ompi#2023 (back-ported from commit open-mpi/ompi@0a25420)
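The approach described in these commit messages can be sketched roughly as follows. This is not the actual patch: the `sketch_*` types, `SKETCH_ENDPOINT_TAG_MAX`, `SKETCH_OSHMEM_PROC_DATA` and `sketch_proc_new` are hypothetical names for illustration, and the sketch assumes procs are heap-allocated so that the padding bytes can be used as storage for the per-proc data.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdlib.h>   /* calloc */

#define SKETCH_ENDPOINT_TAG_MAX 4                 /* assumed value */

typedef struct { int jobid; int vpid; } sketch_opal_proc_t;   /* stand-in */

typedef struct {
    sketch_opal_proc_t super;
    void *proc_endpoints[SKETCH_ENDPOINT_TAG_MAX];
    char padding[16];          /* reserved area reused for OSHMEM data */
} sketch_ompi_proc_t;

/* OSHMEM-specific per-proc data, kept in its own small struct. */
typedef struct {
    int num_transports;
    char *transport_ids;
} sketch_oshmem_proc_data_t;

/* The data must fit in the padding and be suitably aligned there. */
static_assert(sizeof(sketch_oshmem_proc_data_t) <=
              sizeof(((sketch_ompi_proc_t *)0)->padding),
              "OSHMEM data does not fit in the padding");
static_assert(offsetof(sketch_ompi_proc_t, padding) %
              _Alignof(sketch_oshmem_proc_data_t) == 0,
              "padding is not aligned for OSHMEM data");

/* Accessor in the spirit of OSHMEM_PROC_DATA(proc): view the padding
 * area of a proc as the OSHMEM per-proc data. */
#define SKETCH_OSHMEM_PROC_DATA(proc) \
    ((sketch_oshmem_proc_data_t *)(proc)->padding)

/* Assumption of this sketch: procs come from the heap, zero-initialized. */
sketch_ompi_proc_t *sketch_proc_new(void)
{
    return calloc(1, sizeof(sketch_ompi_proc_t));
}
```

With this shape there is a single proc type, so a function that expects an array of proc pointers can be called without any cast, and the OSHMEM-only fields are reached through the accessor, e.g. `SKETCH_OSHMEM_PROC_DATA(proc)->num_transports`.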
When compiling master and v2.x:
I note that pml add_procs takes an (ompi_proc_t **), but https://github.com/open-mpi/ompi/blob/master/oshmem/mca/spml/yoda/spml_yoda.c#L645 is calling with an (oshmem_proc_t **). This was added just 10 days ago by @ggouaillardet in 6b7bc64.
However, oshmem_proc_t is defined at https://github.com/open-mpi/ompi/blob/master/oshmem/proc/proc.h#L44-L58, and that definition does not contain an (ompi_proc_t), thereby making the (implicit) cast from (oshmem_proc_t **) to (ompi_proc_t **) incorrect.
@ggouaillardet Can you please have a look?