pml/cm: fix a problem introduced with cuda support #8906
Patch should be fine, but shouldn't opal_convertor_prepare_for_send be able to handle the use case?
Looks like ompi jenkins is having problems cloning from GitHub at the moment.
@jsquyres any idea why the call to opal_convertor_prepare_for_send breaks things here? In any case, it looks like it adds a bunch of extra overhead that we don't want when not using CUDA.
What is the call to opal_convertor_prepare_for_send breaking? What I see in the patch is that if the datatype is contiguous, the convertor will not be properly initialized if CUDA support is not active. This can be perceived as a shortcut (saving one function call), because the data to be sent is contiguous and of size count*data.size. I assume whoever wrote that code took care to properly account for the potential use of a lower bound in the datatype.
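The shortcut described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual pml/cm code: `toy_datatype_t`, its fields, and `toy_contig_send_region` are hypothetical stand-ins, not Open MPI types or functions. For a contiguous datatype the send region is computed directly as a start address (user buffer plus the lower-bound offset) and a length of count times the element size, and the caller falls back to a full convertor otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a datatype descriptor (NOT opal_datatype_t). */
typedef struct {
    size_t    size;        /* bytes of actual data per element */
    ptrdiff_t lb;          /* lower bound: offset of the first data byte
                              relative to the user buffer */
    int       contiguous;  /* nonzero if count elements form one gap-free block */
} toy_datatype_t;

/*
 * Fast path for contiguous datatypes: compute the region to send without
 * preparing a convertor. Returns 0 and fills *start / *len on success, or
 * -1 when the datatype is not contiguous and the caller has to fall back
 * to the general (convertor-based) path.
 */
static int toy_contig_send_region(const toy_datatype_t *dt, size_t count,
                                  const void *buf,
                                  const void **start, size_t *len)
{
    assert(dt != NULL && start != NULL && len != NULL);

    if (!dt->contiguous) {
        return -1;  /* general path required */
    }

    /* Account for the lower bound so a shifted datatype still points at
       the first real data byte. */
    *start = (const char *)buf + dt->lb;
    *len   = count * dt->size;
    return 0;
}
```

The point of the sketch is that the fast path must fill in everything the later send code reads. If only part of that state is set up when the convertor is skipped, a derived but contiguous datatype can end up on a path that assumes an initialized convertor.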
Jithin did some optimization work in #602 that introduced this bypass of opal_convertor_prepare_for_send or related code. See hpc#42. I was also seeing this in MTT but had not been looking into the issue at that point.
Might as well encapsulate the MCA_PML_SWITCH_CUDA_CONVERTOR_OFF with the if statement as well. Doesn't make sense to call it otherwise.
PR open-mpi#8536 introduced a regression in non-CUDA environments when an application is using derived, but contiguous datatypes. Related to open-mpi#8905. Signed-off-by: Howard Pritchard <[email protected]>
@wckzhang reworked per your suggestion.
PR #8536 introduced a regression in non-CUDA environments
when an application is using derived, but contiguous datatypes.
Related to #8905.
Signed-off-by: Howard Pritchard [email protected]