Skip to content

Fix some minor compatibility issues #3104

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 6, 2017
Merged

Fix some minor compatibility issues #3104

merged 1 commit into from
Mar 6, 2017

Conversation

rhc54
Copy link
Contributor

@rhc54 rhc54 commented Mar 5, 2017

Ensure job-level data gets stored against wildcard rank in the cray, s1, and s2 components, and that the ext1 component translates all wildcard rank requests into the peer's rank since v1.x of PMIx doesn't understand wildcard ranks

Closes #3101

Needs to come over to 2.1 since the change that exposed the issue was already taken there.

Signed-off-by: Ralph Castain [email protected]

…tored against wildcard rank in the cray, s1, and s2 components, and that the ext1 component translates all wildcard rank requests into the peer's rank since v1.x of PMIx doesn't understand wildcard ranks

Closes #3101

Signed-off-by: Ralph Castain <[email protected]>
@rhc54 rhc54 added this to the v2.1.0 milestone Mar 5, 2017
@rhc54 rhc54 requested review from hjelmn and artpol84 March 5, 2017 18:35
@artpol84
Copy link
Contributor

artpol84 commented Mar 5, 2017

@rhc54

Needs to come over to 2.1 since the change that exposed the issue was already taken there.

This is not correct. In 2.1 we use this proc:
https://github.com/open-mpi/ompi/blob/v2.x/ompi/proc/proc.c#L348:L349

@artpol84
Copy link
Contributor

artpol84 commented Mar 5, 2017

@karasevb, please verify s1 and s2 components at runtime with this PR.

@rhc54
Copy link
Contributor Author

rhc54 commented Mar 5, 2017

@artpol84 Kewl - I didn't know you had used the different name. However, just to be correct, that name should be changed to preserve the abstraction - it should be OMPI_PROC_MY_NAME

@artpol84
Copy link
Contributor

artpol84 commented Mar 5, 2017

Agreed

@artpol84
Copy link
Contributor

artpol84 commented Mar 6, 2017

@rhc54, we did runtime evaluation.

so it looks like s1 and s2 components was correct from the start with respect to local peers key. It was only cray component that was storing it in the old way.

Also it is funny, but pmix v1.2 doesn't care about wildcard to be passed on his input, the internal logic is written in the way that it will ignore that fact and work correctly. The change is good tough because otherwise we were breaking API and it was just a luck that pmix v1.2 is working.

@hppritcha
Copy link
Member

hppritcha commented Mar 6, 2017

@jsquyres I don't want to merge this till I understand what happened on 2/21/17. After that point many tests failed with direct launch using aprun. Even ring_c failed. This was on master.

@artpol84
Copy link
Contributor

artpol84 commented Mar 6, 2017

@hppritcha #3011 was merged on 02/28/2017 so it should be unrelated.
Why do you want to delay merging this? How can it help to investigate?

@hppritcha
Copy link
Member

given that cray/pmix specific code was changed in this PR, and I haven't tried it yet. that's why.

@hppritcha
Copy link
Member

sorry I thought this was for v2.x since that's what the target label was. don't care if it goes in to master since its already broken for aprun direct launch.

@rhc54 rhc54 removed the Target: v2.x label Mar 6, 2017
@rhc54 rhc54 merged commit 79540fe into open-mpi:master Mar 6, 2017
@rhc54 rhc54 deleted the topic/minor branch March 6, 2017 18:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants