@yosefe yosefe commented Jan 1, 2019

@hoopoepg pls take a look

@yosefe yosefe added the bug label Jan 1, 2019
Fixes scoll_basic failures with shmem_verifier, caused by recent changes
in the handling of zero-size collectives.

- Check for zero length only in the fixed-size collect (shmem_fcollect),
  not in the variable-size collect (shmem_collect).
- Add an 'nlong_type' parameter to the internal broadcast function to
  indicate whether the 'nlong' parameter is valid on non-root PEs, since
  it is used by the shmem_collect algorithm. Before this change, some
  components assumed 'nlong' is valid on all PEs (scoll_mpi) while
  others assumed it is not (scoll_basic).
- In scoll_basic, if nlong_type==false, do not exit when nlong==0, since
  this parameter may not be the same on all PEs.
- In scoll_mpi, fall back to scoll_basic if nlong_type==false, since MPI
  requires the 'count' argument of MPI_Bcast to be valid on all ranks.

Signed-off-by: Yossi Itigin <[email protected]>
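
The decision logic described in the commit message can be sketched roughly as follows. This is a minimal illustration only, not the actual Open MPI code: the enum, the function names (`basic_bcast_action`, `mpi_bcast_action`, `collect_can_skip`), and their signatures are hypothetical and chosen for this example. Only the `nlong` and `nlong_type` parameter names come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome codes for this sketch; not Open MPI identifiers. */
typedef enum {
    BCAST_SKIP,      /* return early, nothing to broadcast */
    BCAST_RUN_BASIC, /* run the scoll_basic broadcast algorithm */
    BCAST_RUN_MPI    /* hand the broadcast off to MPI_Bcast */
} bcast_action_t;

/* scoll_basic-style decision (assumed shape).
 * nlong_type == true  : nlong is valid on every PE, so a zero length
 *                       lets every PE exit early and consistently.
 * nlong_type == false : nlong may differ on non-root PEs, so a local
 *                       zero must not cause an early exit. */
static bcast_action_t basic_bcast_action(size_t nlong, bool nlong_type)
{
    if (nlong_type && nlong == 0) {
        return BCAST_SKIP;
    }
    return BCAST_RUN_BASIC;
}

/* scoll_mpi-style decision (assumed shape).
 * MPI_Bcast needs a valid 'count' on all ranks, so when nlong is not
 * guaranteed valid everywhere the sketch falls back to the basic path. */
static bcast_action_t mpi_bcast_action(size_t nlong, bool nlong_type)
{
    if (!nlong_type) {
        return basic_bcast_action(nlong, nlong_type);
    }
    return BCAST_RUN_MPI;
}

/* Zero-length shortcut for collect (assumed shape): only the fixed-size
 * variant (shmem_fcollect) may skip on nlong == 0, because in the
 * variable-size variant (shmem_collect) other PEs may still contribute. */
static bool collect_can_skip(size_t nlong, bool fixed_size)
{
    return fixed_size && nlong == 0;
}
```

With these definitions, `basic_bcast_action(0, false)` selects the basic algorithm instead of exiting, and `mpi_bcast_action(0, false)` also resolves to the basic path, which mirrors the two component-level changes listed above.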
@yosefe yosefe force-pushed the topic/scoll-basic-fix-zero-size-collect branch from 628fd3b to 939162e Compare January 1, 2019 18:43
@yosefe yosefe merged commit 00fbb4c into open-mpi:master Jan 2, 2019
@yosefe yosefe deleted the topic/scoll-basic-fix-zero-size-collect branch January 2, 2019 10:13