popescu-v
Collaborator

closes #209

@popescu-v popescu-v linked an issue Jul 31, 2024 that may be closed by this pull request
@popescu-v popescu-v self-assigned this Jul 31, 2024
@popescu-v popescu-v force-pushed the 209-openmpi-path-issue-with-khiops-1022-on-debian-like-os branch 2 times, most recently from 49c3e6b to 7c14b07 Compare July 31, 2024 16:18
Comment on lines +96 to +100
# an executable name under a /bin directory
# Note: this executable name can be different, depending on the MPI
# backend and OS; for instance, "orterun" for OpenMPI on Ubuntu Linux, but
# "mpiexec" for OpenMPI on Rocky Linux
kh-status | grep "MPI command" | grep -Ewq "(/.+?)/bin/.+"
Member

This check is weaker than what is stated.

Maybe we should check with grep -Ewq "orterun|mpirun|mpiexec" instead?

Collaborator Author

Well, I am not sure: what if the executable is called something else (e.g. mpiexec.hydra for MPICH, in which case we would have to add that name too, or rather a pattern such as mpiexec\.?.?, etc.)?

IMHO this soon becomes brittle. As it stands, we only check that the MPI command points to something inside a bin directory. The tests checking that mpiexec is found are elsewhere (in the unit-tests workflow).

So I would keep it this way; the comment says "for instance", and this is key IMHO.
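
To make the trade-off concrete, here is a minimal sketch (not part of the PR) that runs both candidate checks on two sample lines. The helper names has_bin_dir and has_allowed_name and the sample lines are hypothetical, the real kh-status output format may differ, and GNU grep is assumed. srun serves only as an example of a launcher name that is missing from the proposed list.

```shell
#!/bin/sh
# Hypothetical helpers: each takes one sample status line as its argument.

# Check used in the PR: the line contains a path to something under a /bin directory.
has_bin_dir() {
    printf '%s\n' "$1" | grep -Ewq "(/.+?)/bin/.+"
}

# Alternative from the review: the line contains one of a fixed list of launcher names.
has_allowed_name() {
    printf '%s\n' "$1" | grep -Ewq "orterun|mpirun|mpiexec"
}

# Assumed sample lines, not real kh-status output.
for line in "MPI command: /usr/bin/orterun" "MPI command: /usr/bin/srun"; do
    if has_bin_dir "$line"; then
        echo "bin-dir check passes: $line"
    else
        echo "bin-dir check fails: $line"
    fi
    if has_allowed_name "$line"; then
        echo "name-list check passes: $line"
    else
        echo "name-list check fails: $line"
    fi
done
```

In this sketch the bin-directory check accepts both lines, while the name list rejects any launcher it does not mention, so the list would need updating for each new MPI backend or OS.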

Member

Ok. I think though the first part should be (.+), and not (.+?)

Collaborator Author

@popescu-v popescu-v Jul 31, 2024

I wanted to accommodate the case where the path starts with /bin (which is not excluded in principle, I guess).

Member

Ok

Khiops now supports it, following PR KhiopsML/khiops#241
which fixes issue KhiopsML/khiops#235.
Thus, mpiexec can be /bin/mpiexec through symlinks

related_to #209
Otherwise, in some situations, we get to /bin/mpiexec (which is
a symlink to the real path) which makes OpenMPI fail; see:
open-mpi/ompi#5613

closes #209
@popescu-v popescu-v force-pushed the 209-openmpi-path-issue-with-khiops-1022-on-debian-like-os branch from 7c14b07 to 33586e0 Compare July 31, 2024 17:08
@popescu-v popescu-v merged commit e86f0ee into dev Jul 31, 2024
20 checks passed
@popescu-v popescu-v deleted the 209-openmpi-path-issue-with-khiops-1022-on-debian-like-os branch July 31, 2024 17:31
Development

Successfully merging this pull request may close these issues.

OpenMPI path issue with khiops 10.2.2 on Debian-like OS
2 participants