smsc/cma: Add a check for CAP_SYS_PTRACE between processes #10694
I found this when working in a local Docker container launching a couple of processes.

Without this commit:

shell$ docker run --rm --user 998:995 my-ompi-image mpirun -np 2 --mca smsc_base_verbose 1 /opt/hpc/local/nas/bin/lu.W.x
NAS Parallel Benchmarks 3.4 -- LU Benchmark
Size: 33x 33x 33 (class W)
Iterations: 300
Total number of processes: 2
lu.W.x: pml_ob1_sendreq.h:234: mca_pml_ob1_send_request_fini: Assertion `NULL == sendreq->rdma_frag' failed.
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
#0 0x7fffa16fcd8f in ???
#1 0x7fffa16fb657 in ???
#2 0x7fffa20304d7 in ???
#3 0x7fffa12ca3f8 in ???
#4 0x7fffa12a4913 in ???
#5 0x7fffa12bdbef in ???
#6 0x7fffa12bdc93 in ???
#7 0x7fffa1cb3d87 in mca_pml_ob1_send_request_fini
at /working/ompi/ompi/mca/pml/ob1/pml_ob1_sendreq.h:234
#8 0x7fffa1cb5b2f in mca_pml_ob1_send
at /working/ompi/ompi/mca/pml/ob1/pml_ob1_isend.c:335
#9 0x7fffa1a8bb13 in PMPI_Send
at /working/ompi/ompi/mpi/c/send.c:93
#10 0x7fffa1f27aeb in ompi_send_f
at /working/ompi/ompi/mpi/fortran/mpif-h/profile/psend_f.c:78
#11 0x1001191f in ???
#12 0x10007363 in ???
#13 0x10001a43 in ???
#14 0x100017b3 in ???
#15 0x7fffa12aa877 in ???
#16 0x7fffa12aaa63 in ???
#17 0xffffffffffffffff in ???
--------------------------------------------------------------------------
prterun noticed that process rank 0 with PID 0 on node c669c7922905 exited on signal 6 (Aborted).
--------------------------------------------------------------------------

With this commit, the same run reports the missing permissions and continues:

shell$ docker run --rm --user 998:995 my-ompi-image mpirun -np 2 --mca smsc_base_verbose 1 /opt/hpc/local/nas/bin/lu.W.x
[3303c3eb7c1b:00012] mca_smsc_cma_module_get_endpoint: can not proceed. processes do not have the necessary permissions (i.e., CAP_SYS_PTRACE). PID 12 <-> 11 (rc = -1) (errno: 1: Operation not permitted)
[3303c3eb7c1b:00011] mca_smsc_cma_module_get_endpoint: can not proceed. processes do not have the necessary permissions (i.e., CAP_SYS_PTRACE). PID 11 <-> 12 (rc = -1) (errno: 1: Operation not permitted)
NAS Parallel Benchmarks 3.4 -- LU Benchmark
Size: 33x 33x 33 (class W)
Iterations: 300
Total number of processes: 2
Time step 1
Time step 20
Time step 40
Time step 60
Time step 80
...

If we add the necessary capabilities then all is golden:

shell$ docker run --rm --cap-add SYS_PTRACE --user 998:995 my-ompi-image mpirun -np 2 --mca smsc_base_verbose 1 /opt/hpc/local/nas/bin/lu.W.x
NAS Parallel Benchmarks 3.4 -- LU Benchmark
Size: 33x 33x 33 (class W)
Iterations: 300
Total number of processes: 2
Time step 1
Time step 20
Time step 40
Time step 60
Time step 80
...
I'm going to add a configure test for the
db648af to d38ddad
Ok. It's ready for review now.
* If you run in an environment (e.g., container) where the `CAP_SYS_PTRACE` capability is not provided then the two processes will not be able to use `process_vm_readv`/`process_vm_writev` even if all of the other checks currently in the code pass. The result is errors when calling either of these two functions, which are difficult for the caller (i.e., `btl/sm`) to recover from.
* Use the `kcmp` system call as a proxy for the `process_vm_readv`/`process_vm_writev` functions. `kcmp` is a lightweight check in the kernel and is sufficient to detect if the two processes have the necessary capabilities.
* Refs
  * Capabilities : https://man7.org/linux/man-pages/man7/capabilities.7.html
  * `kcmp` : https://man7.org/linux/man-pages/man2/kcmp.2.html

Signed-off-by: Joshua Hursey <[email protected]>
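
For illustration, the `kcmp`-as-proxy idea from the commit message can be sketched roughly as follows. This is a minimal sketch and not the code from this PR: the helper name `cma_peer_accessible`, its return convention, and the choice of `KCMP_VM` as the comparison type are all assumptions made for the example. It relies only on the documented behavior that `kcmp` fails with `EPERM` when the caller lacks permission to inspect one of the processes.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/kcmp.h>   /* KCMP_VM */
#include <sys/syscall.h>  /* SYS_kcmp */
#include <sys/types.h>    /* pid_t */
#include <unistd.h>       /* syscall */

/*
 * Hypothetical helper (name and return codes are assumptions, not Open MPI API).
 * Returns  1 if the kernel performed the comparison, i.e. the access check passed,
 *          0 if the call was denied with EPERM (e.g., CAP_SYS_PTRACE is missing),
 *         -1 if the check could not be performed for another reason
 *            (e.g., ENOSYS on a kernel without kcmp, ESRCH for an unknown PID).
 */
static int cma_peer_accessible(pid_t self, pid_t peer)
{
    /* Invoke kcmp through syscall(2), as the kcmp(2) man page does.
     * KCMP_VM ignores the two trailing index arguments, so pass 0. */
    long rc = syscall(SYS_kcmp, self, peer, KCMP_VM, 0UL, 0UL);

    if (rc >= 0) {
        /* 0..3 are ordering results; any of them means access was allowed. */
        return 1;
    }
    return (EPERM == errno) ? 0 : -1;
}
```

A caller could pass `getpid()` and the peer's PID during endpoint setup and avoid the `process_vm_readv`/`process_vm_writev` path whenever the result is not 1, rather than discovering the failure in the middle of a transfer.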
fc86bf2 to 56132aa