**README.md** (41 additions, 25 deletions)
@@ -23,11 +23,12 @@ with the [ProcessPoolExecutor](https://docs.python.org/3/library/concurrent.futu

[ThreadPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor) for parallel
execution of Python functions on a single computer. executorlib extends this functionality to distribute Python
functions over multiple computers within a high performance computing (HPC) cluster. This can be achieved either by
submitting each function as an individual job to the HPC job scheduler - [HPC Submission Mode](https://executorlib.readthedocs.io/en/latest/2-hpc-submission.html) -
or by requesting a compute allocation of multiple nodes and then distributing the Python functions within this allocation -
[HPC Allocation Mode](https://executorlib.readthedocs.io/en/latest/3-hpc-allocation.html). Finally, to accelerate the
development process executorlib also provides a - [Local Mode](https://executorlib.readthedocs.io/en/latest/1-local.html) -
to use the executorlib functionality on a single workstation for testing. Starting with the [Local Mode](https://executorlib.readthedocs.io/en/latest/1-local.html),
which is selected by setting the backend parameter to local - `backend="local"`:

```python
from executorlib import Executor
```
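The hunk above ends right after the import, so the following is a minimal, hypothetical sketch of a complete Local Mode call; the `backend="local"` value comes from the text above, while the submitted function and its arguments are illustrative assumptions:

```python
from executorlib import Executor

# Hypothetical Local Mode sketch: submit a standard-library function and
# collect the result through the concurrent.futures-style Future interface.
with Executor(backend="local") as exe:
    future = exe.submit(sum, [1, 1])
    print(future.result())  # prints 2
```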
@@ -60,8 +61,7 @@ Python function. In addition to the compute cores `cores`, the resource dictionary

as `threads_per_core`, the GPUs per core as `gpus_per_core`, the working directory with `cwd`, the option to use the
OpenMPI oversubscribe feature with `openmpi_oversubscribe` and finally, for the [Simple Linux Utility for Resource
Management (SLURM)](https://slurm.schedmd.com) queuing system, the option to provide additional command line arguments
with the `slurm_cmd_args` parameter - [resource dictionary](https://executorlib.readthedocs.io/en/latest/trouble_shooting.html#resource-dictionary).

This flexibility to assign computing resources on a per-function-call basis simplifies the up-scaling of Python programs.
Only the parts of the Python program which benefit from parallel execution are implemented as MPI parallel Python
functions, while the rest of the program remains serial.
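As an illustration of the resource dictionary described above, the following hedged sketch passes a per-call `resource_dict` to `submit()`; the parameter names are the ones listed in the text, while the submitted function and the chosen values are made-up placeholders:

```python
from executorlib import Executor

def calc_mpi(i):
    # Placeholder for an MPI-parallel workload; when more than one core is
    # assigned, executorlib runs the function once per MPI rank.
    from mpi4py import MPI
    return i, MPI.COMM_WORLD.Get_rank()

with Executor(backend="local") as exe:
    # Per-call resources via the resource dictionary described above
    # (parameter names taken from the text; values are illustrative).
    future = exe.submit(calc_mpi, 3, resource_dict={"cores": 2, "threads_per_core": 1})
    print(future.result())
```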
@@ -87,7 +87,7 @@ with Executor(backend="slurm_submission") as exe:

In this case the [Python simple queuing system adapter (pysqa)](https://pysqa.readthedocs.io) is used to submit the
`calc()` function to the [SLURM](https://slurm.schedmd.com) job scheduler and request an allocation with two CPU cores
for the execution of the function - [HPC Submission Mode](https://executorlib.readthedocs.io/en/latest/2-hpc-submission.html). In the background the [sbatch](https://slurm.schedmd.com/sbatch.html)
command is used to request the allocation to execute the Python function.
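For readers who want to see the full pattern, here is a hedged sketch of such an HPC Submission Mode call; the `backend="slurm_submission"` value and the two-core request come from the surrounding text, whereas the `calc()` body and any pysqa/SLURM configuration on the cluster are assumptions:

```python
from executorlib import Executor

def calc(a, b):
    # Placeholder workload; in practice this would be the expensive function
    # shipped to the SLURM job scheduler via pysqa.
    return a + b

with Executor(backend="slurm_submission") as exe:
    # Each submit() call becomes an individual SLURM job requesting two CPU cores.
    future = exe.submit(calc, 1, 2, resource_dict={"cores": 2})
    print(future.result())  # prints 3 once the SLURM job has run
```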
Within a given [SLURM](https://slurm.schedmd.com) allocation executorlib can also be used to assign a subset of the
@@ -116,23 +116,39 @@ In addition, to support for [SLURM](https://slurm.schedmd.com) executorlib also

to address the needs of the upcoming generation of exascale computers. Still, even on traditional HPC clusters the
hierarchical approach of [flux](http://flux-framework.org) is beneficial to distribute hundreds of tasks within a
given allocation. Even when [SLURM](https://slurm.schedmd.com) is used as the primary job scheduler of your HPC, it is
recommended to use [SLURM with flux](https://executorlib.readthedocs.io/en/latest/3-hpc-allocation.html#slurm-with-flux)
as the hierarchical job scheduler within the allocations.
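To make the allocation-mode workflow concrete, here is a hedged sketch; the backend name `flux_allocation` is an assumption about the executorlib backend used inside a flux-managed allocation and should be checked against the HPC Allocation Mode documentation linked above:

```python
from executorlib import Executor

# Assumed sketch: run inside an existing allocation where flux is available and
# let executorlib distribute many small Python functions hierarchically.
with Executor(backend="flux_allocation") as exe:
    futures = [exe.submit(sum, [i, i]) for i in range(100)]
    print([f.result() for f in futures])
```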
**docs/installation.md** (17 additions, 16 deletions)
@@ -33,12 +33,13 @@ used. The mpi4py documentation covers the [installation of mpi4py](https://mpi4p

in more detail.

## Caching

While caching is an optional feature for [Local Mode](https://executorlib.readthedocs.io/en/latest/1-local.html) and
for the distribution of Python functions in a given allocation of an HPC job scheduler [HPC Allocation Mode](https://executorlib.readthedocs.io/en/latest/3-hpc-allocation.html),
it is required for the submission of individual functions to an HPC job scheduler [HPC Submission Mode](https://executorlib.readthedocs.io/en/latest/2-hpc-submission.html).
This is necessary because in [HPC Submission Mode](https://executorlib.readthedocs.io/en/latest/2-hpc-submission.html) the
Python function is stored on the file system until the requested computing resources become available. The caching is
implemented based on the hierarchical data format (HDF5). The corresponding [h5py](https://www.h5py.org) package can be
installed using either the [Python package manager](https://pypi.org/project/h5py/):

```
pip install executorlib[cache]
```
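The installed cache is then used through the executor itself; the following sketch assumes the `cache_directory` parameter documented for executorlib and is only meant to illustrate how the HDF5-backed caching is switched on:

```python
from executorlib import Executor

# Assumed sketch: results are serialized to HDF5 files in the given directory,
# so functions waiting for resources (or repeated calls) can be served from disk.
with Executor(backend="local", cache_directory="./executorlib_cache") as exe:
    future = exe.submit(sum, [2, 2])
    print(future.result())
```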
@@ -67,17 +68,17 @@ documentation covers the [installation of pysqa](https://pysqa.readthedocs.io/en

detail.

## HPC Allocation Mode

For optimal performance in [HPC Allocation Mode](https://executorlib.readthedocs.io/en/latest/3-hpc-allocation.html) the
[flux framework](https://flux-framework.org) is recommended as job scheduler. Even when the [Simple Linux Utility for Resource Management (SLURM)](https://slurm.schedmd.com)
or any other job scheduler is already installed on the HPC cluster, the [flux framework](https://flux-framework.org) can be
installed as a secondary job scheduler to leverage it for the distribution of resources within a given allocation of the
primary scheduler.

The [flux framework](https://flux-framework.org) uses `libhwloc` and `pmi` to understand the hardware it is running on
and to bootstrap MPI. `libhwloc` not only assigns CPU cores but also GPUs. This requires `libhwloc` to be compiled with
support for GPUs from your vendor. In the same way, the version of `pmi` for your queuing system has to be compatible
with the version installed via conda. As `pmi` is typically distributed with the implementation of the Message Passing
Interface (MPI), it is required to install a compatible MPI library in your conda environment as well.
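As a hedged example of such a conda environment, the command below installs the conda-forge builds of flux together with an MPI implementation; the exact package selection (for example `mpich` versus `openmpi`, or a vendor-specific build) depends on your cluster and is an assumption here:

```
conda install -c conda-forge flux-core flux-sched mpich
```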
### AMD GPUs with mpich / cray mpi

For example the [Frontier HPC](https://www.olcf.ornl.gov/frontier/) cluster at Oak Ridge National Laboratory uses