Extend the features supported by get_cache_execute_command() #783

jan-janssen · 2025-08-19T10:11:10Z

Summary by CodeRabbit

New Features
- Added two per-task execution options: "exclusive" and "OpenMPI oversubscribe" which map to Slurm's --exact and --oversubscribe when used with the Slurm backend; existing behavior for other backends and PMI handling is unchanged.
- These options are rejected for Flux and will raise a clear error if provided.
Documentation
- Parameter docs updated to describe the new options and defaults.
Tests
- Added tests verifying Slurm flag placement and that Flux rejects the new options.

coderabbitai · 2025-08-19T10:11:18Z

Caution

Review failed

The pull request is closed.

Walkthrough

Adds two boolean options (exclusive, openmpi_oversubscribe) to get_cache_execute_command in executorlib/standalone/command.py, updates the docstring, includes --exact and --oversubscribe in the srun command when set, and raises ValueError for these options with the Flux backend. Task scheduler components propagate and default these flags from resource dicts. Tests updated to exercise valid Slurm usage and invalid Flux usage.

Changes

Cohort / File(s)	Summary
Standalone command execution `executorlib/standalone/command.py`	Extended `get_cache_execute_command` signature with `exclusive: bool = False` and `openmpi_oversubscribe: bool = False`; updated docstring; Slurm path appends `--oversubscribe` when `openmpi_oversubscribe` and `--exact` when `exclusive`; Flux path rejects these options with `ValueError`; pmi_mode and other paths unchanged.
File-task execution wiring `executorlib/task_scheduler/file/shared.py`	Passes `exclusive` and `openmpi_oversubscribe` from per-task `task_resource_dict` (default False) into `get_cache_execute_command` when constructing per-task execute calls.
Scheduler defaults `executorlib/task_scheduler/file/task_scheduler.py`	Adds default resource_dict keys `exclusive: False` and `openmpi_oversubscribe: False` in `FileTaskScheduler.__init__`, merging them into provided resource dicts so flags are available by default.
Tests for standalone command `tests/test_standalone_command.py`	Added tests exercising `exclusive` and `openmpi_oversubscribe` for Slurm (verifies `--oversubscribe` and `--exact` ordering) and negative tests asserting `ValueError` when those flags are passed with `backend="flux"`; retains unknown-backend ValueError test.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller
  participant Cmd as get_cache_execute_command
  participant Scheduler as FileTaskScheduler / shared

  Caller->>Scheduler: submit task (resource_dict may include flags)
  Scheduler->>Cmd: file_name, cores, backend, exclusive?, openmpi_oversubscribe?, pmi_mode?
  
  alt backend == "slurm"
    note right of Cmd #e6f9ea: Build srun command\n(include --oversubscribe / --exact when True)
    Cmd-->>Scheduler: ["srun","-n",...,"--mpi=...","--oversubscribe"?, "--exact"?, sys.executable, cache_parallel.py, file_name]
  else backend == "flux"
    note right of Cmd #fff0f0: Validate options\n(reject exclusive/openmpi_oversubscribe)
    Cmd-->>Scheduler: raise ValueError
  else
    note right of Cmd #f0f6ff: Existing paths\n(mpiexec / mpi4py / default)
    Cmd-->>Scheduler: Construct command (pmi_mode optional)
  end

  Scheduler-->>Caller: execute or propagate error

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Extend the features supported by get_cache_execute_command() #783 — Directly related: modifies get_cache_execute_command adding exclusive and openmpi_oversubscribe, updates Slurm/Flux handling and tests.
Refactor get_cache_execute_command() and get_interative_execute_command() #746 — Related: earlier refactor/introduction of get_cache_execute_command that this PR further modifies.

Poem

I’m a rabbit by the scheduler gate,
I tuck two flags in their Slurm-bound crate.
Flux says “no,” so I hop away light,
While srun accepts them into the night.
Little hops, big runs — options just right 🐇

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 88a5fd4 and 99f8eec.

📒 Files selected for processing (2)

executorlib/task_scheduler/file/shared.py (1 hunks)
executorlib/task_scheduler/file/task_scheduler.py (1 hunks)

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch cache_command

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

Visit our Status Page to check the current availability of CodeRabbit.
Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

for more information, see https://pre-commit.ci

codecov · 2025-08-19T10:15:10Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.67%. Comparing base (83653a3) to head (99f8eec).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #783      +/-   ##
==========================================
+ Coverage   97.66%   97.67%   +0.01%     
==========================================
  Files          33       33              
  Lines        1455     1463       +8     
==========================================
+ Hits         1421     1429       +8     
  Misses         34       34

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (2)

executorlib/standalone/command.py (2)
54-55: “OpenMPI oversubscribe” via srun --oversubscribe is likely not what you intend

openmpi_oversubscribe suggests the Open MPI runtime flag. srun --oversubscribe is a Slurm scheduling option (node sharing), not the Open MPI oversubscription behavior. If you intend Open MPI oversubscription, prefer exporting OMPI_MCA_rmaps_base_oversubscribe=1 for the srun path. Also consider honoring this flag for the mpiexec path (Open MPI’s --oversubscribe).

Option A (recommended for Slurm/Open MPI): set the MCA var for Open MPI:
-            if openmpi_oversubscribe:
-                command_prepend += ["--oversubscribe"]
+            if openmpi_oversubscribe:
+                # Ensure Open MPI allows oversubscription under Slurm
+                command_prepend += ["--export=ALL,OMPI_MCA_rmaps_base_oversubscribe=1"]
Option B (if you truly want Slurm node sharing semantics): rename the flag or add a separate flag (e.g., slurm_allow_node_sharing) to avoid semantic confusion.

Additionally, for the non-Slurm, mpiexec-based path, consider appending --oversubscribe when openmpi_oversubscribe=True and Open MPI is detected.

To verify whether --oversubscribe is an Open MPI vs. Slurm option across your clusters and whether OMPI_MCA_rmaps_base_oversubscribe is honored, please confirm with your scheduler docs or cluster modules.

67-74: Flux path: error handling OK; consider documenting the limitation and future-proofing

Raising ValueError for exclusive and openmpi_oversubscribe under Flux is consistent, but please ensure the docstring reflects these limitations (see earlier comment). If Flux acquires equivalents (e.g., a way to request exclusive resources), a future extension point could be to map the flags instead of rejecting them.

I can add a short section to the README/docs explaining the backend-specific support matrix for these flags if helpful.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 83653a3 and d28b0ab.

📒 Files selected for processing (1)

executorlib/standalone/command.py (4 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)

GitHub Check: unittest_mpich (ubuntu-24.04-arm, 3.13)
GitHub Check: unittest_mpich (macos-latest, 3.13)
GitHub Check: unittest_openmpi (macos-latest, 3.13)
GitHub Check: unittest_openmpi (ubuntu-24.04-arm, 3.13)
GitHub Check: unittest_openmpi (ubuntu-22.04-arm, 3.13)
GitHub Check: unittest_mpich (ubuntu-22.04-arm, 3.13)
GitHub Check: unittest_flux_openmpi
GitHub Check: unittest_win
GitHub Check: unittest_slurm_mpich
GitHub Check: notebooks_integration
GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)

coderabbitai · 2025-08-19T10:16:46Z

executorlib/standalone/command.py

+    exclusive: bool = False,
+    openmpi_oversubscribe: bool = False,
    pmi_mode: Optional[str] = None,


💡 Verification agent

🧩 Analysis chain

Avoid breaking API: keep pmi_mode positional slot and make new flags keyword-only

Inserting exclusive and openmpi_oversubscribe before pmi_mode changes the positional argument order and will break callers that passed pmi_mode positionally. Preserve backward compatibility by keeping pmi_mode as the 4th parameter and making the new flags keyword-only.

Apply this diff to the function signature:

- exclusive: bool = False, - openmpi_oversubscribe: bool = False, - pmi_mode: Optional[str] = None, + pmi_mode: Optional[str] = None, + *, + exclusive: bool = False, + openmpi_oversubscribe: bool = False,

To locate potential breakages (calls with four positional args), run:

🏁 Script executed:

#!/bin/bash # Find calls using four positional args to get_cache_execute_command (likely broken by the reordering). ast-grep --pattern $'get_cache_execute_command($_, $_, $_, $_)'

Length of output: 915

Avoid breaking API: keep pmi_mode as the 4th positional argument and make new flags keyword-only

I ran an AST search—every existing call uses keyword arguments, so there’s no internal breakage. However, to preserve the public API for callers who may pass pmi_mode positionally, update the signature in executorlib/standalone/command.py as follows:

@@ def get_cache_execute_command( - exclusive: bool = False, - openmpi_oversubscribe: bool = False, - pmi_mode: Optional[str] = None, + pmi_mode: Optional[str] = None, + *, + exclusive: bool = False, + openmpi_oversubscribe: bool = False,

This keeps pmi_mode in its original (4th) slot and forces exclusive and openmpi_oversubscribe to be keyword-only.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

exclusive: bool = False,

openmpi_oversubscribe: bool = False,

pmi_mode: Optional[str] = None,

pmi_mode: Optional[str] = None,

*,

exclusive: bool = False,

openmpi_oversubscribe: bool = False,

🤖 Prompt for AI Agents

In executorlib/standalone/command.py around lines 24 to 26, the function signature currently lists exclusive, openmpi_oversubscribe, pmi_mode in that order; update it so pmi_mode remains the 4th positional parameter and make exclusive and openmpi_oversubscribe keyword-only: reorder the parameters so pmi_mode: Optional[str] = None appears before exclusive and openmpi_oversubscribe, and place a bare * before exclusive to force those two to be keyword-only, keeping types and defaults unchanged.

coderabbitai · 2025-08-19T10:16:46Z

executorlib/standalone/command.py

+        exclusive (bool): Whether to exclusively reserve the compute nodes, or allow sharing compute notes. Defaults to False.
+        openmpi_oversubscribe (bool, optional): Whether to oversubscribe the cores. Defaults to False.
+        pmi_mode (str): PMI interface to use (OpenMPI v5 requires pmix) default is None


⚠️ Potential issue

Docstring fixes: correct typo and reflect backend limitations

Typo: “compute notes” → “compute nodes”.

Please document that exclusive and openmpi_oversubscribe are not supported with the Flux backend (you raise ValueError for both).

Apply this diff:

- exclusive (bool): Whether to exclusively reserve the compute nodes, or allow sharing compute notes. Defaults to False. - openmpi_oversubscribe (bool, optional): Whether to oversubscribe the cores. Defaults to False. - pmi_mode (str): PMI interface to use (OpenMPI v5 requires pmix) default is None + pmi_mode (str): PMI interface to use (OpenMPI v5 requires pmix). Default is None. + exclusive (bool): Whether to exclusively reserve the compute nodes (no node sharing). Default: False. + Note: Not supported with the "flux" backend. + openmpi_oversubscribe (bool): Whether to oversubscribe the cores. Default: False. + Note: Not supported with the "flux" backend.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

exclusive (bool): Whether to exclusively reserve the compute nodes, or allow sharing compute notes. Defaults to False.

openmpi_oversubscribe (bool, optional): Whether to oversubscribe the cores. Defaults to False.

pmi_mode (str): PMI interface to use (OpenMPI v5 requires pmix) default is None

pmi_mode (str): PMI interface to use (OpenMPI v5 requires pmix). Default is None.

exclusive (bool): Whether to exclusively reserve the compute nodes (no node sharing). Default: False.

Note: Not supported with the "flux" backend.

openmpi_oversubscribe (bool): Whether to oversubscribe the cores. Default: False.

Note: Not supported with the "flux" backend.

🤖 Prompt for AI Agents

In executorlib/standalone/command.py around lines 35 to 37, fix the docstring typo "compute notes" to "compute nodes" and update the parameter descriptions to note backend limitations: state that both exclusive and openmpi_oversubscribe are not supported when using the Flux backend (they will raise ValueError), and keep the pmi_mode description as-is (default None). Ensure the updated docstring clearly mentions these Flux backend restrictions for exclusive and openmpi_oversubscribe.

coderabbitai · 2025-08-19T10:16:46Z

executorlib/standalone/command.py

+            if exclusive:
+                command_prepend += ["--exact"]


⚠️ Potential issue

Wrong Slurm flag for exclusive allocation — use --exclusive, not --exact

--exact controls task placement semantics, not node exclusivity. For exclusive node allocation with srun, the correct flag is --exclusive.

Apply this diff:

- if exclusive: - command_prepend += ["--exact"] + if exclusive: + command_prepend += ["--exclusive"]

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if exclusive:

command_prepend += ["--exact"]

if exclusive:

command_prepend += ["--exclusive"]

🤖 Prompt for AI Agents

In executorlib/standalone/command.py around lines 56-57, the code appends the wrong Slurm flag "--exact" when requesting an exclusive allocation; change the appended flag to "--exclusive" so that command_prepend uses the correct srun option for node exclusivity (i.e., replace "--exact" with "--exclusive" in the exclusive branch).

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

tests/test_standalone_command.py (2)

54-63: Reduce assertion noise: compare the fixed prefix as a list.

Index-by-index checks are verbose and harder to maintain. Assert the fixed prefix in one shot, then keep the path-based checks for the final elements.

-        self.assertEqual(output[0], "srun")
-        self.assertEqual(output[1], "-n")
-        self.assertEqual(output[2], str(2))
-        self.assertEqual(output[3], "--mpi=pmi2")
-        self.assertEqual(output[4], "--oversubscribe")
-        self.assertEqual(output[5], "--exact")
+        # Assert fixed prefix in one go (adjust "--exact" to "--exclusive" if that is the intended behavior)
+        self.assertEqual(
+            output[:6],
+            ["srun", "-n", str(2), "--mpi=pmi2", "--oversubscribe", "--exact"],
+        )
         self.assertEqual(output[6], sys.executable)
         self.assertEqual(output[7].split(os.sep)[-1], "cache_parallel.py")
         self.assertEqual(output[8], file_name)

91-94: Also assert exception messages for Flux to lock in UX and avoid silent regressions.

Validating the message makes the tests stricter and ensures users get actionable errors.

-        with self.assertRaises(ValueError):
-            get_cache_execute_command(cores=2, file_name=file_name, backend="flux", openmpi_oversubscribe=True)
-        with self.assertRaises(ValueError):
-            get_cache_execute_command(cores=2, file_name=file_name, backend="flux", exclusive=True)
+        with self.assertRaisesRegex(ValueError, r"openmpi_oversubscribe.*flux backend"):
+            get_cache_execute_command(cores=2, file_name=file_name, backend="flux", openmpi_oversubscribe=True)
+        with self.assertRaisesRegex(ValueError, r"exclusive.*flux backend"):
+            get_cache_execute_command(cores=2, file_name=file_name, backend="flux", exclusive=True)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d28b0ab and 88a5fd4.

📒 Files selected for processing (1)

tests/test_standalone_command.py (2 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (1)

tests/test_standalone_command.py (1)

executorlib/standalone/command.py (1)

get_cache_execute_command (20-89)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)

GitHub Check: unittest_old
GitHub Check: unittest_openmpi (ubuntu-latest, 3.12)
GitHub Check: unittest_flux_mpich
GitHub Check: unittest_openmpi (ubuntu-latest, 3.11)
GitHub Check: unittest_openmpi (ubuntu-24.04-arm, 3.13)
GitHub Check: unittest_openmpi (ubuntu-22.04-arm, 3.13)
GitHub Check: unittest_openmpi (ubuntu-latest, 3.13)
GitHub Check: unittest_openmpi (macos-latest, 3.13)
GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
GitHub Check: unittest_mpich (ubuntu-latest, 3.13)
GitHub Check: unittest_slurm_mpich
GitHub Check: unittest_win
GitHub Check: unittest_flux_openmpi
GitHub Check: notebooks_integration

🔇 Additional comments (1)

tests/test_standalone_command.py (1)

54-63: No backward-incompatible positional calls found
An AST-based scan of the entire repo confirmed there are no calls to get_cache_execute_command with more than two positional arguments. All existing usages rely on keyword arguments for newer parameters—no further action needed.

coderabbitai · 2025-08-19T10:27:34Z

tests/test_standalone_command.py

+        output = get_cache_execute_command(cores=2, file_name=file_name, backend="slurm", pmi_mode="pmi2", openmpi_oversubscribe=True, exclusive=True)
        self.assertEqual(output[0], "srun")
        self.assertEqual(output[1], "-n")
        self.assertEqual(output[2], str(2))
        self.assertEqual(output[3], "--mpi=pmi2")
-        self.assertEqual(output[4], sys.executable)
-        self.assertEqual(output[5].split(os.sep)[-1], "cache_parallel.py")
-        self.assertEqual(output[6], file_name)
+        self.assertEqual(output[4], "--oversubscribe")
+        self.assertEqual(output[5], "--exact")
+        self.assertEqual(output[6], sys.executable)
+        self.assertEqual(output[7].split(os.sep)[-1], "cache_parallel.py")
+        self.assertEqual(output[8], file_name)


💡 Verification agent

🧩 Analysis chain

Confirm intent: exclusive currently asserts --exact, which likely should be Slurm's --exclusive.

Docstring semantics (“exclusively reserve the compute nodes”) map to Slurm’s --exclusive, not --exact. --exact has different semantics and does not enforce exclusive node allocation. If the intent is exclusive node allocation, both the implementation and the test should use --exclusive. If the intent is “exact task placement,” then the parameter name and docstring should be updated accordingly to avoid confusion.

Apply this test-only diff if exclusive nodes are intended:

- self.assertEqual(output[5], "--exact") + self.assertEqual(output[5], "--exclusive")

And update the Slurm path in executorlib/standalone/command.py accordingly:

# Replace the --exact flag with --exclusive for 'exclusive=True' if exclusive: command_prepend += ["--exclusive"]

To verify the correct Slurm flag for exclusive node allocation:

🌐 Web query:

In Slurm's srun, which option enforces exclusive node allocation: --exclusive or --exact? Also, what is the meaning of --exact?

💡 Result:

The correct option in Slurm's srun to enforce exclusive node allocation is --exclusive. The --exact option does something notably different.

Option purposes:

--exclusive (for srun, sbatch, or salloc): This option requests that the allocation consist of whole, exclusive nodes for the job or job step, meaning no other jobs can run on the same nodes during the allocation period—even if your job doesn't use all CPUs or resources available on those nodes[1][4]. This ensures your job has access to all resources of the assigned nodes and prevents sharing them with other users/jobs during your job's run.

--exact (for srun only): This option restricts each job step to exactly the amount of resources requested, allocating only the cores specified per job step[2][9][10]. Without --exact, job steps may get as many resources as are still available in the job allocation, which may result in one step grabbing all resources and later steps being blocked[2][10]. With --exact, each step receives precisely the number of CPUs/tasks you request, no more, no less; this is particularly used for fine control when launching multiple simultaneous steps within a single allocation.

Summary Table

Option Enforces exclusive node allocation? Purpose/Effect

--exclusive Yes Allocates whole nodes exclusively to your job; no sharing with other jobs.

--exact No Ensures each srun step gets exactly the requested resources (e.g., 1 CPU per step), no more.

References for option meanings:

[--exclusive documentation][1][4]

[--exact explanation][2][9][10]

In summary:
Use --exclusive for exclusive node allocations. The --exact option is used for controlling resource distribution for job steps within an already-allocated job, not for exclusive node allocation.

Citations:

1: https://slurm.schedmd.com/cons_tres.html

2: https://pgi-jcns.fz-juelich.de/portal/pages/slurm.html

3: https://cvw.cac.cornell.edu/slurm/basics/allocation

4: https://pawsey.atlassian.net/wiki/display/US/SLURM+upgrade+to+version+22.05.8

5: https://portal.cac.cornell.edu/techdocs/clusterinfo/slurm/

6: https://manpages.ubuntu.com/manpages/jammy/man1/srun.1.html

7: https://www.docs.arc.vt.edu/usage/more-slurm.html

8: https://docs.lxp.lu/first-steps/handling_jobs/

9: https://slurm.schedmd.com/srun.html

10: https://groups.oist.jp/scs/advanced-slurm

Replace --exact with Slurm’s --exclusive for true exclusive node allocation

Slurm’s --exclusive enforces whole-node exclusivity; --exact only limits per-step resource counts. To align intent and documentation, update both test and implementation:

• tests/test_standalone_command.py (around line 60):

- self.assertEqual(output[5], "--exact") + self.assertEqual(output[5], "--exclusive")

• executorlib/standalone/command.py (where exclusive=True is handled):

- if exclusive: - command_prepend += ["--exact"] + if exclusive: + command_prepend += ["--exclusive"]

🤖 Prompt for AI Agents

In tests/test_standalone_command.py around lines 54 to 63 and executorlib/standalone/command.py where exclusive=True is handled, the test and implementation currently add Slurm’s --exact but should use --exclusive to enforce whole-node exclusivity; update the implementation to emit "--exclusive" instead of "--exact" when exclusive=True and update the test expectation to assert "--exclusive" at the same index (replace the assertion for "--exact" with "--exclusive"), keeping all other arguments and ordering unchanged.

for more information, see https://pre-commit.ci

Extend the features supported by get_cache_execute_command()

8f761fc

[pre-commit.ci] auto fixes from pre-commit.com hooks

d28b0ab

for more information, see https://pre-commit.ci

coderabbitai bot reviewed Aug 19, 2025

View reviewed changes

extend test coverage

88a5fd4

coderabbitai bot reviewed Aug 19, 2025

View reviewed changes

jan-janssen and others added 2 commits August 19, 2025 16:50

use the new functionality

a36c238

[pre-commit.ci] auto fixes from pre-commit.com hooks

99f8eec

for more information, see https://pre-commit.ci

jan-janssen merged commit b3cf4c8 into main Aug 19, 2025
34 checks passed

jan-janssen deleted the cache_command branch August 19, 2025 15:06

-        exclusive (bool): Whether to exclusively reserve the compute nodes, or allow sharing compute notes. Defaults to False.
-        openmpi_oversubscribe (bool, optional): Whether to oversubscribe the cores. Defaults to False.
-        pmi_mode (str): PMI interface to use (OpenMPI v5 requires pmix) default is None
+        pmi_mode (str): PMI interface to use (OpenMPI v5 requires pmix). Default is None.
+        exclusive (bool): Whether to exclusively reserve the compute nodes (no node sharing). Default: False.
+                          Note: Not supported with the "flux" backend.
+        openmpi_oversubscribe (bool): Whether to oversubscribe the cores. Default: False.
+                                      Note: Not supported with the "flux" backend.

Option	Enforces exclusive node allocation?	Purpose/Effect
`--exclusive`	Yes	Allocates whole nodes exclusively to your job; no sharing with other jobs.
`--exact`	No	Ensures each `srun` step gets exactly the requested resources (e.g., 1 CPU per step), no more.

Extend the features supported by get_cache_execute_command() #783

Extend the features supported by get_cache_execute_command() #783

Uh oh!

Conversation

jan-janssen commented Aug 19, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Aug 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Status, Documentation and Community

Uh oh!

codecov bot commented Aug 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 19, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 19, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 19, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 19, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jan-janssen commented Aug 19, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Aug 19, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

codecov bot commented Aug 19, 2025 •

edited

Loading