@jan-janssen
Member

@jan-janssen jan-janssen commented Aug 19, 2025

Summary by CodeRabbit

  • New Features

    • Added two per-task execution options: "exclusive" and "OpenMPI oversubscribe" which map to Slurm's --exact and --oversubscribe when used with the Slurm backend; existing behavior for other backends and PMI handling is unchanged.
    • These options are rejected for Flux and will raise a clear error if provided.
  • Documentation

    • Parameter docs updated to describe the new options and defaults.
  • Tests

    • Added tests verifying Slurm flag placement and that Flux rejects the new options.

@coderabbitai
Contributor

coderabbitai bot commented Aug 19, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

Adds two boolean options (exclusive, openmpi_oversubscribe) to get_cache_execute_command in executorlib/standalone/command.py, updates the docstring, includes --exact and --oversubscribe in the srun command when set, and raises ValueError for these options with the Flux backend. Task scheduler components propagate and default these flags from resource dicts. Tests updated to exercise valid Slurm usage and invalid Flux usage.
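As a rough illustration of the walkthrough above, the Slurm and Flux branches might look like the following. This is a simplified stand-in rather than the code in executorlib/standalone/command.py: the name `build_cache_command`, the bare `cache_parallel.py` placeholder, the error messages, and the fall-through return are assumptions made for this sketch. Only the flag order follows the sequence diagram.

```python
import sys
from typing import List, Optional


def build_cache_command(
    file_name: str,
    cores: int = 1,
    backend: Optional[str] = None,
    exclusive: bool = False,
    openmpi_oversubscribe: bool = False,
    pmi_mode: Optional[str] = None,
) -> List[str]:
    """Hypothetical, simplified stand-in for get_cache_execute_command."""
    script = "cache_parallel.py"  # placeholder; the real function resolves a full path
    if backend == "slurm":
        command = ["srun", "-n", str(cores)]
        if pmi_mode is not None:
            command.append("--mpi=" + pmi_mode)
        if openmpi_oversubscribe:
            command.append("--oversubscribe")
        if exclusive:
            command.append("--exact")  # the flag used by this PR
        return command + [sys.executable, script, file_name]
    if backend == "flux":
        # Both new options are rejected for Flux (messages are assumed).
        if openmpi_oversubscribe:
            raise ValueError("openmpi_oversubscribe is not supported with the flux backend.")
        if exclusive:
            raise ValueError("exclusive is not supported with the flux backend.")
    # All other paths are left out of this sketch; return a bare placeholder.
    return [sys.executable, script, file_name]


slurm_cmd = build_cache_command(
    "task.h5", cores=2, backend="slurm", pmi_mode="pmi2",
    openmpi_oversubscribe=True, exclusive=True,
)
print(slurm_cmd[:6])
```

The two flags are appended after `--mpi=...` and before the interpreter, which is the ordering the new tests check.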

Changes

Cohort / File(s) | Summary
Standalone command execution (executorlib/standalone/command.py) | Extended get_cache_execute_command signature with exclusive: bool = False and openmpi_oversubscribe: bool = False; updated docstring; Slurm path appends --oversubscribe when openmpi_oversubscribe and --exact when exclusive; Flux path rejects these options with ValueError; pmi_mode and other paths unchanged.
File-task execution wiring (executorlib/task_scheduler/file/shared.py) | Passes exclusive and openmpi_oversubscribe from per-task task_resource_dict (default False) into get_cache_execute_command when constructing per-task execute calls.
Scheduler defaults (executorlib/task_scheduler/file/task_scheduler.py) | Adds default resource_dict keys exclusive: False and openmpi_oversubscribe: False in FileTaskScheduler.__init__, merging them into provided resource dicts so flags are available by default.
Tests for standalone command (tests/test_standalone_command.py) | Added tests exercising exclusive and openmpi_oversubscribe for Slurm (verifies --oversubscribe and --exact ordering) and negative tests asserting ValueError when those flags are passed with backend="flux"; retains unknown-backend ValueError test.
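
The default-merging and per-task lookup summarized in the table could be sketched like this. The helper names `with_default_flags` and `flags_for_task` are hypothetical and do not appear in executorlib; the sketch only assumes that user-supplied keys take precedence over the defaults.

```python
# Hypothetical defaults mirroring the two keys added by this PR.
DEFAULT_FLAGS = {"exclusive": False, "openmpi_oversubscribe": False}


def with_default_flags(resource_dict=None):
    """Return a copy of resource_dict with missing flag keys filled in."""
    merged = dict(DEFAULT_FLAGS)
    merged.update(resource_dict or {})  # user-provided values win
    return merged


def flags_for_task(task_resource_dict):
    """Read the two flags for one task, falling back to False."""
    return {
        "exclusive": task_resource_dict.get("exclusive", False),
        "openmpi_oversubscribe": task_resource_dict.get("openmpi_oversubscribe", False),
    }


merged = with_default_flags({"cores": 2, "exclusive": True})
print(merged)
print(flags_for_task({"cores": 2}))
```

Copying the defaults before updating keeps the shared `DEFAULT_FLAGS` dictionary from being mutated between tasks.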

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller
  participant Cmd as get_cache_execute_command
  participant Scheduler as FileTaskScheduler / shared

  Caller->>Scheduler: submit task (resource_dict may include flags)
  Scheduler->>Cmd: file_name, cores, backend, exclusive?, openmpi_oversubscribe?, pmi_mode?
  
  alt backend == "slurm"
    note right of Cmd #e6f9ea: Build srun command\n(include --oversubscribe / --exact when True)
    Cmd-->>Scheduler: ["srun","-n",...,"--mpi=...","--oversubscribe"?, "--exact"?, sys.executable, cache_parallel.py, file_name]
  else backend == "flux"
    note right of Cmd #fff0f0: Validate options\n(reject exclusive/openmpi_oversubscribe)
    Cmd-->>Scheduler: raise ValueError
  else
    note right of Cmd #f0f6ff: Existing paths\n(mpiexec / mpi4py / default)
    Cmd-->>Scheduler: Construct command (pmi_mode optional)
  end

  Scheduler-->>Caller: execute or propagate error

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I’m a rabbit by the scheduler gate,
I tuck two flags in their Slurm-bound crate.
Flux says “no,” so I hop away light,
While srun accepts them into the night.
Little hops, big runs — options just right 🐇

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 88a5fd4 and 99f8eec.

📒 Files selected for processing (2)
  • executorlib/task_scheduler/file/shared.py (1 hunks)
  • executorlib/task_scheduler/file/task_scheduler.py (1 hunks)
@codecov

codecov bot commented Aug 19, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.67%. Comparing base (83653a3) to head (99f8eec).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #783      +/-   ##
==========================================
+ Coverage   97.66%   97.67%   +0.01%     
==========================================
  Files          33       33              
  Lines        1455     1463       +8     
==========================================
+ Hits         1421     1429       +8     
  Misses         34       34              

☔ View full report in Codecov by Sentry.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (2)
executorlib/standalone/command.py (2)

54-55: “OpenMPI oversubscribe” via srun --oversubscribe is likely not what you intend

openmpi_oversubscribe suggests the Open MPI runtime flag. srun --oversubscribe is a Slurm scheduling option (node sharing), not the Open MPI oversubscription behavior. If you intend Open MPI oversubscription, prefer exporting OMPI_MCA_rmaps_base_oversubscribe=1 for the srun path. Also consider honoring this flag for the mpiexec path (Open MPI’s --oversubscribe).

Option A (recommended for Slurm/Open MPI): set the MCA var for Open MPI:

-            if openmpi_oversubscribe:
-                command_prepend += ["--oversubscribe"]
+            if openmpi_oversubscribe:
+                # Ensure Open MPI allows oversubscription under Slurm
+                command_prepend += ["--export=ALL,OMPI_MCA_rmaps_base_oversubscribe=1"]

Option B (if you truly want Slurm node sharing semantics): rename the flag or add a separate flag (e.g., slurm_allow_node_sharing) to avoid semantic confusion.

Additionally, for the non-Slurm, mpiexec-based path, consider appending --oversubscribe when openmpi_oversubscribe=True and Open MPI is detected.

To verify whether --oversubscribe is an Open MPI vs. Slurm option across your clusters and whether OMPI_MCA_rmaps_base_oversubscribe is honored, please confirm with your scheduler docs or cluster modules.
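
The two options above differ only in which argument is added to the srun command line. A minimal sketch, assuming a hypothetical helper `slurm_oversubscribe_args` and a hypothetical `use_mca_export` switch (neither exists in executorlib); whether Open MPI honors the exported variable on a given cluster is not verified here.

```python
from typing import List


def slurm_oversubscribe_args(use_mca_export: bool) -> List[str]:
    """Return the extra srun arguments for an oversubscribe request.

    use_mca_export=True follows Option A from the review (export an
    Open MPI MCA variable); False keeps the flag used in this PR.
    """
    if use_mca_export:
        return ["--export=ALL,OMPI_MCA_rmaps_base_oversubscribe=1"]
    return ["--oversubscribe"]


command = ["srun", "-n", "2"] + slurm_oversubscribe_args(use_mca_export=True)
print(command)
```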


67-74: Flux path: error handling OK; consider documenting the limitation and future-proofing

Raising ValueError for exclusive and openmpi_oversubscribe under Flux is consistent, but please ensure the docstring reflects these limitations (see earlier comment). If Flux acquires equivalents (e.g., a way to request exclusive resources), a future extension point could be to map the flags instead of rejecting them.

I can add a short section to the README/docs explaining the backend-specific support matrix for these flags if helpful.
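
One way such a support matrix could be expressed in code is a lookup table that drives the validation, so that a future Flux mapping only requires a table change. This is a sketch under assumptions: `SUPPORTED_FLAGS`, `validate_flags`, and the error message are invented for illustration, and only the Slurm and Flux rows are taken from this PR.

```python
# Hypothetical support matrix: which per-task flags each backend accepts.
SUPPORTED_FLAGS = {
    "slurm": {"exclusive", "openmpi_oversubscribe"},
    "flux": set(),  # both flags are rejected for Flux in this PR
}


def validate_flags(backend, **flags):
    """Raise ValueError if an enabled flag is not supported by the backend."""
    supported = SUPPORTED_FLAGS.get(backend)
    if supported is None:
        return  # backends outside the matrix are not checked in this sketch
    for name, enabled in flags.items():
        if enabled and name not in supported:
            raise ValueError(f"{name} is not supported with the {backend} backend.")


validate_flags("slurm", exclusive=True, openmpi_oversubscribe=True)  # passes
validate_flags("flux", exclusive=False, openmpi_oversubscribe=False)  # passes
```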

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 83653a3 and d28b0ab.

📒 Files selected for processing (1)
  • executorlib/standalone/command.py (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: unittest_mpich (ubuntu-24.04-arm, 3.13)
  • GitHub Check: unittest_mpich (macos-latest, 3.13)
  • GitHub Check: unittest_openmpi (macos-latest, 3.13)
  • GitHub Check: unittest_openmpi (ubuntu-24.04-arm, 3.13)
  • GitHub Check: unittest_openmpi (ubuntu-22.04-arm, 3.13)
  • GitHub Check: unittest_mpich (ubuntu-22.04-arm, 3.13)
  • GitHub Check: unittest_flux_openmpi
  • GitHub Check: unittest_win
  • GitHub Check: unittest_slurm_mpich
  • GitHub Check: notebooks_integration
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)

Comment on lines +24 to 26
exclusive: bool = False,
openmpi_oversubscribe: bool = False,
pmi_mode: Optional[str] = None,
💡 Verification agent

🧩 Analysis chain

Avoid breaking API: keep pmi_mode positional slot and make new flags keyword-only

Inserting exclusive and openmpi_oversubscribe before pmi_mode changes the positional argument order and will break callers that passed pmi_mode positionally. Preserve backward compatibility by keeping pmi_mode as the 4th parameter and making the new flags keyword-only.

Apply this diff to the function signature:

-    exclusive: bool = False,
-    openmpi_oversubscribe: bool = False,
-    pmi_mode: Optional[str] = None,
+    pmi_mode: Optional[str] = None,
+    *,
+    exclusive: bool = False,
+    openmpi_oversubscribe: bool = False,

To locate potential breakages (calls with four positional args), run:


🏁 Script executed:

#!/bin/bash
# Find calls using four positional args to get_cache_execute_command (likely broken by the reordering).
ast-grep --pattern $'get_cache_execute_command($_, $_, $_, $_)'

Length of output: 915


Avoid breaking API: keep pmi_mode as the 4th positional argument and make new flags keyword-only

I ran an AST search—every existing call uses keyword arguments, so there’s no internal breakage. However, to preserve the public API for callers who may pass pmi_mode positionally, update the signature in executorlib/standalone/command.py as follows:

@@ def get_cache_execute_command(
-    exclusive: bool = False,
-    openmpi_oversubscribe: bool = False,
-    pmi_mode: Optional[str] = None,
+    pmi_mode: Optional[str] = None,
+    *,
+    exclusive: bool = False,
+    openmpi_oversubscribe: bool = False,

This keeps pmi_mode in its original (4th) slot and forces exclusive and openmpi_oversubscribe to be keyword-only.

🤖 Prompt for AI Agents
In executorlib/standalone/command.py around lines 24 to 26, the function
signature currently lists exclusive, openmpi_oversubscribe, pmi_mode in that
order; update it so pmi_mode remains the 4th positional parameter and make
exclusive and openmpi_oversubscribe keyword-only: reorder the parameters so
pmi_mode: Optional[str] = None appears before exclusive and
openmpi_oversubscribe, and place a bare * before exclusive to force those two to
be keyword-only, keeping types and defaults unchanged.
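
The effect of the bare `*` can be shown with a small stand-alone function. The name `describe_call` and its return value are hypothetical; only the parameter order matches the signature suggested above.

```python
from typing import Optional


def describe_call(
    file_name: str,
    cores: int = 1,
    backend: Optional[str] = None,
    pmi_mode: Optional[str] = None,
    *,  # everything after this marker must be passed by keyword
    exclusive: bool = False,
    openmpi_oversubscribe: bool = False,
) -> dict:
    """Hypothetical function that echoes its arguments."""
    return {
        "pmi_mode": pmi_mode,
        "exclusive": exclusive,
        "openmpi_oversubscribe": openmpi_oversubscribe,
    }


# pmi_mode can still be passed as the 4th positional argument.
ok = describe_call("task.h5", 2, "slurm", "pmi2", exclusive=True)
print(ok)

# A 5th positional argument is rejected instead of silently binding to exclusive.
try:
    describe_call("task.h5", 2, "slurm", "pmi2", True)
except TypeError as err:
    print("TypeError:", err)
```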

Comment on lines +35 to +37
exclusive (bool): Whether to exclusively reserve the compute nodes, or allow sharing compute notes. Defaults to False.
openmpi_oversubscribe (bool, optional): Whether to oversubscribe the cores. Defaults to False.
pmi_mode (str): PMI interface to use (OpenMPI v5 requires pmix) default is None
⚠️ Potential issue

Docstring fixes: correct typo and reflect backend limitations

  • Typo: “compute notes” → “compute nodes”.
  • Please document that exclusive and openmpi_oversubscribe are not supported with the Flux backend (you raise ValueError for both).

Apply this diff:

-        exclusive (bool): Whether to exclusively reserve the compute nodes, or allow sharing compute notes. Defaults to False.
-        openmpi_oversubscribe (bool, optional): Whether to oversubscribe the cores. Defaults to False.
-        pmi_mode (str): PMI interface to use (OpenMPI v5 requires pmix) default is None
+        pmi_mode (str): PMI interface to use (OpenMPI v5 requires pmix). Default is None.
+        exclusive (bool): Whether to exclusively reserve the compute nodes (no node sharing). Default: False.
+                          Note: Not supported with the "flux" backend.
+        openmpi_oversubscribe (bool): Whether to oversubscribe the cores. Default: False.
+                                      Note: Not supported with the "flux" backend.
🤖 Prompt for AI Agents
In executorlib/standalone/command.py around lines 35 to 37, fix the docstring
typo "compute notes" to "compute nodes" and update the parameter descriptions to
note backend limitations: state that both exclusive and openmpi_oversubscribe
are not supported when using the Flux backend (they will raise ValueError), and
keep the pmi_mode description as-is (default None). Ensure the updated docstring
clearly mentions these Flux backend restrictions for exclusive and
openmpi_oversubscribe.

Comment on lines +56 to +57
if exclusive:
command_prepend += ["--exact"]
⚠️ Potential issue

Wrong Slurm flag for exclusive allocation — use --exclusive, not --exact

--exact controls task placement semantics, not node exclusivity. For exclusive node allocation with srun, the correct flag is --exclusive.

Apply this diff:

-            if exclusive:
-                command_prepend += ["--exact"]
+            if exclusive:
+                command_prepend += ["--exclusive"]
🤖 Prompt for AI Agents
In executorlib/standalone/command.py around lines 56-57, the code appends the
wrong Slurm flag "--exact" when requesting an exclusive allocation; change the
appended flag to "--exclusive" so that command_prepend uses the correct srun
option for node exclusivity (i.e., replace "--exact" with "--exclusive" in the
exclusive branch).


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
tests/test_standalone_command.py (2)

54-63: Reduce assertion noise: compare the fixed prefix as a list.

Index-by-index checks are verbose and harder to maintain. Assert the fixed prefix in one shot, then keep the path-based checks for the final elements.

-        self.assertEqual(output[0], "srun")
-        self.assertEqual(output[1], "-n")
-        self.assertEqual(output[2], str(2))
-        self.assertEqual(output[3], "--mpi=pmi2")
-        self.assertEqual(output[4], "--oversubscribe")
-        self.assertEqual(output[5], "--exact")
+        # Assert fixed prefix in one go (adjust "--exact" to "--exclusive" if that is the intended behavior)
+        self.assertEqual(
+            output[:6],
+            ["srun", "-n", str(2), "--mpi=pmi2", "--oversubscribe", "--exact"],
+        )
         self.assertEqual(output[6], sys.executable)
         self.assertEqual(output[7].split(os.sep)[-1], "cache_parallel.py")
         self.assertEqual(output[8], file_name)
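
A self-contained version of the prefix assertion, runnable without executorlib, might look like this. `fake_slurm_command` is a hypothetical stand-in that simply returns the command shape the test expects; the directory in the script path is made up.

```python
import os
import sys
import unittest


def fake_slurm_command(file_name):
    # Hypothetical stand-in returning the command shape asserted in the PR's test.
    return [
        "srun", "-n", "2", "--mpi=pmi2", "--oversubscribe", "--exact",
        sys.executable, os.path.join("some", "dir", "cache_parallel.py"), file_name,
    ]


class TestCommandPrefix(unittest.TestCase):
    def test_fixed_prefix_as_list(self):
        output = fake_slurm_command("task.h5")
        # One list comparison replaces six index-by-index assertions.
        self.assertEqual(
            output[:6],
            ["srun", "-n", "2", "--mpi=pmi2", "--oversubscribe", "--exact"],
        )
        self.assertEqual(output[6], sys.executable)
        self.assertEqual(os.path.basename(output[7]), "cache_parallel.py")
        self.assertEqual(output[8], "task.h5")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestCommandPrefix)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

When the prefix does not match, unittest prints a diff of the two lists, which is usually easier to read than a single failing index.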

91-94: Also assert exception messages for Flux to lock in UX and avoid silent regressions.

Validating the message makes the tests stricter and ensures users get actionable errors.

-        with self.assertRaises(ValueError):
-            get_cache_execute_command(cores=2, file_name=file_name, backend="flux", openmpi_oversubscribe=True)
-        with self.assertRaises(ValueError):
-            get_cache_execute_command(cores=2, file_name=file_name, backend="flux", exclusive=True)
+        with self.assertRaisesRegex(ValueError, r"openmpi_oversubscribe.*flux backend"):
+            get_cache_execute_command(cores=2, file_name=file_name, backend="flux", openmpi_oversubscribe=True)
+        with self.assertRaisesRegex(ValueError, r"exclusive.*flux backend"):
+            get_cache_execute_command(cores=2, file_name=file_name, backend="flux", exclusive=True)
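
The message check can be tried in isolation as follows. `fake_flux_command` and its error messages are assumptions for this sketch; the regular expressions are the ones from the suggestion above, so the real messages in executorlib would need to match them for the stricter test to pass.

```python
import unittest


def fake_flux_command(exclusive=False, openmpi_oversubscribe=False):
    # Hypothetical stand-in for the Flux branch; the messages are assumed.
    if openmpi_oversubscribe:
        raise ValueError("The option openmpi_oversubscribe is not available with the flux backend.")
    if exclusive:
        raise ValueError("The option exclusive is not available with the flux backend.")
    return []  # command construction is left out of this sketch


case = unittest.TestCase()  # used only for its assertion helpers

with case.assertRaisesRegex(ValueError, r"openmpi_oversubscribe.*flux backend"):
    fake_flux_command(openmpi_oversubscribe=True)

with case.assertRaisesRegex(ValueError, r"exclusive.*flux backend"):
    fake_flux_command(exclusive=True)
```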
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between d28b0ab and 88a5fd4.

📒 Files selected for processing (1)
  • tests/test_standalone_command.py (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/test_standalone_command.py (1)
executorlib/standalone/command.py (1)
  • get_cache_execute_command (20-89)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: unittest_old
  • GitHub Check: unittest_openmpi (ubuntu-latest, 3.12)
  • GitHub Check: unittest_flux_mpich
  • GitHub Check: unittest_openmpi (ubuntu-latest, 3.11)
  • GitHub Check: unittest_openmpi (ubuntu-24.04-arm, 3.13)
  • GitHub Check: unittest_openmpi (ubuntu-22.04-arm, 3.13)
  • GitHub Check: unittest_openmpi (ubuntu-latest, 3.13)
  • GitHub Check: unittest_openmpi (macos-latest, 3.13)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
  • GitHub Check: unittest_mpich (ubuntu-latest, 3.13)
  • GitHub Check: unittest_slurm_mpich
  • GitHub Check: unittest_win
  • GitHub Check: unittest_flux_openmpi
  • GitHub Check: notebooks_integration
🔇 Additional comments (1)
tests/test_standalone_command.py (1)

54-63: No backward-incompatible positional calls found
An AST-based scan of the entire repo confirmed there are no calls to get_cache_execute_command with more than two positional arguments. All existing usages rely on keyword arguments for newer parameters—no further action needed.

Comment on lines +54 to +63
         output = get_cache_execute_command(cores=2, file_name=file_name, backend="slurm", pmi_mode="pmi2", openmpi_oversubscribe=True, exclusive=True)
         self.assertEqual(output[0], "srun")
         self.assertEqual(output[1], "-n")
         self.assertEqual(output[2], str(2))
         self.assertEqual(output[3], "--mpi=pmi2")
-        self.assertEqual(output[4], sys.executable)
-        self.assertEqual(output[5].split(os.sep)[-1], "cache_parallel.py")
-        self.assertEqual(output[6], file_name)
+        self.assertEqual(output[4], "--oversubscribe")
+        self.assertEqual(output[5], "--exact")
+        self.assertEqual(output[6], sys.executable)
+        self.assertEqual(output[7].split(os.sep)[-1], "cache_parallel.py")
+        self.assertEqual(output[8], file_name)
💡 Verification agent

🧩 Analysis chain

Confirm intent: exclusive currently asserts --exact, which likely should be Slurm's --exclusive.

Docstring semantics (“exclusively reserve the compute nodes”) map to Slurm’s --exclusive, not --exact. --exact has different semantics and does not enforce exclusive node allocation. If the intent is exclusive node allocation, both the implementation and the test should use --exclusive. If the intent is “exact task placement,” then the parameter name and docstring should be updated accordingly to avoid confusion.

Apply this test-only diff if exclusive nodes are intended:

-        self.assertEqual(output[5], "--exact")
+        self.assertEqual(output[5], "--exclusive")

And update the Slurm path in executorlib/standalone/command.py accordingly:

# Replace the --exact flag with --exclusive for 'exclusive=True'
if exclusive:
    command_prepend += ["--exclusive"]

To verify the correct Slurm flag for exclusive node allocation:


🌐 Web query:

In Slurm's srun, which option enforces exclusive node allocation: --exclusive or --exact? Also, what is the meaning of --exact?

💡 Result:

The correct option in Slurm's srun to enforce exclusive node allocation is --exclusive. The --exact option does something notably different.

Option purposes:

  • --exclusive (for srun, sbatch, or salloc): This option requests that the allocation consist of whole, exclusive nodes for the job or job step, meaning no other jobs can run on the same nodes during the allocation period—even if your job doesn't use all CPUs or resources available on those nodes[1][4]. This ensures your job has access to all resources of the assigned nodes and prevents sharing them with other users/jobs during your job's run.

  • --exact (for srun only): This option restricts each job step to exactly the amount of resources requested, allocating only the cores specified per job step[2][9][10]. Without --exact, job steps may get as many resources as are still available in the job allocation, which may result in one step grabbing all resources and later steps being blocked[2][10]. With --exact, each step receives precisely the number of CPUs/tasks you request, no more, no less; this is particularly used for fine control when launching multiple simultaneous steps within a single allocation.

Summary Table

Option | Enforces exclusive node allocation? | Purpose/Effect
--exclusive | Yes | Allocates whole nodes exclusively to your job; no sharing with other jobs.
--exact | No | Ensures each srun step gets exactly the requested resources (e.g., 1 CPU per step), no more.

References for option meanings:

  • [--exclusive documentation][1][4]
  • [--exact explanation][2][9][10]

In summary:
Use --exclusive for exclusive node allocations. The --exact option is used for controlling resource distribution for job steps within an already-allocated job, not for exclusive node allocation.
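
To make the job-step use case for --exact concrete, the following sketch builds one srun command per step inside an existing allocation. The helper `step_commands` and the script names are hypothetical, and nothing is launched; the sketch only constructs argument lists.

```python
import sys
from typing import List


def step_commands(scripts: List[str], cores_per_step: int = 1) -> List[List[str]]:
    """Build one srun command per job step, each limited by --exact."""
    return [
        ["srun", "--exact", "-n", str(cores_per_step), sys.executable, script]
        for script in scripts
    ]


for cmd in step_commands(["step_a.py", "step_b.py"]):
    print(" ".join(cmd))
```

Each command requests only its own task count, which is the situation described above where concurrent steps should not take all resources of the allocation.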

Replace --exact with Slurm’s --exclusive for true exclusive node allocation

Slurm’s --exclusive enforces whole-node exclusivity; --exact only limits per-step resource counts. To align intent and documentation, update both test and implementation:

• tests/test_standalone_command.py (around line 60):

-        self.assertEqual(output[5], "--exact")
+        self.assertEqual(output[5], "--exclusive")

• executorlib/standalone/command.py (where exclusive=True is handled):

-    if exclusive:
-        command_prepend += ["--exact"]
+    if exclusive:
+        command_prepend += ["--exclusive"]
🤖 Prompt for AI Agents
In tests/test_standalone_command.py around lines 54 to 63 and
executorlib/standalone/command.py where exclusive=True is handled, the test and
implementation currently add Slurm’s --exact but should use --exclusive to
enforce whole-node exclusivity; update the implementation to emit "--exclusive"
instead of "--exact" when exclusive=True and update the test expectation to
assert "--exclusive" at the same index (replace the assertion for "--exact" with
"--exclusive"), keeping all other arguments and ordering unchanged.

@jan-janssen jan-janssen merged commit b3cf4c8 into main Aug 19, 2025
34 checks passed
@jan-janssen jan-janssen deleted the cache_command branch August 19, 2025 15:06