[Bug] SlurmClusterExecutor resubmitting jobs that have an existing cache #688

@liamhuber

Description

I'm running the following notebook cell on MPIsusmat's CM cluster:

import os
import time

from executorlib import SlurmClusterExecutor

def foo(x):
    time.sleep(10)
    return x + 1

with SlurmClusterExecutor(
    cache_directory=os.path.join(os.getcwd(), "foo_dir"),
    resource_dict={
        "partition": "s.cmfe", 
        "cores": 1,
    }
) as exe:
    future = exe.submit(foo, 1)
    print(future.result())

Initial execution is fine: everything takes a few seconds. Thanks to the sleep call I have a good chance to run squeue | grep $USER and see my submitted job; the foo_dir directory gets created, I can see run_queue.sh and time.out, and I can watch foo..._i.h5 appear and disappear, leaving me nicely with foo..._o.h5.

If I re-execute the cell, everything looks fine from the perspective of my notebook: it prints my result (2) rather quickly. Certainly much more quickly than my sleep(10), so it is definitely using the cache.

But if I am watching my directory, I can again see foo..._i.h5 get written (with the same key), and I can run squeue | grep $USER and sacct -u $USER --start=2025-06-20 | wc -l to confirm that SLURM is re-running the job.

Since the cache is nicely leveraged in-notebook for my return value, I interpret this re-submission as a bug. The quick return implies to me that somewhere there is an effective if cache_hit() check, and the fix is probably as simple as giving some submit a safety valve like if cache_hit: return get_cache(); else: actually_submit(), but I skimmed through def submit( and couldn't see where myself.
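For illustration, the kind of safety valve I have in mind might look like the sketch below. Everything here is a hypothetical stand-in, not executorlib's actual internals: submit_with_cache_guard, fake_slurm_job, and the pickle-based cache files are invented names, and the real code would key the cache the same way it keys the ..._o.h5 files.

```python
import os
import pickle
import tempfile

def submit_with_cache_guard(cache_directory, key, run_job):
    """Hypothetical safety valve (not executorlib's real API): return the
    cached output for this key if it exists, else run the job and cache it."""
    out_file = os.path.join(cache_directory, f"{key}_o.pkl")
    if os.path.exists(out_file):      # cache hit: skip the SLURM submission
        with open(out_file, "rb") as f:
            return pickle.load(f)
    result = run_job()                # cache miss: actually submit
    os.makedirs(cache_directory, exist_ok=True)
    with open(out_file, "wb") as f:
        pickle.dump(result, f)
    return result

# Demo: the second call returns from cache without re-running the job.
submissions = []
def fake_slurm_job():
    submissions.append("submitted")
    return 1 + 1

with tempfile.TemporaryDirectory() as cache_dir:
    first = submit_with_cache_guard(cache_dir, "foo", fake_slurm_job)
    second = submit_with_cache_guard(cache_dir, "foo", fake_slurm_job)
```

With this guard, the observed behavior (in-notebook cache hit but a fresh SLURM submission) would collapse into a single check before anything is written to the cache directory or handed to the scheduler.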

Metadata


Labels

bug: Something isn't working
