Patch tmp_path/tmpdir to be thread-safe #120

bwhitt7 · 2025-08-28T18:46:00Z

This PR fixes #109 by making tmp_path thread-safe. The changes patch tmp_path (and tmpdir) by creating a sub-directory for each thread/iteration inside the original path returned by tmp_path. The fixture value is then set to the new path, so tests that use the fixture get a new directory for each thread and iteration. This should prevent threads from reading and writing the same files and causing conflicts.

This PR also changes the closure function in wrap_function_parallel by adding functions that handle the setup for patching fixtures. The thread_index and iteration_index code was moved into these setup functions, and the patching for tmp_path/tmpdir is handled there too.
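The per-thread directory idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the plugin's actual code: `patch_tmp_path`, `run_in_threads`, and `write_and_check` are hypothetical names, and only the `thread_<index>` directory naming is taken from the traceback later in this thread.

```python
import tempfile
import threading
from pathlib import Path


def patch_tmp_path(kwargs, thread_index):
    """Return a copy of kwargs whose tmp_path points at a per-thread subdirectory.

    Hypothetical helper: the real plugin does this inside its closure.
    """
    patched = dict(kwargs)
    if "tmp_path" in patched:
        thread_dir = Path(patched["tmp_path"]) / f"thread_{thread_index}"
        thread_dir.mkdir(exist_ok=True)
        patched["tmp_path"] = thread_dir
    return patched


def run_in_threads(test_func, kwargs, n_threads):
    """Run test_func once per thread, each with its own patched kwargs."""

    def closure(thread_index):
        test_func(**patch_tmp_path(kwargs, thread_index))

    threads = [threading.Thread(target=closure, args=(i,)) for i in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


def write_and_check(tmp_path):
    # Every thread writes a file with the same name, but in its own directory.
    (tmp_path / "data.txt").write_text(threading.current_thread().name)


with tempfile.TemporaryDirectory() as base:
    run_in_threads(write_and_check, {"tmp_path": Path(base)}, n_threads=4)
    created = sorted(p.name for p in Path(base).iterdir())
```

Because each thread only ever sees its own sub-directory, a test that assumes it owns `tmp_path` keeps working without modification.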

ngoldbaum · 2025-08-28T18:48:19Z

Awesome. @lysnikolaou assuming you're back next week, mind looking this over, since you haven't been involved in the discussions around this?

lysnikolaou

This looks great! Good work @bwhitt7!

I've left some fairly minor comments inline.

src/pytest_run_parallel/plugin.py

bwhitt7 · 2025-09-05T15:42:14Z

@lysnikolaou Apologies for the late push, hopefully this should address the changes you suggested! Decided to return the values that were originally in the data var as a tuple. Saving things in kwargs can get a little tricky since pytest gets upset when you have extra kwargs that don't match with anything in the test function.

Also, I tested whether directories still get cleaned up when the tmp_path value is modified, and it looks like everything gets handled normally. You can look into the pytest temporary directories and see that the thread directories get created and deleted properly.

ngoldbaum · 2025-09-05T18:02:11Z

If you edit the PR description to include the string “fixes #109”, that issue will auto-close when this PR is merged. Just an FYI for future PRs that fix an issue.

lysnikolaou · 2025-09-08T14:35:38Z

The changes look good @bwhitt7! Thanks!

However, because the changes not only affect collection time, but also runtime, I benchmarked using SciPy. I ran the tests in scipy/io/tests. And here are the results:

With pytest-run-parallel v0.6.1 (release on NumPy) I'm getting two test failures with 376/561 items running in parallel. Averaged over 5 runs, the whole test suite took 83.994s. With this branch I'm getting no test failures (🎉) with 376/561 items running in parallel (same number of collected tests), but the test suite takes significantly longer to run, averaging 93.542s. That's a runtime increase of 11.37%, which is significant, especially for large test suites.

I think we should spend more time trying to benchmark and optimize this. @ngoldbaum What do you think?
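For reference, the 11.37% figure follows directly from the two averages reported above. A quick check (the variable names are just for illustration):

```python
baseline_s = 83.994  # average with pytest-run-parallel v0.6.1
branch_s = 93.542    # average with this branch

# Relative increase in runtime, expressed as a percentage of the baseline.
increase_pct = (branch_s - baseline_s) / baseline_s * 100
print(f"{increase_pct:.2f}%")
```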

ngoldbaum · 2025-09-08T15:05:00Z

> I think we should spend more time trying to benchmark and optimize this. @ngoldbaum What do you think?

Seems reasonable. I'll circle back with @bwhitt7. Thanks for checking!

ngoldbaum · 2025-09-09T19:41:31Z

@lysnikolaou I tried to reproduce that - I see on this PR that the scipy/io/tests tests run slightly faster with --parallel-threads=auto, finishing in 3.49 seconds vs 3.53 seconds using pytest-run-parallel 0.6.1.

Can you share a little bit more about how you set up your test? Maybe something specific about your test machine?

Maybe also if you're on Linux you can get a samply profile comparing this PR vs pytest-run-parallel 0.6.1, happy to help out explaining how to do that. Any kind of profiling you can do on your setup where you're seeing a really big slowdown would help us understand what's going on.

lysnikolaou · 2025-09-10T11:06:37Z

> Can you share a little bit more about how you set up your test?

The tests above are with a debug build of CPython 3.13.3t. I just tested this with a 3.14.0rc2t build as well (non-debug) and the regression is even more acute at ~31%.

My machine is a MacBook Pro with an M1 Pro with 10 CPU cores. I'm running macOS Sequoia 15.6.1. The test invocation I'm using is --parallel-threads=10 --iterations=20. That's probably the difference we're seeing, as I'm seeing the same results (a very slight speedup) without the --iterations part. I think using --iterations is much closer to real-world usage, and because wrap_setup_iteration is called inside the --iterations loop, the slowdown grows linearly with the number of iterations. Can you confirm that? If not, I can try to get a profile to delve deeper into this.
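To make the scaling concrete: with one directory per thread and iteration, a single test that uses tmp_path needs threads × iterations directories. The sketch below only counts them; `make_dirs` and the `thread_<i>/iteration_<j>` layout are assumptions for illustration, not the plugin's actual layout.

```python
import tempfile
from pathlib import Path


def make_dirs(base, n_threads, n_iterations):
    """Create one directory per (thread, iteration) pair and return them all."""
    created = []
    for thread_index in range(n_threads):
        for iteration_index in range(n_iterations):
            # Hypothetical layout: <base>/thread_<i>/iteration_<j>
            path = base / f"thread_{thread_index}" / f"iteration_{iteration_index}"
            path.mkdir(parents=True)
            created.append(path)
    return created


with tempfile.TemporaryDirectory() as base:
    # Same settings as the invocation above: 10 threads, 20 iterations.
    n_dirs = len(make_dirs(Path(base), n_threads=10, n_iterations=20))

print(n_dirs)
```

Each of those directories also has to be removed again during cleanup, so the filesystem work per test grows with the iteration count.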

ngoldbaum · 2025-09-10T15:30:45Z

> Can you confirm that? If not, I can try to get a profile to delve deeper into this.

Thanks for the hint about --iterations. I can reproduce the slowdown now.

bwhitt7 · 2025-09-10T16:57:31Z

Thanks for the feedback @lysnikolaou! Since the drastic slowdown happens with --iterations, Nathan and I agreed that it would probably be a good idea to remove iteration support for tmp_path for now, since it requires creating and removing a large number of temporary directories, but I wanted to know what your opinion on this is. I suppose if users want to run iterations with tmp_path, they can edit their tests to be iteration-safe.

lysnikolaou · 2025-09-11T12:13:11Z

Can we try to create all of the directories before the actual --iterations loop and benchmark that as well? It feels like that might be much closer to our current performance.

And let's also try to inline both functions. Function calls in Python do add some overhead, though I'm not sure how much of a difference that will make, especially if it's once per thread.

bwhitt7 · 2025-09-12T20:51:47Z

Made the functions inline. With iterations, it looks like creating the directories outside of the iterations loop doesn't improve performance; it still results in a significant performance cost similar to what you saw in your testing, @lysnikolaou. I think the cost mainly comes from creating and destroying so many directories anyway.

lysnikolaou

LGTM! Great work @bwhitt7! Thanks for all the patience with the reviews!

I'm approving, but I left one minor comment. After that's addressed, it's certainly good to go.

lysnikolaou · 2025-09-16T13:45:05Z

tests/test_tmp_path.py

+
+@pytest.mark.parametrize("parallel, passing", parallel_threads)
+def test_tmp_path_tmpdir(pytester: pytest.Pytester, parallel, passing):
+    # ensures we can delete files in each tmpdir


This comment is not relevant, should probably be deleted.

Haha yep, forgot to change that. Just pushed a commit that fixes that.
And no worries!

bwhitt7 · 2025-09-16T14:27:43Z

Hmmmm looks like the tests are failing now, seems to be related to that test_tmp_path_tmpdir test. Everything runs fine locally for me, will look into this.

lysnikolaou · 2025-09-16T14:39:37Z

> Hmmmm looks like the tests are failing now, seems to be related to that test_tmp_path_tmpdir test. Everything runs fine locally for me, will look into this.

This seems to be the relevant part of the exception:

E           and: '    | Traceback (most recent call last):'
E           and: '    |   File "/opt/hostedtoolcache/Python/3.13.7/x64/lib/python3.13/threading.py", line 1043, in _bootstrap_inner'
E           and: '    |     self.run()'
E           and: '    |     ~~~~~~~~^^'
E           and: '    |   File "/opt/hostedtoolcache/Python/3.13.7/x64/lib/python3.13/threading.py", line 994, in run'
E           and: '    |     self._target(*self._args, **self._kwargs)'
E           and: '    |     ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^'
E           and: '    |   File "/home/runner/work/pytest-run-parallel/pytest-run-parallel/.tox/3.13/lib/python3.13/site-packages/pytest_run_parallel/plugin.py", line 70, in closure'
E           and: '    |     kwargs["tmpdir"] = kwargs["tmpdir"].mkdir('
E           and: '    |                        ~~~~~~~~~~~~~~~~~~~~~~^'
E           and: '    |         f"thread_{thread_index!s}"'
E           and: '    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^'
E           and: '    |     )'
E           and: '    |     ^'
E           and: '    |   File "/home/runner/work/pytest-run-parallel/pytest-run-parallel/.tox/3.13/lib/python3.13/site-packages/_pytest/_py/path.py", line 889, in mkdir'
E           and: '    |     error.checked_call(os.mkdir, os.fspath(p))'
E           and: '    |     ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^'
E           and: '    |   File "/home/runner/work/pytest-run-parallel/pytest-run-parallel/.tox/3.13/lib/python3.13/site-packages/_pytest/_py/error.py", line 111, in checked_call'
E           and: '    |     raise cls(f"{func.__name__}{args!r}")'
E           and: "    | py.error.EEXIST: [File exists]: mkdir('/tmp/pytest-of-runner/pytest-0/basetemp/test_both0/thread_1',)"
E           and: '    | '

If both tmp_path and tmpdir are in the fixtures and they both point to the same directory, we try to create the same thread-specific dir twice.
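The failure mode can be reproduced in isolation. The sketch below uses pathlib as a stand-in for the py.path object in the traceback, and `thread_dir` is a hypothetical helper: a plain `mkdir` fails on the second attempt, while `exist_ok=True` makes the creation idempotent.

```python
import tempfile
from pathlib import Path


def thread_dir(base, thread_index):
    """Create (or reuse) the per-thread directory under base."""
    path = Path(base) / f"thread_{thread_index}"
    path.mkdir(exist_ok=True)  # no error if another fixture already created it
    return path


with tempfile.TemporaryDirectory() as base:
    # tmp_path and tmpdir point at the same base, so both ask for thread_1.
    first = thread_dir(base, 1)
    second = thread_dir(base, 1)
    same = first == second

    # Without exist_ok, the second creation raises, as in the CI traceback.
    try:
        (Path(base) / "thread_1").mkdir()
        raised = False
    except FileExistsError:
        raised = True
```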

ngoldbaum · 2025-09-16T14:40:44Z

This is happening because we merged PR #126 and no one noticed in earlier rounds of review that one of the tests you added raises an exception. Pytest treats unhandled exceptions in a thread as a warning, so it wasn't failing the tests until #126 got merged.
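As background for why this went unnoticed: in plain Python, an exception raised in a worker thread does not propagate to the thread that calls `join()`; it is handed to `threading.excepthook` instead. The standalone sketch below shows only that standard-library behavior and makes no claim about how pytest or #126 handle it; `record` and `captured` are illustrative names.

```python
import threading

captured = []


def record(args):
    # args carries exc_type, exc_value, exc_traceback and the thread object.
    captured.append(args.exc_type)


def failing_target():
    raise ZeroDivisionError("raised inside the worker thread")


original_hook = threading.excepthook
threading.excepthook = record
try:
    worker = threading.Thread(target=failing_target)
    worker.start()
    worker.join()  # returns normally even though the target raised
    joined_cleanly = True
finally:
    threading.excepthook = original_hook
```

Unless something collects these exceptions and re-raises them in the main thread, the calling code carries on as if the thread had succeeded.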

bwhitt7 · 2025-09-16T15:16:26Z

Made it so that if tmp_path and tmpdir are used at the same time, they no longer raise when they try to create the same directory. Also made the test_tmp_path_tmpdir test a little more useful hehe. Looks like this fixed the test failures!

ngoldbaum · 2025-09-16T20:51:49Z

Thanks @bwhitt7!

bwhitt7 added 4 commits August 27, 2025 20:30

tmp_path fixture patched to be thread-safe

806200b

Create tests for tmp_path/tmpdir

4db300d

Add tmp_path/tmpdir changes to README

8dc8ce4

New tmp_path test param

f79c35b

lysnikolaou reviewed Sep 4, 2025

Modify tmp_path patches

e36186c

bwhitt7 added 2 commits September 10, 2025 12:45

Remove iteration support for tmp_path

121c82d

tmp_path README iteration update

1a37a5d

bwhitt7 added 2 commits September 12, 2025 15:08

Bring iteration support back

2df77e9

Remove iteration support again

3e1e203

lysnikolaou approved these changes Sep 16, 2025

Fix comment in test_tmp_path_tmpdir

41214f9

Handle if directories already exist

2cf1300

ngoldbaum merged commit 3d60fde into Quansight-Labs:main Sep 16, 2025
10 checks passed

ngoldbaum mentioned this pull request Sep 16, 2025

tmp_path-fixtured tests generally do not expect neither parallelism nor repetition #109

Closed
