-
Notifications
You must be signed in to change notification settings - Fork 9
Patch tmp_path/tmpdir to be thread-safe #120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Awesome. @lysnikolaou assuming you're back next week, mind looking this over? Since you haven't been involved in the discussions around this. |
lysnikolaou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! Good work @bwhitt7!
I've left some fairly minor comments inline.
|
@lysnikolaou Apologies for the late push, hopefully this should address the changes you suggested! Decided to return the values that were originally in the data var as a tuple. Saving things in kwargs can get a little tricky since pytest gets upset when you have extra kwargs that don't match with anything in the test function. Also, I tested out if directories still get cleaned up with the tmp_path value modified, and it looks like everything gets handled normally. You can look into the pytest temporary directories and see that the thread directories get created and deleted properly. |
|
If you edit the PR description to include the string “fixes #109”, that issue will auto-close when this PR is merged. Just an FYI for future PRs that fix an issue. |
|
The changes look good @bwhitt7! Thanks! However, because the changes not only affect collection time, but also runtime, I benchmarked using With I think we should spend more time trying to benchmark and optimize this. @ngoldbaum What do you think? |
Seems reasonable. I'll circle back with @bwhitt7. Thanks for checking! |
|
@lysnikolaou I tried to reproduce that - I see on this PR that the Can you share a little bit more about how you set up your test? Maybe something specific about your test machine? Maybe also if you're on Linux you can get a samply profile comparing this PR vs pytest-run-parallel 0.6.1, happy to help out explaining how to do that. Any kind of profiling you can do on your setup where you're seeing a really big slowdown would help us understand what's going on. |
The tests above are with a debug build of CPython 3.13.3t. I just tested this with a 3.14.0rc2t build as well (non-debug) and the regression is even more acute at ~31%. My machine is a MacBook Pro with an M1 pro with 10 CPU cores. I'm running macOS Sequoia 15.6.1. The test invocation I'm using is |
Thanks for the hint about |
|
Thanks for the feedback @lysnikolaou! Since the drastic slowdown happens with |
|
Can we try to create all of the directories before the actual And let's also try to inline both functions. Function calls in Python do add some overhead, though I'm not sure how much of a difference that will make, especially if it's once per thread. |
|
Made the functions inline. With the iterations, looks like creating the directories outside of the iterations loop doesn't improve performance, instead resulting in significant performance cost similar to your testing @lysnikolaou . I think the performance cost is mainly the creation and destruction of so many directories anyways. |
lysnikolaou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Great work @bwhitt7! Thanks for all the patience with the reviews!
I'm approving, but I left one minor comment. After that's addressed, it's certainly good to go.
tests/test_tmp_path.py
Outdated
|
|
||
| @pytest.mark.parametrize("parallel, passing", parallel_threads) | ||
| def test_tmp_path_tmpdir(pytester: pytest.Pytester, parallel, passing): | ||
| # ensures we can delete files in each tmpdir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is not relevant, should probably be deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha yep, forgot to change that. Just pushed a commit that fixes that.
And no worries!
|
Hmmmm looks like the tests are failing now, seems to be related to that |
This seems to be the relevant part of the exception: If both |
|
Made it so if |
|
Thanks @bwhitt7! |
This PR fixes #109, hoping to address the issue of tmp_path not being thread safe. The introduced changes will patch tmp_path (and tmpdir), creating sub-directories for each thread/iteration within the original path returned by tmp_path. The tmp_path fixture will then be set to the new path, so that tests using the fixture will have a new directory for each thread and iteration. This should prevent threads from reading/writing into the same files and causing conflicts.
This PR also introduces a change to the closure function in wrap_function_parallel, with functions that handle setup for patching fixtures. The thread_index and iteration_index code was moved to these setup functions, and the patching for tmp_path/tmpdir is handled here too.