bpo-24882: Let ThreadPoolExecutor reuse idle threads before creating new thread #6375
Conversation
As discussed on the issue tracker, I don't think it's a good idea to add this.
Added comments on the issue tracker in reply. I was hoping the original author would comment first, as they had a TODO requesting that this issue be fixed in the future. Also, it's been more than a week since I submitted my CLA; should I be concerned?
By the way: I think CLA processing is a bit slow lately. If this persists I'll try pinging the PSF.
Lib/test/test_concurrent_futures.py
Outdated
@@ -318,7 +318,7 @@ def test_threads_terminate(self):
         self.executor.submit(mul, 21, 2)
         self.executor.submit(mul, 6, 7)
         self.executor.submit(mul, 3, 14)
-        self.assertEqual(len(self.executor._threads), 3)
+        self.assertTrue(len(self.executor._threads) < 3)
You've submitted three things, so why wouldn't it be possible for it to have launched three threads? Granted, the work per task is cheap, so it's possible (maybe even likely, given GIL interference) that the first task is done in time for its thread to be reused by the third task, but that's not guaranteed, is it?
I agree that this test might fail randomly. It would be better to force the collection of the 3 tasks with calls to result() before launching the next one, and verify that only one thread has been used.
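The suggested fix might look like the following sketch, stripped of the unittest scaffolding (`mul` is the helper used by CPython's test suite, redefined here so the snippet is self-contained; `_threads` is a private attribute, inspected purely for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

def mul(x, y):
    return x * y

executor = ThreadPoolExecutor(max_workers=3)

# Block on each result before the next submit, so the worker that ran
# the previous task has a chance to go idle and be reused.
r1 = executor.submit(mul, 21, 2).result()
r2 = executor.submit(mul, 6, 7).result()
r3 = executor.submit(mul, 3, 14).result()

# With idle-thread reuse the pool should not have needed all three
# workers; there is still a small window between the future resolving
# and the worker marking itself idle, so avoid asserting exactly 1.
thread_count = len(executor._threads)
executor.shutdown()
```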
@tomMoral Would you like to give this a review?
The implementation seems to be doing the job. I think it would be a bit simpler using a Semaphore instead of a lock-protected counter.
I am unsure about the race conditions that might occur. I do not see any obvious ones here, but to be on the safe side it would be nice to have a test checking that you can indeed saturate the ThreadPoolExecutor with many small tasks. For instance:
def test_saturation(self):
    n = 100 * self.executor._max_workers
    # map() takes parallel positional iterables, not keyword arguments
    list(self.executor.map(mul, range(n), [0] * n))
    self.assertEqual(len(self.executor._threads), self.executor._max_workers)
Lib/concurrent/futures/thread.py
Outdated
@@ -129,6 +136,8 @@ def __init__(self, max_workers=None, thread_name_prefix='',
         self._max_workers = max_workers
         self._work_queue = queue.SimpleQueue()
+        self._idle_lock = threading.Lock()
+        self._idle_count = 0
Why don't you use a threading.Semaphore for this? It makes the implementation easier to read: you don't need to handle the lock and the increments, and can rely only on acquire and release.
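The two patterns side by side, as a hedged sketch (names like `try_claim_idle_counter` are illustrative, not from the patch):

```python
import threading

# Pattern 1: lock-protected counter, as in the original patch.
idle_lock = threading.Lock()
idle_count = 0

def try_claim_idle_counter():
    """Atomically decrement the idle count if it is positive."""
    global idle_count
    with idle_lock:
        if idle_count > 0:
            idle_count -= 1
            return True
        return False

# Pattern 2: a semaphore collapses the lock, the counter, and the
# check into acquire()/release().  acquire(timeout=0) never blocks:
# it returns False immediately when the counter is zero.
idle_semaphore = threading.Semaphore(0)

def try_claim_idle_semaphore():
    return idle_semaphore.acquire(timeout=0)
```

Either way, a worker signals idleness by incrementing (releasing), and submit tries to claim an idle worker before spawning a new thread.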
Lib/concurrent/futures/thread.py
Outdated
with self._idle_lock:
    if self._idle_count > 0:
        self._idle_count -= 1
        return
If _idle_count is a semaphore, you simply need to use:

if self._idle_count.acquire(timeout=0):
    return
Lib/concurrent/futures/thread.py
Outdated
# attempt to increment idle count
executor = executor_reference()
if executor is not None:
    executor._increase_idle_count()
If _idle_count is a semaphore, you just need to increase the count with release:

executor._idle_count.release()
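Putting the two halves together, a minimal sketch of the reuse protocol (toy code, not the actual thread.py; queue draining, shutdown handling, and weak references are omitted):

```python
import queue
import threading

work_queue = queue.Queue()
idle_semaphore = threading.Semaphore(0)
threads = []

def worker():
    while True:
        item = work_queue.get()
        if item is None:          # sentinel: shut this worker down
            return
        item()                    # run the task
        idle_semaphore.release()  # announce "I am idle again"

def submit(task, max_workers=4):
    work_queue.put(task)
    # Reuse an idle worker if one has announced itself; otherwise
    # spawn a new thread, up to max_workers.
    if idle_semaphore.acquire(timeout=0):
        return
    if len(threads) < max_workers:
        t = threading.Thread(target=worker, daemon=True)
        t.start()
        threads.append(t)
```

Tasks queued before the sentinels are always drained first (FIFO), so shutting down by pushing one `None` per thread and joining is safe in this sketch.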
Since GH-6530, I've changed my opinion, so I'm closing that PR.
Thank you for the review. I've made the suggested changes.
@tomMoral, please re-review. Thanks!
Thank you for the update. Here are a few more comments, mostly around tests.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase
I have made the requested changes; please review again. One note - I wasn't 100% sure what best practice would be for the tests in the ThreadPoolExecutorTest class. Since I wanted a clean executor for both the saturation test and the idle reuse test, I ended up creating a new executor for each test (instead of using the one within the class instance). Hopefully that is acceptable.
Adjust the shutdown test so that, after submitting three jobs to the executor, it checks for fewer than three threads instead of exactly three. If idle threads are being recycled properly, we should see fewer than three threads.
As suggested by reviewer tomMoral, swapped the lock-protected counter for a semaphore to track the number of unused threads. Adjusted test_threads_terminate to wait for completion of the previous future before submitting a new one (and checking the number of threads used). Also added a new test to confirm the thread pool can be saturated.
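The saturation test can be approximated outside the unittest harness like this (a sketch; `mul` is redefined locally, and `_threads`/`_max_workers` are private attributes inspected purely for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

def mul(x, y):
    return x * y

executor = ThreadPoolExecutor(max_workers=4)
n = 100 * executor._max_workers

# Flood the pool with many cheap tasks; map() takes parallel
# iterables, so the constant second argument is passed as a list.
results = list(executor.map(mul, range(n), [0] * n))

# _adjust_thread_count never creates more than max_workers threads.
saturated = len(executor._threads)
executor.shutdown()
```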
Thanks @iUnknwn. I've rebased and will merge if CI is green.
@pitrou: Please replace
# attempt to increment idle count
executor = executor_reference()
if executor is not None:
    executor._idle_semaphore.release()
Is _idle_semaphore representing idle threads? If more than max_workers jobs are submitted at the same time, won't the _idle_semaphore value be greater than the thread count?
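The concern is easy to state with plain semaphore semantics: threading.Semaphore enforces no upper bound, so nothing in the type itself ties the counter to a thread count (illustrative snippet; whether the value stays bounded depends entirely on the surrounding code releasing only when a worker actually goes idle):

```python
import threading

sem = threading.Semaphore(0)

# release() increments the internal counter with no ceiling check ...
for _ in range(5):
    sem.release()

# ... and a non-blocking drain shows all five permits were recorded.
claimed = 0
while sem.acquire(timeout=0):
    claimed += 1
```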
The change fixes an issue where ThreadPoolExecutor doesn't check if existing threads are idle before spinning up new ones. To do this, we use a simple counter within the Executor to track how many threads are idle, and atomically increment and decrement the count as needed.
One question: the previous code to spin up a new thread in _adjust_thread_count does not appear thread-safe. If two threads are both submitting items to the executor at the same time, the check of the number of threads could be invalid. Is this something that should also be fixed, or is it not an issue because of the global interpreter lock?

https://bugs.python.org/issue24882