
bpo-39812: Remove daemon threads in concurrent.futures #19149


Merged: 8 commits into python:master on Mar 27, 2020

Conversation

@aeros aeros (Contributor) commented Mar 25, 2020

Removes daemon threads from the Executor internals in concurrent.futures to allow for compatibility with subinterpreters, since they no longer support daemon threads as of https://bugs.python.org/issue37266.

/cc @pitrou

https://bugs.python.org/issue39812

@aeros aeros requested a review from brianquinlan March 25, 2020 00:36
_threading_atexits = None

def _register_atexit(func, *arg, **kwargs):
    """CPython internal: register *func* to be called before joining threads.
@aeros aeros (Contributor Author) Mar 25, 2020

I specified "CPython internal" instead of just "internal" to make it clearer that it's intended to be used across stdlib modules (as opposed to being an internal helper function for threading).

At some point, I think this could graduate to the public API, but at the moment it's a bit too niche since subinterpreters have not yet officially made it into the stdlib. If such a utility is requested by users, I think it could be moved without much hassle though. That's partly why I included a detailed docstring.
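For illustration, a minimal sketch of how another stdlib module can use the hook; it follows the shape of the concurrent.futures usage in this PR, but the exact names and bookkeeping in the merged code may differ:

import threading

_shutdown = False

def _python_exit():
    # Runs during threading._shutdown(), i.e. before the interpreter joins
    # non-daemon threads, which is earlier than a regular atexit.register()
    # callback would run.
    global _shutdown
    _shutdown = True
    # ... wake up idle workers and join their threads here ...

threading._register_atexit(_python_exit)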

Member

This sounds good to me, thanks for the explanation.

Comment on lines -735 to -736

atexit.register(_python_exit)
Contributor Author

This change just puts the atexit registration in the same location for both ProcessPoolExecutor and ThreadPoolExecutor (right under the definition of _python_exit()). It can be dropped if needed, but I think it makes the registrations easier to locate for anyone reading over the implementation details.

@aeros aeros (Contributor Author) commented Mar 25, 2020

I considered adding a unit test for _register_atexit() that would register a mock object and ensure it was called once after the thread shutdown process completes. However, upon trying to implement it, I think it might be more trouble than it's worth for a fairly simple internal function.

That being said, I think it would be worthwhile to add a test along those lines if it were to be added to the public API for threading.

Comment on lines -62 to -74
# Workers are created as daemon threads and processes. This is done to allow the
# interpreter to exit when there are still idle processes in a
# ProcessPoolExecutor's process pool (i.e. shutdown() was not called). However,
# allowing workers to die with the interpreter has two undesirable properties:
# - The workers would still be running during interpreter shutdown,
# meaning that they would fail in unpredictable ways.
# - The workers could be killed while evaluating a work item, which could
# be bad if the callable being evaluated has external side-effects e.g.
# writing to a file.
#
# To work around this problem, an exit handler is installed which tells the
# workers to exit when their work queues are empty and then waits until the
# threads/processes finish.
Contributor Author

I believe this section is no longer relevant now that the workers are no longer daemon threads. Correct me if I'm mistaken.

@aeros aeros added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 25, 2020
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @aeros for commit 21a12e9 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 25, 2020
@aeros aeros (Contributor Author) commented Mar 25, 2020

(Note that AMD64 Windows7 SP1 PR and x86 Windows7 PR can be safely disregarded, since Windows 7 is no longer supported as of 3.9.)

@aeros aeros (Contributor Author) commented Mar 25, 2020

I'm 99.99% certain the test_socket failure in FreeBSD Shared PR is unrelated to this PR, as it has been occurring intermittently in recent commits to master as well. This is further reinforced by the fact that this PR doesn't touch the socket module.

1 re-run test:
    test_socket
Total duration: 33 min 55 sec
Tests result: FAILURE then FAILURE
sys:1: ResourceWarning: unclosed <socket.socket fd=5, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=132, laddr=('127.0.0.1', 51870)>
ResourceWarning: Enable tracemalloc to get the object allocation traceback
sys:1: ResourceWarning: unclosed <socket.socket fd=4, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=132, laddr=('127.0.0.1', 52586)>
ResourceWarning: Enable tracemalloc to get the object allocation traceback
sys:1: ResourceWarning: unclosed <socket.socket fd=3, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=132, laddr=('127.0.0.1', 49295)>
ResourceWarning: Enable tracemalloc to get the object allocation traceback
*** Error code 2

@aeros aeros (Contributor Author) commented Mar 25, 2020

I opened a separate issue to address the (unrelated) buildbot refleaks from test_asyncio at https://bugs.python.org/issue40061.

@pitrou pitrou (Member) left a comment

Thanks for doing this. I think _register_atexit deserves at least some basic unit test(s). It should be fairly easy by spawning e.g. a subprocess and checking its output.


@aeros aeros (Contributor Author) commented Mar 25, 2020

> Thanks for doing this.

No problem, thanks for the review. :-)

> I think _register_atexit deserves at least some basic unit test(s). It should be fairly easy by spawning e.g. a subprocess and checking its output.

Yeah, after having some additional time to think about it, I think we'd be better off with a couple of basic unit tests for it, even if it's internal. I'll work on those next.
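As a sketch, such a test could run the registration in a fresh interpreter via a subprocess and assert on its output; test.support.script_helper.assert_python_ok does exactly that. The class and test names below are placeholders, and the tests actually added to this PR may look different:

import unittest
from test.support.script_helper import assert_python_ok


class RegisterAtexitTests(unittest.TestCase):
    def test_called_before_joining_threads(self):
        code = """if True:
            import threading

            def run_last():
                print('atexit callback ran')

            threading._register_atexit(run_last)
            print('main module done')
        """
        rc, out, err = assert_python_ok('-c', code)
        # The callback fires during threading shutdown: after the main module
        # finishes, but before non-daemon threads are joined.
        self.assertEqual(out.strip().splitlines(),
                         [b'main module done', b'atexit callback ran'])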

@aeros aeros added the type-feature A feature request or enhancement label Mar 25, 2020
@aeros aeros (Contributor Author) commented Mar 25, 2020

Going to restart the CI (via close and re-open) since it failed in the "Install Dependencies" stage.

@aeros aeros closed this Mar 25, 2020
@aeros aeros reopened this Mar 25, 2020
@aeros aeros (Contributor Author) commented Mar 25, 2020

@pitrou I believe that I've addressed the review comments. Let me know if the unit tests I added are sufficient.

@pitrou pitrou reopened this Mar 27, 2020
@pitrou pitrou (Member) commented Mar 27, 2020

(closing and reopening to trigger CI)

@pitrou pitrou (Member) commented Mar 27, 2020

> Also, would it be worth adding a "What's New" entry to mention that the executors no longer rely on daemon threads?

That's a good idea.

@pitrou pitrou (Member) commented Mar 27, 2020

Oh, and yes, the unit tests look sufficient for an internal function :-)

@aeros aeros (Contributor Author) commented Mar 27, 2020

@pitrou

> (closing and reopening to trigger CI)

By the way, for some reason GitHub Actions keeps the previous CI failures and duplicates the tests when you close and re-open a PR. So some of the failures showing are from the first run (before I closed and re-opened it); that's why there are three of each GitHub Actions test.

I'm not certain why they're failing in the early stages though. If it happens again, I'll start a thread on python-dev.

> That's a good idea.

Great! I'll work on the "What's New" entry next; it shouldn't take long. :-)

(It should also clear the old CI tests since it's a different commit.)

Comment on lines +198 to +199
Removed daemon threads from :class:`~concurrent.futures.ThreadPoolExecutor`
and :class:`~concurrent.futures.ProcessPoolExecutor`. This improves
Contributor Author

Using a tilde (~) shortens the Sphinx cross-reference to show just the last component, e.g. :class:`~concurrent.futures.ThreadPoolExecutor` renders as "ThreadPoolExecutor" rather than the full dotted path. I typically use this when the classes are easily distinguishable by name alone and the section already makes it clear what module they're in; it helps keep the entry succinct.

@pitrou pitrou (Member) left a comment

Thanks for doing this @aeros !

@pitrou pitrou merged commit b61b818 into python:master Mar 27, 2020
@aeros aeros deleted the bpo39812-cf-remove-daemon branch March 27, 2020 21:25
@aeros aeros restored the bpo39812-cf-remove-daemon branch March 30, 2020 23:34
super().__init__()
self.daemon = True
Contributor

I'm actually depending on this behavior and my code breaks in Python 3.9. Is there a way to set the threads to be daemon? The error I got:

Traceback (most recent call last):
  File "/Users/laike9m/.pyenv/versions/3.9.0/lib/python3.9/threading.py", line 950, in _bootstrap_inner
    self.run()
  File "/Users/laike9m/.pyenv/versions/3.9.0/lib/python3.9/threading.py", line 888, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/laike9m/.pyenv/versions/[email protected]/lib/python3.9/site-packages/grpc/_server.py", line 875, in _serve
    if not _process_event_and_continue(state, event):
  File "/Users/laike9m/.pyenv/versions/[email protected]/lib/python3.9/site-packages/grpc/_server.py", line 839, in _process_event_and_continue
    rpc_state, rpc_future = _handle_call(event, state.generic_handlers,
  File "/Users/laike9m/.pyenv/versions/[email protected]/lib/python3.9/site-packages/grpc/_server.py", line 749, in _handle_call
    return _handle_with_method_handler(rpc_event, method_handler,
  File "/Users/laike9m/.pyenv/versions/[email protected]/lib/python3.9/site-packages/grpc/_server.py", line 722, in _handle_with_method_handler
    return state, _handle_unary_stream(rpc_event, state,
  File "/Users/laike9m/.pyenv/versions/[email protected]/lib/python3.9/site-packages/grpc/_server.py", line 640, in _handle_unary_stream
    return thread_pool.submit(_stream_response_in_pool, rpc_event, state,
  File "/Users/laike9m/.pyenv/versions/3.9.0/lib/python3.9/concurrent/futures/thread.py", line 163, in submit
    raise RuntimeError('cannot schedule new futures after '
RuntimeError: cannot schedule new futures after interpreter shutdown

which I believe is relevant. @aeros

@aeros aeros (Contributor Author) Oct 26, 2020

@laike9m Hmm, does your code specifically rely on submitting new jobs after executor.shutdown() gets called? If so, would it be possible to move the executor.shutdown() to a later point (optimally, you should be done with using the executor before calling shutdown())?

That being said, there may be some reason to consider allowing users to opt-in to using daemon threads; particularly since the decision to prevent subinterpreters from using daemon threads was reverted (which was originally the main motivation for this PR, with the benefit of more predictable shutdown behavior).

Do you have any thoughts @pitrou?

Member

@laike9m What do you mean by "break"? Does the process not exit cleanly?

@laike9m laike9m (Contributor) Oct 26, 2020

Let me describe my use case.

I'm using gRPC which starts a server using ThreadPoolExecutor:
https://github.com/laike9m/Cyberbrain/blob/master/cyberbrain/rpc_server.py#L235

This server runs in a thread other than the main thread and waits via server.wait_for_termination() until the user explicitly hits Ctrl-C. The main thread runs and finishes its work; clients may or may not connect to the server before the main thread ends, and the requirement is for the server to keep responding even after the main thread has finished. In Python <= 3.8 this works, but in 3.9 I get the error above.

I'm not 100% sure that the root cause is this change, but I scanned through all the changes to concurrent.futures in 3.9, and this seems to be the most likely one.

Member

I see, then I think you need to define a signal handler that will be called when Ctrl-C is pressed, and that will ask the server to stop gracefully.

Contributor

Hmm, it's actually not about what happens when the user presses Ctrl-C; the problem is that the server isn't able to serve requests before that...

server starts listening --> main thread finishes ---> requests come ------> Ctrl + C
                                                    (problem is here)

Contributor

I'll try to come up with a minimal example that can reproduce this error.
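For reference, a sketch of what such a minimal repro might look like (illustrative only, not laike9m's actual code): a non-main thread that outlives the main module and then tries to submit to a ThreadPoolExecutor.

import threading
import time
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=1)

def submit_late():
    time.sleep(1)                    # the main module finishes during this sleep
    pool.submit(print, 'too late')   # 3.9+: RuntimeError: cannot schedule new
                                     #       futures after interpreter shutdown

threading.Thread(target=submit_late).start()
# The main module ends here; interpreter shutdown begins, the executor's
# threading._register_atexit callback flags shutdown, and the late submit()
# above is rejected.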

Member

Ah, then it's simpler: you should ask the server to finish and wait for it to finish before you let the main thread exit.
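A minimal sketch of that arrangement, assuming a reasonably recent grpcio (grpc.server(), start(), and wait_for_termination() are its real public API); the surrounding structure is illustrative rather than Cyberbrain's actual code:

from concurrent import futures
import grpc

def do_main_thread_work():
    # Placeholder for the program's real work in the main thread.
    pass

server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
# ... register servicers and add ports here ...
server.start()

do_main_thread_work()

# Before the main thread returns, block until the server has terminated
# (e.g. Ctrl-C, or an explicit server.stop() elsewhere). While the main
# thread is alive, interpreter shutdown has not begun, so the executor
# keeps accepting request handlers.
server.wait_for_termination()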

Contributor

Sounds good, thanks.

@aeros aeros (Contributor Author) commented Feb 29, 2024

@gvanrossum this PR needs to be reverted, but I no longer have my commit bit

@gpshead gpshead (Member) commented Feb 29, 2024

> This PR needs to be reverted, but I no longer have my commit bit

Please open a new issue with reasoning.

In general we do not want daemon threads anywhere anymore. Their existence has turned out to be a misfeature as they mean you cannot safely finalize a Python runtime when such threads are present.

@Bluehorn

I am also curious why @aeros stated that this needs reverting. It looks like a backwards-incompatible change (which is why I ended up reading this ticket), but affected projects can apparently cope with it easily.

I ran into this because I manipulate interpreter shutdown for a test suite, which at times got stuck due to a non-daemon thread "helpfully" started by some library package. That code is not used in production, only for diagnostics, since the failure was intermittent and difficult to reproduce.

> In general we do not want daemon threads anywhere anymore.

I think nobody "wants" units of execution that are just killed without notice in whatever state they happen to be. I'd just advise against dropping them without a replacement.

The use case I see is dealing with threads getting stuck in native code. This came up a few times in my career and was always a giant PITA to solve (or, really, work around). What I'd like is a way to detach a thread so that, if it never comes back from extension code, it just silently stops. I am not sure if and how that could be implemented, though.

Sorry for the noise; this got longer than anticipated.

@pitrou pitrou (Member) commented Mar 25, 2024

@Bluehorn Useful explanation, thank you. I think @gpshead's statement "Please open a new issue with reasoning" still holds, though.

@gvanrossum gvanrossum (Member)

> I am also curious why @aeros stated that this needs reverting.

The leading theory is that he lost control over his account. That account's permissions have been dropped.

Labels
type-feature A feature request or enhancement

8 participants