gh-86128: Add warning to ThreadPoolExecutor docs #94008

Merged
merged 2 commits into from
Jul 28, 2022

7 changes: 7 additions & 0 deletions Doc/library/concurrent.futures.rst
@@ -149,6 +149,13 @@ And::
An :class:`Executor` subclass that uses a pool of at most *max_workers*
threads to execute calls asynchronously.

All threads enqueued to ``ThreadPoolExecutor`` will be joined before the
interpreter can exit. Note that the exit handler which does this is
executed *before* any exit handlers added using `atexit`. This means
exceptions in the main thread must be caught and handled in order to
signal threads to exit gracefully. For this reason, it is recommended
that ``ThreadPoolExecutor`` not be used for long-running tasks.

@graingert (Contributor), Aug 8, 2022:

I think there are still safe ways to use a ThreadPoolExecutor for long-running tasks:

  • as long as you use the ThreadPoolExecutor context manager,
  • keep hold of a function to promptly abort the long-running tasks,
  • and arrange for that function to be called before leaving the ThreadPoolExecutor context manager
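
The three conditions above can be sketched roughly as follows. This is only an illustration under stated assumptions: `long_running_task` and `run_with_abort` are hypothetical names, and a `threading.Event` stands in for whatever "function to promptly abort" the real code keeps hold of.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def long_running_task(stop: threading.Event) -> str:
    """Hypothetical long-running task that polls a stop flag."""
    # Event.wait() returns True once the flag is set, ending the loop.
    while not stop.wait(timeout=0.01):
        pass  # one unit of real work would go here
    return "aborted"


def run_with_abort(main_work) -> str:
    """Run main_work() while the task runs, then abort the task."""
    stop = threading.Event()
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(long_running_task, stop)
        try:
            main_work()
        finally:
            # Signal the worker *before* leaving the with block; leaving
            # the block joins the worker threads, so an unsignalled task
            # would block here.
            stop.set()
    return future.result()
```

Because the abort call sits in a `finally` clause, it also runs when `main_work()` raises, so the exception propagates out of the `with` block instead of leaving the pool waiting on a task that was never told to stop.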

Member:

True, but it's only a recommendation to not use it, which means if you know what you're doing then you can ignore the recommendation 😉 The docs already explain what conditions have to be met to make it safe.

Contributor Author:

@graingert It's discussed further in the issue I linked to, see @ericsnowcurrently's comment here: #86128 (comment)

Having dug into the source code a fair amount while debugging the issue I ran into, I agree that it can be used for long-running tasks. However, one common way of handling graceful exit does not work, and the failure was so surprising that it took me hours to debug. Using a context manager would have required refactoring the code, while using threading.Thread directly did not. So I think it's worth a warning.

Note that none of the examples in the docs handle this tricky case in the way you suggest. That might be a good addition to the docs. :-)
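
For illustration, the surprising ordering can be reproduced with a small script. This is a sketch, not the code from the original issue: the names `CHILD_SCRIPT` and `run_child` are hypothetical, and the worker uses a bounded 0.5 second wait purely so the demonstration terminates instead of hanging. It assumes Python 3.9 or later, where the executor's exit handler runs before handlers registered with `atexit`.

```python
import subprocess
import sys

# Hypothetical child program: it tries to stop a pool worker from an
# atexit handler, which is the pattern that does not work.
CHILD_SCRIPT = """
import atexit
import threading
from concurrent.futures import ThreadPoolExecutor

stop = threading.Event()

def worker():
    # Bounded wait so this demo cannot hang forever.
    if stop.wait(timeout=0.5):
        print("worker: stopped by event")
    else:
        print("worker: timed out waiting")

def on_exit():
    print("atexit: setting stop event")
    stop.set()

atexit.register(on_exit)
executor = ThreadPoolExecutor(max_workers=1)
executor.submit(worker)
# The main thread ends here without calling executor.shutdown().
"""


def run_child() -> list:
    """Run the child program in a fresh interpreter; return its output lines."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    for line in run_child():
        print(line)
```

On Python 3.9 and later I would expect the worker to report that it timed out, followed by the `atexit` message: the interpreter joins the pool's threads first, so the handler that sets the event only runs once the task has already finished on its own. With an unbounded wait in the worker, the process would never exit.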

@graingert (Contributor), Aug 8, 2022:

I'd definitely like to add this as an example/recipe.

My use case is that I'm using ThreadPoolExecutor as a synchronous structured concurrency primitive where I manage cancellation myself.

This is in distributed.utils_test.loop_in_thread and anyio's BlockingPortal.

My qualm about adding this as an example to the stdlib is that, to do this, I'm using concurrent.futures.Future in a way that's documented as incorrect.

Contributor Author:

It's a really fuzzy line between "it is recommended that ThreadPoolExecutor not be used for long-running tasks because of this tricky case" and "this tricky case means that to use ThreadPoolExecutor for long-running tasks, you should do it like this ..." It seems like a small change to me if you add the example.

I based the recommendation on (1) a core contributor saying that it's not a good fit for long-running tasks, and (2) my personal experience with confusing issues when using it for long-running tasks. I tend to agree with @zooba that a recommendation against something is not the same as saying that it's incorrect. (As a contrived example, you might recommend against using CPython for tasks that cannot handle GC pauses, but someone might do it anyway and call gc.disable() to avoid the problem. That can cause other, well-documented problems, but presumably they know what they're doing.)


*initializer* is an optional callable that is called at the start of
each worker thread; *initargs* is a tuple of arguments passed to the
initializer. Should *initializer* raise an exception, all currently
@@ -0,0 +1 @@
Document a limitation in ThreadPoolExecutor where its exit handler is executed before any handlers in atexit.