Skip to content

Asyncio: Send arbitrary exceptions to Tasks #102847

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
kristjanvalur opened this issue Mar 20, 2023 · 9 comments
Closed

Asyncio: Send arbitrary exceptions to Tasks #102847

kristjanvalur opened this issue Mar 20, 2023 · 9 comments
Labels
topic-asyncio type-feature A feature request or enhancement

Comments

@kristjanvalur
Copy link
Contributor

Feature or enhancement

I propose that it becomes possible to raise arbitrary expressions on Task objects, as well as provide custom instances
of asyncio.CancelledError to `Task.cancel()

Pitch

Currently the only way to "signal" a Task is by means of the Task.cancel() method. It will cause a CancelledError
to be raised in the task in due course, and it can accept a special msg:str argument.

However, the primary usage of this mechanism is to actually cancel tasks. This has been cemented into place with
the addition of the TaskGroup construct, and various code put in place to ensure that Task.cancel() results in task
cancellation. This includes the "cancel count" on each task.

Sending "signals" (Exceptions) to task objects shouldn't be considered a very exotic operation. The new asyncio.Timeout context manager does that, when a timeout expires. However, it needs to co-opt the CancelledError to do so. This becomes problematic because it needs
to distinguish between proper cancellations and timeouts. The implementation would be much cleaner if it could simply raise its own subclass of BaseException, with its own custom arguments.

Task.cancel() is not an async method, meaning that cancellation is only scheduled to happen at a later point. (In a sense, it is asynchronous.)
Other frameworks, such as stackless have similar, but synchronous mechanisms to deliver signals. This would avoid race conditions where an application tries to send more than one exception to a task. Only one can be delivered, which one wins?

If we consider immediate exception delivery to hard to achieve in the asyncio framework, we could instead decide on a
policy, such as "last exception wins", but wth "CancelledError" given special priority.

Here is a small list of suggested apis just to discuss this.

def cancel(msg:string="", exception:Optional[CancelledError]=None):
    """
    Schedule a cancelled exception to be raised on the task.  If `exception` is provided,
    raise the given instance.  `msg` is deprecated in favor or providing as an argument to the `exception` constructor.
    """
    
async def acancel(...):
   """
   same as `cancel()` except it causes an immediate switch to the task.
   """

def raise(exception:BaseException):
   """
   Schedule `exception` to be raised on the task.  `exception must not be a `CancelledError` or 
   a `RuntimeError` will be thrown.
   """
   
async def araise(..):
    """
    same as `raise()` except that the exception is delivered immediately.
    """

Previous discussion

This was started in issue #102780

@kristjanvalur kristjanvalur added the type-feature A feature request or enhancement label Mar 20, 2023
@AlexWaygood AlexWaygood changed the title Send arbitrary exceptions to Tasks Asyncio: Send arbitrary exceptions to Tasks Mar 20, 2023
@github-project-automation github-project-automation bot moved this to Todo in asyncio Mar 20, 2023
@kristjanvalur
Copy link
Contributor Author

Since I mentioned synchronous event delivery, maybe this is as good a spot as any to plug my little async library, asynkit. It does, among other things, provide tools to control task switching by subclassing the EventLoop, such as task_switch()
one could then do something like:

async def acancel(self, msg):
     self.cancel(msg)
     await asynkit.task_switch(self)

https://github.com/kristjanvalur/py-asynkit

@gvanrossum
Copy link
Member

(We can't have a method named raise(), let's pretend you said throw().)

It would not be very hard to add this (at least the version that doesn't block until the exception is delivered), but I'm not sure I like the prospect of doing await asock.recv() and getting an OSError exception that wasn't raised by asock.recv(). I'd have to understand more about the scenarios and use cases before considering it.

I'd need even more convincing for acancel() and athrow() -- for starters, more use cases / scenarios, and I'm not sure how those would be implemented (I guess it could be done by adding a done_callback for the task being canceled).

I'm also not convinced that acancel() avoids race conditions, since there could be multiple tasks that all try to cancel the same "victim" task. Scenario please? (I'm also not sure that experience from stackless transfers easily to asyncio. Does stackless even have the concept of await points?)

@kristjanvalur
Copy link
Contributor Author

yes, reserved keywords :)

So, some free thoughts here, no scenarios yet, just thinking out loud a bit. Hope you don't mind.

I think that what I'm suggesting is not something that should be generally employed by application-layer programmers, but having the mechanism in place for specialized low-level programming would be beneficial. The only real cases I can think of are BaseException cases, where we need to interrupt a Task for some reason, not to inject random IOErrors.

I myself am also not sure that an async version of cancel is a patent solution to race conditions. It is just something I'm suggesting based on experience in my old stackless life, I recall that it was simpler to reason about task state by knowing that no other tasks would get in the way once you had decided what state change to perform.

The way I implement asynkit.task_switch() is by direct manipulation of the _ready queue. I find the target task in the queue, move it to the front, and perform a asyncio.sleep(0). (I've been experimenting a lot with custom scheduling algorithms and would like to see asyncio provide better hooks for that than this, but that's a different topic). This is only possible for a task already on the _ready queue. I'm not sure (I haven't checked) how cancel() works for a task which is on the _ready queue, if it can override the result from the future or if it simply sets a pending cancel flag (probably) which will trigger the exception on the next context switch.

Thinking about this, there is no way to synchronously and non-destructively send an exception to a runnable task, because that would overwrite the result that has already been delivered to it (e.g. a read() result) and then, the task wouldn't be able to, for example, re-try. Data has been lost. (stackless could do it, but it was a deliberately low level. someone sending an exception was supposed to know what he was doing, and also could query the state of the target tasklet)
So, there are two cases:

  • the target is still waiting for the Future
  • The target has finished waiting for the Future and is on the _ready queue to be activated.

In the first case, it is possible to immediately deliver an exception, put the task on the ready _queue, and optionally make it the next to run (pausing the caller, i.e. switching to it).
In the second case, an exception cannot be delivered until the next time the task awaits a future, so, pending cancels do seem necessary. For Timeouts in particular, which guard a block, a timeout must still be delivered to the task...

(asynkit has task_is_runnable() and task_is_blocked() helpers to figure those things out :))

I guess I need to dig better into Task.cancel() and how it works to come up with specific scenarios. I'll imagine that I'm recreating the Timeout manager from scratch...

@dimaqq
Copy link
Contributor

dimaqq commented Mar 30, 2023

User / non-core developer view, please take with a grain of salt.

~

Immediate switch to target task is weird, what if someone does gather(a.araise(), b.araise())?

I think being able to "[async-atomically] raise in a task and await that task" is a sufficient guarantee.

Ultimately, it's always possible for yet another .cancel() or .araise() to be called from yet another task, as the original use race condition was hit when the target task had async exception handling block. If a task chooses to ignore cancellation (or a custom exception), I think that ultimately all bets are off, yet async cleanup is a real use case.

~

What I'm thinking is that CancelledError should not have been naively co-opted by timeout and task_group context manager implementations.

Would it be enough to pass a specific instance of CancelledError instead? The context manager can then catch CancelledError and check if it's the manager's instance.

I think that would work in the case of nested timeout context managers, where distinguishing "own" exception by type would not be sufficient.

cc @ojii

@kristjanvalur
Copy link
Contributor Author

kristjanvalur commented Apr 3, 2023

So, I took a look at task.cancel() and realized how I was confused about how it is working.

As background information, where I'm (originally) coming from: In stackless, we would use channels for most of tasklet synchronization. Including IO, where the requested data would be "sent" to the tasklet. It could then either be immediately switched to, or, more commonly, put on the ready queue, with the "result" stored in a "tempval" on the tasklet object itself. If, at this time, someone decided to send an exception to it, it would lose that tempval, to be replaced with the exception object. This made timeouts a bit tricky, basically, we had to make sure to not timeout a tasklet which was runnable. postpone the timeout, as it were.

Anyway: In CPython, cancels are done as follows (ignoring details)

  1. Any pending future is cancelled.
  2. The target task is now on the ready queue (or in asyncio parlance, __step is scheduled).
  3. A boolean is set on the task.
  4. Some time passes. Other tasks run, maybe. Someone else may call cancel() again!
  5. __step will send the exception into the paused coroutine, instead of a None.
  6. The task wakes up with an exception. Any pending data is still pending an it can retry.

Now, the above is all fine and good if all we want to do is send CancelledError exceptions. But IMHO, step 4 is troublesome. We are interacting directly with a paused Task, expecting to know something about its state, yet the results aren't immediate. If we allow different kinds of exceptions, or different exception instances, to be sent in this interrim period, everything becomes complicated.

Already with asynkit, we can do task.canccel(); await asynkit.task_switch(task) and be certain that nothing will come between us sending the cancel message, and the task getting the exception.

I realize that the case for a "ready queue" and "task scheduling" is a bit alien to asyncio, since it is designed around scheduling abstract callbacks, not tasks. But immediate task switching has the benefit that one can reason about program logic and is one of the benefits of cooperative multi-tasking.

So, in short, I think that if we want to send arbitrary exceptions to tasks (as an advanced tool) it would be simplest to do so via some api like this:

async def araise(self, exception: BaseException)->None:
   """
   Arrange for `exception` to be sent to the task.  The target task
   is immediately scheduled.  The target task must belong to the
   same event loop as the caller, otherwise a RuntimeError is raised.
   """

...

Immediate switch to target task is weird, what if someone does gather(a.araise(), b.araise())

Both of these will be wrapped in their own tasks and executed. Nothing interesting will happen.

There is nothing inherently weird about immediate delivery. Think of it more like async def close(). It cancels the Task, and is also a yield point. In addition, rather than knowing the task cancels sometime in the future, it will be the next task to run. That could normally happen on its own, only here, we would guarantee it. So, it is like task.cancel(); asyncio.sleep(0), only with that extra guarantee about when task wakes up.

I think that would work in the case of nested timeout context managers, where distinguishing "own" exception by type would not be sufficient.

Indeed, this is exactly how I think such things should be implemented. In fact, I would suggest a different BaseException, such as a BaseTimeoutError, to be used for timeout delivery.

The problem is, in the absence of immediate delivery of an exception, that there are all kinds of complicated race situations possible. What if the CancelledError, or BaseTimeoutError, has been scheduled, but then someone else schedules another exception? Which one wins? One of the two instances must be discarded. Can we even make sense out of such behaviour?

This is why the current cancel() implementation has a cancel counter. And why there is a separate "message" present with the cancel exception. With immediate exception delivery, there never is a chance for this to happen.

@gvanrossum
Copy link
Member

It sounds to me that it isn't worth changing cancel semantics at this point, even if your way is better -- existing code depending on the current semantics would break and there's already enough of an ecosystem around asyncio that I'm not comfortable with that.

Maybe you'll find a more sympathetic ear in the Trio community?

@dimaqq
Copy link
Contributor

dimaqq commented Apr 4, 2023

I was going to suggest [Base]ExceptionGroup to accumulate several cancellations while a task is blocked/pending.

However BaseExceptionGroup(..., [CancelledError("a"), CancelledError("b"), ...]) would be a breaking change.

The next step would be class Ccc(CancelledError, BaseExceptionGroup): pass shim which is messy.

Maybe reparenting CancelledError under BaseExceptionGroup is worth consideration?

P.S. IMHO, the immediate semantics are orthogonal to the issue of multiple exceptions pending for a task.

@kristjanvalur
Copy link
Contributor Author

It sounds to me that it isn't worth changing cancel semantics at this point, even if your way is better -

Yes, this is "Realpolitik" in action :).

I just also realized a few additional points that I think I should note here, for reference:

  • that waiting for delivery would also resolve the issue that cancelling a Task that hasn't started yet will not allow it to start at all... Which I believe was raised somewhere.
  • Such a "synchronous" exception delivery would mean that we wouldn't propagate the cancel to an awaited sub-task. I remember when I ran across that behaviour ,and the shield() construct, for the first time.
  • Instead of immediate switching, we could avoid multiple tasks sending exceptions to a task via locking. A task could be locked for exception delivery until it has woken up and received the signal. While messier and requiring more primitives on the task, it would not require any changes/enhancements to scheduling.

P.S. IMHO, the immediate semantics are orthogonal to the issue of multiple exceptions pending for a task.

Not really, no, because the immediate semantics mean that there can be no multiple pending exceptions. If signal delivery is immediate, it is no longer a problem.

This is the only place I know of, in asyncio, where it is impossible to provide synchronization between two tasks communicating (because cancelling a task, or sending it an exception, is that.).

Anyway, Maybe I can cook up a test implementation. If not for cancel, then for any other exception. Perhaps it is even doable via "asynkit" although it would require deep patching of asyncio.

I'll close this for now, and thank you for your thougts :)

@kristjanvalur
Copy link
Contributor Author

kristjanvalur commented Oct 14, 2023

Oh, I now have an experimental implementation of this in asynkit.

It is experimental because there are corner cases which I cannot handle for C-implemented Task objects. Python-implemented Tasks work.

It has taken this long because asynkit is a hobby project, and I wanted to clean up other aspects first, foremost my scheduling extensions and not having to rely on special subclassing of the event loop to accomplish those.

Basically, a task_throw() and task_interupt() are implemented. They specifically don't interfere with Task.cancel(), since cancel has those special cancellation semantics of cancelling a whole stack of running Tasks and is impossible to do in an atomic way. But an exception can be delivered atomically and proves to be useful, at least for implementing timeout. There may be other use cases.

Of course, I'd like to see my scheduling extenstions to be considered upstream, particularly ways to modify the event loop queue, look up tasks in the queue, and so on. I have ideas about how that could be done better. My AbstractSchedulingLoop interface is an example of that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-asyncio type-feature A feature request or enhancement
Projects
Status: Done
Development

No branches or pull requests

4 participants