Issue
I was wondering what exactly happens when we await a coroutine in async Python code, for example:

await send_message(string)

(1) `send_message` is added to the event loop, and the calling coroutine gives up control to the event loop, or

(2) We jump directly into `send_message`?

Most explanations I read point to (1), as they describe the calling coroutine as exiting. But my own experiments suggest (2) is the case: I tried to have a coroutine run after the caller but before the callee and could not achieve this.
Solution
No, `await` (per se) does not yield to the event loop; `yield` yields to the event loop. Hence, for the case given: "(2) We jump directly into `send_message`". In particular, certain `yield` expressions are the only points, at bottom, where async tasks can actually be switched out (in terms of nailing down the precise spot where Python code execution can be suspended).
To be proven and demonstrated: 1) by theory/documentation, 2) by implementation code, 3) by example.
By theory/documentation
PEP 492: Coroutines with async and await syntax

> While the PEP is not tied to any specific Event Loop implementation, it is relevant only to the kind of coroutine that uses `yield` as a signal to the scheduler, indicating that the coroutine will be waiting until an event (such as IO) is completed. ... [`await`] uses the `yield from` implementation [with an extra step of validating its argument.] ... Any `yield from` chain of calls ends with a `yield`. This is a fundamental mechanism of how `Future`s are implemented. Since, internally, coroutines are a special kind of generators, every `await` is suspended by a `yield` somewhere down the chain of `await` calls (please refer to PEP 3156 for a detailed explanation). ... Coroutines are based on generators internally, thus they share the implementation. Similarly to generator objects, coroutines have `throw()`, `send()` and `close()` methods. ... The vision behind existing generator-based coroutines and this proposal is to make it easy for users to see where the code might be suspended.
In context, "easy for users to see where the code might be suspended" seems to refer to the fact that in synchronous code `yield` is the place where execution can be "suspended" within a routine, allowing other code to run; and that principle now extends perfectly to the async context, wherein a `yield` (if its value is not consumed within the running task but is propagated up to the scheduler) is the "signal to the scheduler" to switch out tasks.

More succinctly: where does a generator yield control? At a `yield`. Coroutines (including those using `async` and `await` syntax) are generators, hence likewise.

And it is not merely an analogy: in implementation (see below) the actual mechanism by which a task gets "into" and "out of" coroutines is not anything new, magical, or unique to the async world, but simply a call to the coro's `<generator>.send()` method. That was (as I understand the text) part of the "vision" behind PEP 492: `async` and `await` would provide no novel mechanism for code suspension but just pour async-sugar on Python's already well-beloved and powerful generators.
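This shared implementation is easy to observe directly: an `async def` coroutine object carries the generator protocol's `send()`, `throw()`, and `close()` methods and can be driven to completion by hand, with no event loop anywhere. A minimal sketch:

```python
async def coro():
    return 42

c = coro()
# Coroutine objects expose the generator protocol:
print(hasattr(c, "send"), hasattr(c, "throw"), hasattr(c, "close"))  # True True True

try:
    c.send(None)          # drive it exactly as you would a generator
except StopIteration as e:
    result = e.value      # the return value rides on StopIteration
print(result)             # 42
```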
And PEP 3156: The "asyncio" module

> The loop.slow_callback_duration attribute controls the maximum execution time allowed between two yield points before a slow callback is reported [emphasis in original].

That is, an uninterrupted segment of code (from the async perspective) is demarcated as that between two successive `yield` points (whose values reached up to the running `Task` level, via an `await`/`yield from` tunnel, without being consumed within it).
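This "between two yield points" framing can be watched in action with debug mode. In the following sketch (the names `main`, `records`, and the chosen thresholds are mine, not from the quoted PEP), the coroutine blocks synchronously and so never reaches a yield point; asyncio's debug machinery therefore reports the entire stretch as one slow callback:

```python
import asyncio
import logging
import time

# Capture asyncio's log messages so we can inspect them.
records = []
handler = logging.Handler()
handler.emit = lambda record: records.append(record.getMessage())
logging.getLogger("asyncio").addHandler(handler)

async def main():
    loop = asyncio.get_running_loop()
    loop.slow_callback_duration = 0.1  # flag any gap between yield points > 100 ms
    time.sleep(0.2)  # synchronous block: no yield point for the whole 0.2 s

asyncio.run(main(), debug=True)
# In debug mode the loop logs "Executing <...> took 0.2xx seconds":
print(any("took" in m for m in records))
```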
And this:
> The scheduler has no public interface. You interact with it by using `yield from future` and `yield from task`.
Objection: "That says '`yield from`', but you're trying to argue that the task can only switch out at a `yield` itself! `yield from` and `yield` are different things, my friend, and `yield from` itself doesn't suspend code!"

Ans: Not a contradiction. The PEP is saying you interact with the scheduler by using `yield from future/task`. But as noted above in PEP 492, any chain of `yield from` (~aka `await`) ultimately reaches a `yield` (the "bottom turtle"). In particular (see below), `yield from future` does in fact `yield` that same future after some wrapper work, and that `yield` is the actual "switch-out point" where another task takes over. But it is incorrect for your code to directly `yield` a `Future` up to the current `Task`, because you would bypass the necessary wrapper.

The objection having been answered, and its practical coding considerations noted, the point I wish to make from the above quote remains: a suitable `yield` in Python async code is ultimately the one thing which, having suspended code execution in the standard way that any other `yield` would, further engages the scheduler to bring about a possible task switch.
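The "you would bypass the necessary wrapper" point can be demonstrated. In this sketch (the names `bad_yield` and `main` are mine), a generator-based coroutine yields a `Future` directly, skipping the `_asyncio_future_blocking` bookkeeping that `Future.__await__` performs; the running `Task` notices and throws a `RuntimeError` back into the coroutine:

```python
import asyncio
import types

@types.coroutine
def bad_yield():
    fut = asyncio.get_running_loop().create_future()
    yield fut  # reaches the Task directly, without Future.__await__'s bookkeeping

async def main():
    try:
        await bad_yield()
        return "no error"
    except RuntimeError as e:
        return str(e)

msg = asyncio.run(main())
print(msg)  # "yield was used instead of yield from in task ... with ..."
```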
By implementation code
class Future:
    ...
    def __await__(self):
        if not self.done():
            self._asyncio_future_blocking = True
            yield self  # This tells Task to wait for completion.
        if not self.done():
            raise RuntimeError("await wasn't used with future")
        return self.result()  # May raise too.

    __iter__ = __await__  # make compatible with 'yield from'.
Paraphrase: The line `yield self` is what tells the running task to sit out for now and let other tasks run, coming back to this one sometime after `self` is done.
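A small sketch of this in action (the names `main` and `res` are mine): we `await` a bare `Future`, the task parks at the `yield self` inside `__await__`, and a plain callback scheduled on the loop later sets the result and wakes the task:

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    # A plain callback for a later loop iteration; until it runs, this task
    # is suspended at the `yield self` inside Future.__await__.
    loop.call_soon(fut.set_result, "ready")
    return await fut

res = asyncio.run(main())
print(res)  # ready
```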
Almost all of your awaitables in the `asyncio` world are (multiple layers of) wrappers around a `Future`. The event loop remains utterly blind to all higher-level `await awaitable` expressions until code execution trickles down to an `await future` or `yield from future` and (as seen here) executes `yield self`; that yielded `self` is then "caught" by none other than the `Task` under which the present coroutine stack is running, thereby signaling to the task to take a break.

Possibly the one and only exception to the above "code suspends at `yield self` within `await future`" rule, in an `asyncio` context, is the potential use of a bare `yield`, such as in `asyncio.sleep(0)`. And since the `sleep` function is a topic of discourse in the comments of this post, let's look at that.
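A quick sketch first (the names `other`, `main`, and `order` are mine) showing that the bare `yield` underneath `await asyncio.sleep(0)` is enough to hand a pending task exactly one turn:

```python
import asyncio

order = []

async def other():
    order.append("other task ran")

async def main():
    asyncio.create_task(other())      # scheduled, but not yet running
    order.append("before sleep(0)")
    await asyncio.sleep(0)            # bare yield: relinquish one loop iteration
    order.append("after sleep(0)")

asyncio.run(main())
print(order)  # ['before sleep(0)', 'other task ran', 'after sleep(0)']
```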
@types.coroutine
def __sleep0():
    """Skip one event loop run cycle.

    This is a private helper for 'asyncio.sleep()', used
    when the 'delay' is set to 0. It uses a bare 'yield'
    expression (which Task.__step knows how to handle)
    instead of creating a Future object.
    """
    yield


async def sleep(delay, result=None, *, loop=None):
    """Coroutine that completes after a given time (in seconds)."""
    if delay <= 0:
        await __sleep0()
        return result

    if loop is None:
        loop = events.get_running_loop()
    else:
        warnings.warn("The loop argument is deprecated since Python 3.8, "
                      "and scheduled for removal in Python 3.10.",
                      DeprecationWarning, stacklevel=2)

    future = loop.create_future()
    h = loop.call_later(delay,
                        futures._set_result_unless_cancelled,
                        future, result)
    try:
        return await future
    finally:
        h.cancel()
Note: We have here the two interesting cases at which control can shift to the scheduler:

(1) The bare `yield` in `__sleep0` (when called via an `await`).

(2) The `yield self` immediately within `await future`.
The crucial line (for our purposes) in asyncio/tasks.py is where `Task.__step` runs its top-level coroutine via `result = self._coro.send(None)` and recognizes four-ish cases:

(1) `result = None` is generated by the coro (which, again, is a generator): the task "relinquishes control for one event loop iteration".

(2) `result = future` is generated within the coro, with further magic member-field evidence that the future was yielded in a proper manner from out of `Future.__iter__ == Future.__await__`: the task relinquishes control to the event loop until the future is complete.

(3) A `StopIteration` is raised by the coro, indicating the coroutine completed (i.e. as a generator it exhausted all its `yield`s): the final result of the task (which is itself a `Future`) is set to the coroutine return value.

(4) Any other `Exception` occurs: the task's `set_exception` is set accordingly.

Modulo details, the main point for our concern is that coroutine segments in an `asyncio` event loop ultimately run via `coro.send()`. Initial startup and final termination aside, `send()` proceeds precisely from the last `yield` value it generated to the next one.
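That dispatch can be mimicked outside asyncio entirely. This toy driver (my own sketch, not asyncio code) handles case (1) and case (3) by hand:

```python
import types

@types.coroutine
def pause():
    yield  # case (1): the coro generates None

async def job():
    await pause()
    return "done"  # case (3): StopIteration will carry this value

coro = job()
steps = []
while True:
    try:
        # This is essentially how Task.__step drives its coroutine.
        steps.append(coro.send(None))  # None -> "relinquish for one iteration"
    except StopIteration as e:
        steps.append(e.value)  # the task's own Future would be set from e.value
        break
print(steps)  # [None, 'done']
```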
By example
import asyncio
import types


def task_print(s):
    print(f"{asyncio.current_task().get_name()}: {s}")


async def other_task(s):
    task_print(s)


class AwaitableCls:
    def __await__(self):
        task_print(" 'Jumped straight into' another `await`; the act of `await awaitable` *itself* doesn't 'pause' anything")
        yield
        task_print(" We're back to our awaitable object because that other task completed")
        asyncio.create_task(other_task("The event loop gets control when `yield` points (from an iterable coroutine) propagate up to the `current_task` through a suitable chain of `await` or `yield from` statements"))


async def coro():
    task_print(" 'Jumped straight into' coro; the `await` keyword itself does nothing to 'pause' the current_task")
    await AwaitableCls()
    task_print(" 'Jumped straight back into' coro; we have another pending task, but leaving an `__await__` doesn't 'pause' the task any more than entering the `__await__` does")


@types.coroutine
def iterable_coro(context):
    task_print(f"`{context} iterable_coro`: pre-yield")
    yield None  # None or a Future object are the only legitimate yields to the task in asyncio
    task_print(f"`{context} iterable_coro`: post-yield")


async def original_task():
    asyncio.create_task(other_task("Aha, but a (suitably unconsumed) *`yield`* DOES 'pause' the current_task allowing the event scheduler to `_wakeup` another task"))
    task_print("Original task")
    await coro()
    task_print("'Jumped straight out of' coro. Leaving a coro, as with leaving/entering any awaitable, doesn't give control to the event loop")
    res = await iterable_coro("await")
    assert res is None
    asyncio.create_task(other_task("This doesn't run until the very end because the generated None following the creation of this task is consumed by the `for` loop"))
    for y in iterable_coro("for y in"):
        task_print(f"But 'ordinary' `yield` points (those which are consumed by the `current_task` itself) behave as ordinary without relinquishing control at the async/task-level; `y={y}`")
    task_print("Done with original task")


asyncio.get_event_loop().run_until_complete(original_task())
Run in Python 3.8, this produces:

Task-1: Original task
Task-1:  'Jumped straight into' coro; the `await` keyword itself does nothing to 'pause' the current_task
Task-1:  'Jumped straight into' another `await`; the act of `await awaitable` *itself* doesn't 'pause' anything
Task-2: Aha, but a (suitably unconsumed) *`yield`* DOES 'pause' the current_task allowing the event scheduler to `_wakeup` another task
Task-1:  We're back to our awaitable object because that other task completed
Task-1:  'Jumped straight back into' coro; we have another pending task, but leaving an `__await__` doesn't 'pause' the task any more than entering the `__await__` does
Task-1: 'Jumped straight out of' coro. Leaving a coro, as with leaving/entering any awaitable, doesn't give control to the event loop
Task-1: `await iterable_coro`: pre-yield
Task-3: The event loop gets control when `yield` points (from an iterable coroutine) propagate up to the `current_task` through a suitable chain of `await` or `yield from` statements
Task-1: `await iterable_coro`: post-yield
Task-1: `for y in iterable_coro`: pre-yield
Task-1: But 'ordinary' `yield` points (those which are consumed by the `current_task` itself) behave as ordinary without relinquishing control at the async/task-level; `y=None`
Task-1: `for y in iterable_coro`: post-yield
Task-1: Done with original task
Task-4: This doesn't run until the very end because the generated None following the creation of this task is consumed by the `for` loop
Indeed, exercises such as the following can help one's mind decouple the functionality of `async`/`await` from the notion of "event loops" and such. The former is conducive to nice implementations and usages of the latter, but you can use `async` and `await` as just specially syntaxed generator stuff without any "loop" (whether `asyncio` or otherwise) whatsoever:
import types  # no asyncio, nor any other loop framework


async def f1():
    print(1)
    print(await f2(), '= await f2()')
    return 8


@types.coroutine
def f2():
    print(2)
    print((yield 3), '= yield 3')
    return 7


class F3:
    def __await__(self):
        print(4)
        print((yield 5), '= yield 5')
        print(10)
        return 11


task1 = f1()
task2 = F3().__await__()

""" You could say calls to send() represent our
"manual task management" in this script.
"""
print(task1.send(None), '= task1.send(None)')
print(task2.send(None), '= task2.send(None)')
try:
    print(task1.send(6), 'try task1.send(6)')
except StopIteration as e:
    print(e.value, '= except task1.send(6)')
try:
    print(task2.send(9), 'try task2.send(9)')
except StopIteration as e:
    print(e.value, '= except task2.send(9)')
produces
1
2
3 = task1.send(None)
4
5 = task2.send(None)
6 = yield 3
7 = await f2()
8 = except task1.send(6)
9 = yield 5
10
11 = except task2.send(9)
Answered By - Zach Harris