Issue
I was wondering what exactly happens when we await a coroutine in async Python code, for example:

await send_message(string)

(1) `send_message` is added to the event loop, and the calling coroutine gives up control to the event loop, or

(2) We jump directly into `send_message`?

Most explanations I read point to (1), as they describe the calling coroutine as exiting. But my own experiments suggest (2) is the case: I tried to have a coroutine run after the caller but before the callee and could not achieve this.
Solution
No, `await` (per se) does not yield to the event loop; `yield` yields to the event loop. Hence, for the case given: "(2) We jump directly into `send_message`". In particular, certain `yield` expressions are the only points, at bottom, where async tasks can actually be switched out (in terms of nailing down the precise spot where Python code execution can be suspended).
To be proven and demonstrated: 1) by theory/documentation, 2) by implementation code, 3) by example.
By theory/documentation
PEP 492: Coroutines with async and await syntax

> While the PEP is not tied to any specific Event Loop implementation, it is relevant only to the kind of coroutine that uses `yield` as a signal to the scheduler, indicating that the coroutine will be waiting until an event (such as IO) is completed. ... [`await`] uses the `yield from` implementation [with an extra step of validating its argument.] ... Any `yield from` chain of calls ends with a `yield`. This is a fundamental mechanism of how `Future`s are implemented. Since, internally, coroutines are a special kind of generators, every `await` is suspended by a `yield` somewhere down the chain of `await` calls (please refer to PEP 3156 for a detailed explanation). ... Coroutines are based on generators internally, thus they share the implementation. Similarly to generator objects, coroutines have `throw()`, `send()` and `close()` methods. ... The vision behind existing generator-based coroutines and this proposal is to make it easy for users to see where the code might be suspended.
In context, "easy for users to see where the code might be suspended" seems to refer to the fact that in synchronous code `yield` is the place where execution can be "suspended" within a routine, allowing other code to run; and that principle now extends perfectly to the async context, wherein a `yield` (if its value is not consumed within the running task but is propagated up to the scheduler) is the "signal to the scheduler" to switch out tasks.

More succinctly: where does a generator yield control? At a `yield`. Coroutines (including those using `async` and `await` syntax) are generators, hence likewise.

And it is not merely an analogy: in implementation (see below) the actual mechanism by which a task gets "into" and "out of" coroutines is not anything new, magical, or unique to the async world, but simply a call to the coro's `<generator>.send()` method. That was (as I understand the text) part of the "vision" behind PEP 492: `async` and `await` would provide no novel mechanism for code suspension but just pour async-sugar on Python's already well-beloved and powerful generators.
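This shared implementation is easy to observe directly: an `async def` coroutine object carries the generator protocol's `send()`, `throw()`, and `close()` methods and can be driven to completion by hand, with no event loop anywhere. A minimal sketch:

```python
async def coro():
    return 42

c = coro()
# Coroutine objects expose the generator protocol:
print(hasattr(c, "send"), hasattr(c, "throw"), hasattr(c, "close"))  # True True True

try:
    c.send(None)          # drive it exactly as you would a generator
except StopIteration as e:
    result = e.value      # the return value rides on StopIteration
print(result)             # 42
```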
And PEP 3156: The "asyncio" module

> The loop.slow_callback_duration attribute controls the maximum execution time allowed between two yield points before a slow callback is reported [emphasis in original].

That is, an uninterrupted segment of code (from the async perspective) is demarcated as that between two successive `yield` points (whose values reached up to the running `Task` level, via an `await`/`yield from` tunnel, without being consumed within it).
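This "between two yield points" framing can be watched in action with debug mode. In the following sketch (the names `main`, `records`, and the chosen thresholds are mine, not from the quoted PEP), the coroutine blocks synchronously and so never reaches a yield point; asyncio's debug machinery therefore reports the entire stretch as one slow callback:

```python
import asyncio
import logging
import time

# Capture asyncio's log messages so we can inspect them.
records = []
handler = logging.Handler()
handler.emit = lambda record: records.append(record.getMessage())
logging.getLogger("asyncio").addHandler(handler)

async def main():
    loop = asyncio.get_running_loop()
    loop.slow_callback_duration = 0.1  # flag any gap between yield points > 100 ms
    time.sleep(0.2)  # synchronous block: no yield point for the whole 0.2 s

asyncio.run(main(), debug=True)
# In debug mode the loop logs "Executing <...> took 0.2xx seconds":
print(any("took" in m for m in records))
```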
And this:
> The scheduler has no public interface. You interact with it by using `yield from future` and `yield from task`.
Objection: "That says '`yield from`', but you're trying to argue that the task can only switch out at a `yield` itself! `yield from` and `yield` are different things, my friend, and `yield from` itself doesn't suspend code!"

Ans: Not a contradiction. The PEP is saying you interact with the scheduler by using `yield from future/task`. But as noted above in PEP 492, any chain of `yield from` (~aka `await`) ultimately reaches a `yield` (the "bottom turtle"). In particular (see below), `yield from future` does in fact `yield` that same future after some wrapper work, and that `yield` is the actual "switch-out point" where another task takes over. But it is incorrect for your code to directly `yield` a `Future` up to the current `Task`, because you would bypass the necessary wrapper.

The objection having been answered, and its practical coding considerations noted, the point I wish to make from the above quote remains: a suitable `yield` in Python async code is ultimately the one thing which, having suspended code execution in the standard way that any other `yield` would, further engages the scheduler to bring about a possible task switch.
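The "you would bypass the necessary wrapper" point can be demonstrated. In this sketch (the names `bad_yield` and `main` are mine), a generator-based coroutine yields a `Future` directly, skipping the `_asyncio_future_blocking` bookkeeping that `Future.__await__` performs; the running `Task` notices and throws a `RuntimeError` back into the coroutine:

```python
import asyncio
import types

@types.coroutine
def bad_yield():
    fut = asyncio.get_running_loop().create_future()
    yield fut  # reaches the Task directly, without Future.__await__'s bookkeeping

async def main():
    try:
        await bad_yield()
        return "no error"
    except RuntimeError as e:
        return str(e)

msg = asyncio.run(main())
print(msg)  # "yield was used instead of yield from in task ... with ..."
```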
By implementation code
class Future:
    ...
    def __await__(self):
        if not self.done():
            self._asyncio_future_blocking = True
            yield self  # This tells Task to wait for completion.
        if not self.done():
            raise RuntimeError("await wasn't used with future")
        return self.result()  # May raise too.

    __iter__ = __await__  # make compatible with 'yield from'.
Paraphrase: The line `yield self` is what tells the running task to sit out for now and let other tasks run, coming back to this one sometime after `self` is done.
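A small sketch of this in action (the names `main` and `res` are mine): we `await` a bare `Future`, the task parks at the `yield self` inside `__await__`, and a plain callback scheduled on the loop later sets the result and wakes the task:

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    # A plain callback for a later loop iteration; until it runs, this task
    # is suspended at the `yield self` inside Future.__await__.
    loop.call_soon(fut.set_result, "ready")
    return await fut

res = asyncio.run(main())
print(res)  # ready
```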
Almost all of your awaitables in the `asyncio` world are (multiple layers of) wrappers around a `Future`. The event loop remains utterly blind to all higher-level `await awaitable` expressions until code execution trickles down to an `await future` or `yield from future` and (as seen here) executes `yield self`; that yielded `self` is then "caught" by none other than the `Task` under which the present coroutine stack is running, thereby signaling to the task to take a break.

Possibly the one and only exception to the above "code suspends at `yield self` within `await future`" rule, in an `asyncio` context, is the potential use of a bare `yield`, such as in `asyncio.sleep(0)`. And since the `sleep` function is a topic of discourse in the comments of this post, let's look at that.
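A quick sketch first (the names `other`, `main`, and `order` are mine) showing that the bare `yield` underneath `await asyncio.sleep(0)` is enough to hand a pending task exactly one turn:

```python
import asyncio

order = []

async def other():
    order.append("other task ran")

async def main():
    asyncio.create_task(other())      # scheduled, but not yet running
    order.append("before sleep(0)")
    await asyncio.sleep(0)            # bare yield: relinquish one loop iteration
    order.append("after sleep(0)")

asyncio.run(main())
print(order)  # ['before sleep(0)', 'other task ran', 'after sleep(0)']
```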
@types.coroutine
def __sleep0():
    """Skip one event loop run cycle.

    This is a private helper for 'asyncio.sleep()', used
    when the 'delay' is set to 0. It uses a bare 'yield'
    expression (which Task.__step knows how to handle)
    instead of creating a Future object.
    """
    yield


async def sleep(delay, result=None, *, loop=None):
    """Coroutine that completes after a given time (in seconds)."""
    if delay <= 0:
        await __sleep0()
        return result

    if loop is None:
        loop = events.get_running_loop()
    else:
        warnings.warn("The loop argument is deprecated since Python 3.8, "
                      "and scheduled for removal in Python 3.10.",
                      DeprecationWarning, stacklevel=2)

    future = loop.create_future()
    h = loop.call_later(delay,
                        futures._set_result_unless_cancelled,
                        future, result)
    try:
        return await future
    finally:
        h.cancel()
Note: We have here the two interesting cases at which control can shift to the scheduler:

(1) The bare `yield` in `__sleep0` (when called via an `await`).

(2) The `yield self` immediately within `await future`.
The crucial line (for our purposes) in asyncio/tasks.py is where `Task.__step` runs its top-level coroutine via `result = self._coro.send(None)` and recognizes four-ish cases:

(1) `result = None` is generated by the coro (which, again, is a generator): the task "relinquishes control for one event loop iteration".

(2) `result = future` is generated within the coro, with further magic member-field evidence that the future was yielded in a proper manner from out of `Future.__iter__ == Future.__await__`: the task relinquishes control to the event loop until the future is complete.

(3) A `StopIteration` is raised by the coro, indicating the coroutine completed (i.e. as a generator it exhausted all its `yield`s): the final result of the task (which is itself a `Future`) is set to the coroutine return value.

(4) Any other `Exception` occurs: the task's `set_exception` is set accordingly.

Modulo details, the main point for our concern is that coroutine segments in an `asyncio` event loop ultimately run via `coro.send()`. Initial startup and final termination aside, `send()` proceeds precisely from the last `yield` value it generated to the next one.
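That dispatch can be mimicked outside asyncio entirely. This toy driver (my own sketch, not asyncio code) handles case (1) and case (3) by hand:

```python
import types

@types.coroutine
def pause():
    yield  # case (1): the coro generates None

async def job():
    await pause()
    return "done"  # case (3): StopIteration will carry this value

coro = job()
steps = []
while True:
    try:
        # This is essentially how Task.__step drives its coroutine.
        steps.append(coro.send(None))  # None -> "relinquish for one iteration"
    except StopIteration as e:
        steps.append(e.value)  # the task's own Future would be set from e.value
        break
print(steps)  # [None, 'done']
```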
By example
import asyncio
import types


def task_print(s):
    print(f"{asyncio.current_task().get_name()}: {s}")


async def other_task(s):
    task_print(s)


class AwaitableCls:
    def __await__(self):
        task_print(" 'Jumped straight into' another `await`; the act of `await awaitable` *itself* doesn't 'pause' anything")
        yield
        task_print(" We're back to our awaitable object because that other task completed")
        asyncio.create_task(other_task("The event loop gets control when `yield` points (from an iterable coroutine) propagate up to the `current_task` through a suitable chain of `await` or `yield from` statements"))


async def coro():
    task_print(" 'Jumped straight into' coro; the `await` keyword itself does nothing to 'pause' the current_task")
    await AwaitableCls()
    task_print(" 'Jumped straight back into' coro; we have another pending task, but leaving an `__await__` doesn't 'pause' the task any more than entering the `__await__` does")


@types.coroutine
def iterable_coro(context):
    task_print(f"`{context} iterable_coro`: pre-yield")
    yield None  # None or a Future object are the only legitimate yields to the task in asyncio
    task_print(f"`{context} iterable_coro`: post-yield")


async def original_task():
    asyncio.create_task(other_task("Aha, but a (suitably unconsumed) *`yield`* DOES 'pause' the current_task allowing the event scheduler to `_wakeup` another task"))
    task_print("Original task")
    await coro()
    task_print("'Jumped straight out of' coro. Leaving a coro, as with leaving/entering any awaitable, doesn't give control to the event loop")
    res = await iterable_coro("await")
    assert res is None
    asyncio.create_task(other_task("This doesn't run until the very end because the generated None following the creation of this task is consumed by the `for` loop"))
    for y in iterable_coro("for y in"):
        task_print(f"But 'ordinary' `yield` points (those which are consumed by the `current_task` itself) behave as ordinary without relinquishing control at the async/task-level; `y={y}`")
    task_print("Done with original task")


asyncio.get_event_loop().run_until_complete(original_task())
Run in Python 3.8, this produces:

Task-1: Original task
Task-1:  'Jumped straight into' coro; the `await` keyword itself does nothing to 'pause' the current_task
Task-1:  'Jumped straight into' another `await`; the act of `await awaitable` *itself* doesn't 'pause' anything
Task-2: Aha, but a (suitably unconsumed) *`yield`* DOES 'pause' the current_task allowing the event scheduler to `_wakeup` another task
Task-1:  We're back to our awaitable object because that other task completed
Task-1:  'Jumped straight back into' coro; we have another pending task, but leaving an `__await__` doesn't 'pause' the task any more than entering the `__await__` does
Task-1: 'Jumped straight out of' coro. Leaving a coro, as with leaving/entering any awaitable, doesn't give control to the event loop
Task-1: `await iterable_coro`: pre-yield
Task-3: The event loop gets control when `yield` points (from an iterable coroutine) propagate up to the `current_task` through a suitable chain of `await` or `yield from` statements
Task-1: `await iterable_coro`: post-yield
Task-1: `for y in iterable_coro`: pre-yield
Task-1: But 'ordinary' `yield` points (those which are consumed by the `current_task` itself) behave as ordinary without relinquishing control at the async/task-level; `y=None`
Task-1: `for y in iterable_coro`: post-yield
Task-1: Done with original task
Task-4: This doesn't run until the very end because the generated None following the creation of this task is consumed by the `for` loop
Indeed, exercises such as the following can help one's mind decouple the functionality of `async`/`await` from the notion of "event loops" and such. The former is conducive to nice implementations and usages of the latter, but you can use `async` and `await` as just specially syntaxed generator stuff without any "loop" (whether `asyncio` or otherwise) whatsoever:
import types  # no asyncio, nor any other loop framework


async def f1():
    print(1)
    print(await f2(), '= await f2()')
    return 8


@types.coroutine
def f2():
    print(2)
    print((yield 3), '= yield 3')
    return 7


class F3:
    def __await__(self):
        print(4)
        print((yield 5), '= yield 5')
        print(10)
        return 11


task1 = f1()
task2 = F3().__await__()

""" You could say calls to send() represent our
"manual task management" in this script.
"""
print(task1.send(None), '= task1.send(None)')
print(task2.send(None), '= task2.send(None)')
try:
    print(task1.send(6), 'try task1.send(6)')
except StopIteration as e:
    print(e.value, '= except task1.send(6)')
try:
    print(task2.send(9), 'try task2.send(9)')
except StopIteration as e:
    print(e.value, '= except task2.send(9)')
produces
1
2
3 = task1.send(None)
4
5 = task2.send(None)
6 = yield 3
7 = await f2()
8 = except task1.send(6)
9 = yield 5
10
11 = except task2.send(9)
Answered By - Zach Harris