Issue
It's my understanding that asyncio.gather
is intended to run its arguments concurrently and also that when a coroutine executes an await expression it provides an opportunity for the event loop to schedule other tasks. With that in mind, I was surprised to see that the following snippet ignores one of the inputs to asyncio.gather
.
import asyncio
async def aprint(s):
print(s)
async def forever(s):
while True:
await aprint(s)
async def main():
await asyncio.gather(forever('a'), forever('b'))
asyncio.run(main())
As I understand it, the following things happen:
- asyncio.run(main()) does any necessary global initialization of the event loop and schedules main() for execution.
- main() schedules asyncio.gather(...) for execution and waits for its result
- asyncio.gather schedules the executions of forever('a') and forever('b')
- whichever of the those executes first, they immediately await aprint() and give the scheduler the opportunity to run another coroutine if desired (e.g. if we start with 'a' then we have a chance to start trying to evaluate 'b', which should already be scheduled for execution).
- In the output we'll see a stream of lines each containing 'a' or 'b', and the scheduler ought to be fair enough that we see at least one of each over a long enough period of time.
In practice this isn't what I observe. Instead, the entire program is equivalent to while True: print('a')
. What I found extremely interesting is that even minor changes to the code seem to reintroduce fairness. E.g., if we instead have the following code then we get a roughly equal mix of 'a' and 'b' in the output.
async def forever(s):
while True:
await aprint(s)
await asyncio.sleep(1.)
Verifying that it doesn't seem to have anything to do with how long we spend in vs out of the infinite loop I found that the following change also provides fairness.
async def forever(s):
while True:
await aprint(s)
await asyncio.sleep(0.)
Does anyone know why this unfairness might happen and how to avoid it? I suppose when in doubt I could proactively add an empty sleep statement everywhere and hope that suffices, but it's incredibly non-obvious to me why the original code doesn't behave as expected.
In case it matters since asyncio seems to have gone through quite a few API changes, I'm using a vanilla installation of Python 3.8.4 on an Ubuntu box.
Solution
- whichever of the those executes first, they immediately
await aprint()
and give the scheduler the opportunity to run another coroutine if desired
This part is a common misconception. Python's await
doesn't mean "yield control to the event loop", it means "start executing the awaitable, allowing it to suspend us along with it". So yes, if the awaited object chooses to suspend, the current coroutine will suspend as well, and so will the coroutine that awaits it and so on, all the way to the event loop. But if the awaited object doesn't choose to suspend, as is the case with aprint
, neither will the coroutine that awaits it. This is occasionally a source of bugs, as seen here or here.
Does anyone know why this unfairness might happen and how to avoid it?
Fortunately this effect is most pronounced in toy examples that don't really communicate with the outside world. And although you can fix them by adding await asyncio.sleep(0)
to strategic places (which is even documented to force a context switch), you probably shouldn't do that in production code.
A real program will depend on input from the outside world, be it data coming from the network, from a local database, or from a work queue populated by another thread or process. Actual data will rarely arrive so fast to starve the rest of the program, and if it does, the starvation will likely be temporary because the program will eventually suspend due to backpressure from its output side. In the rare possibility that the program receives data from one source faster than it can process it, but still needs to observe data coming from another source, you could have a starvation issue, but that can be fixed with forced context switches if it is ever shown to occur. (I haven't heard of anyone encountering it in production.)
Aside from bugs mentioned above, what happens much more often is that a coroutine invokes CPU-heavy or legacy blocking code, and that ends up hogging the event loop. Such situations should be handled by passing the CPU/blocking part to run_in_executor
.
Answered By - user4815162342
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.