Issue
I am using Python 3.10 and I am a bit confused about asyncio.create_task.
In the following example code, the functions are executed in coroutines whether or not I use asyncio.create_task. It seems that there is no difference.
How can I determine when to use asyncio.create_task, and what are the advantages of using it compared to not using it?
import asyncio
from asyncio import sleep


async def process(index: int):
    await sleep(1)
    print('ok:', index)


async def main1():
    tasks = []
    for item in range(10):
        tasks.append(asyncio.create_task(process(item)))
    await asyncio.gather(*tasks)


async def main2():
    tasks = []
    for item in range(10):
        tasks.append(process(item))  # Without asyncio.create_task
    await asyncio.gather(*tasks)


asyncio.run(main1())
asyncio.run(main2())
Solution
TL;DR
It makes sense to use create_task if you want to schedule the execution of a coroutine immediately but not necessarily wait for it to finish, moving on to something else first.
Explanation
As has been pointed out in the comments already, asyncio.gather itself wraps the provided awaitables in tasks, which is why it is essentially redundant to call create_task on them beforehand in your simple example.
From the gather docs:
If any awaitable [...] is a coroutine, it is automatically scheduled as a Task.
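To make the quoted sentence concrete, here is a minimal sketch that passes plain coroutine objects to gather and records which Task each one runs in. The helper names report_task and main are made up for this illustration; asyncio.current_task and asyncio.gather are the only library calls used.

```python
import asyncio


async def report_task(seen: list) -> None:
    # Record the Task this coroutine is running in.
    seen.append(asyncio.current_task())


async def main() -> tuple:
    seen = []
    main_task = asyncio.current_task()
    # Plain coroutine objects: create_task is never called here.
    await asyncio.gather(*(report_task(seen) for _ in range(3)))
    return main_task, seen


main_task, seen = asyncio.run(main())
print(all(isinstance(t, asyncio.Task) for t in seen))  # True
print(len({id(t) for t in seen}))                      # 3
print(main_task in seen)                               # False
```

Each coroutine ends up in its own Task, distinct from the one running main, even though only gather was called.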
That being said, the two examples you constructed are not equivalent!
When you call create_task, the Task is immediately scheduled for execution on the event loop. This means that if a context switch takes place after you have called create_task for all your coroutines (as in your first example), any number of them may start executing right away, without you having to await them explicitly.
From the create_task docs: (my emphasis)
Wrap the [...] coroutine into a Task and schedule its execution.
By contrast, when you simply create the coroutines (as in your second example), they will not begin execution by themselves unless you somehow schedule their execution (e.g. by simply awaiting them).
You can see this in action if you add any await (e.g. asyncio.sleep) between the creation and the gather call, along with a few helpful print statements:
from asyncio import create_task, gather, sleep, run


async def process(index: int):
    await sleep(.5)
    print('ok:', index)


async def create_tasks_then_gather():
    tasks = [create_task(process(item)) for item in range(5)]
    print("tasks scheduled")
    await sleep(2)  # <-- because of this `await` the tasks may begin to execute
    print("now gathering tasks")
    await gather(*tasks)
    print("gathered tasks")


async def create_coroutines_then_gather():
    coroutines = [process(item) for item in range(5)]
    print("coroutines created")
    await sleep(2)  # <-- despite this, the coroutines will not begin execution
    print("now gathering coroutines")
    await gather(*coroutines)
    print("gathered coroutines")


run(create_tasks_then_gather())
run(create_coroutines_then_gather())
Output:
tasks scheduled
ok: 0
ok: 1
ok: 2
ok: 3
ok: 4
now gathering tasks
gathered tasks
coroutines created
now gathering coroutines
ok: 0
ok: 1
ok: 2
ok: 3
ok: 4
gathered coroutines
As you can see, in create_tasks_then_gather the process body was executed before the gather call, whereas in create_coroutines_then_gather it was executed only after.
Therefore, whether or not create_task is useful depends on the situation. If you only care about the coroutines being executed concurrently and awaited at that particular point in your code, there is no use in calling create_task. If you want to schedule them but then move on to something else while they may or may not do their thing in the background, it makes sense to use create_task.
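As a rough illustration of the "schedule, then move on" case, the sketch below starts a background task, does other work in main, and only collects the result at the end. The names fetch, main and events, the return value 42 and the sleep durations are all invented for this example.

```python
import asyncio


async def fetch(events: list) -> int:
    events.append("background started")
    await asyncio.sleep(0.1)  # stand-in for real I/O
    events.append("background finished")
    return 42


async def main() -> tuple:
    events = []
    task = asyncio.create_task(fetch(events))  # scheduled, not yet running
    events.append("main continues")
    await asyncio.sleep(0)  # first yield to the event loop: the task starts
    events.append("main doing other work")
    await asyncio.sleep(0.2)  # the task finishes during this wait
    events.append("main done with other work")
    result = await task  # already finished, so this returns right away
    return events, result


events, result = asyncio.run(main())
print(events)
print(result)
```

The recorded order shows the background task starting at the first await in main and finishing while main is busy with its own wait. With a bare coroutine object instead of a task, nothing would run until it is awaited.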
One important thing to remember, however, is that you can only be sure that the tasks you scheduled actually run to completion if you await them at some point. This is why you should still await them eventually (via gather or an equivalent) so that you actually wait for them to finish.
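A small sketch of what can go wrong otherwise: when the coroutine passed to asyncio.run returns, tasks that are still pending get cancelled during shutdown. The names slow, forget and wait and the short sleep duration are assumptions made for this demonstration.

```python
import asyncio


async def slow(log: list) -> None:
    try:
        await asyncio.sleep(0.05)
        log.append("finished")
    except asyncio.CancelledError:
        log.append("cancelled")
        raise


async def forget(log: list) -> None:
    task = asyncio.create_task(slow(log))
    await asyncio.sleep(0)  # one yield so the task starts running
    # Returning without awaiting `task`: asyncio.run cancels it on shutdown.


async def wait(log: list) -> None:
    task = asyncio.create_task(slow(log))
    await task  # waits until the task has run to completion


forgotten_log, awaited_log = [], []
asyncio.run(forget(forgotten_log))
asyncio.run(wait(awaited_log))
print(forgotten_log)  # ['cancelled']
print(awaited_log)    # ['finished']
```

Only the awaited task gets to append "finished"; the other one is interrupted in the middle of its sleep.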
Answered By - Daniil Fajnberg