Issue
I'm watching a video (a) on YouTube about asyncio and, at one point, code like the following is presented for efficiently handling multiple HTTP requests:
# Need an event loop for doing this.
loop = asyncio.get_event_loop()

# Task creation section.
tasks = []
for n in range(1, 50):
    tasks.append(loop.create_task(get_html(f"https://example.com/things?id={n}")))

# Task processing section.
for task in tasks:
    html = await task
    thing = get_thing_from_html(html)
    print(f"Thing found: {thing}", flush=True)
I realise that this is efficient in the sense that everything runs concurrently but what concerns me is a case like:
- the first task taking a full minute; but
- all the others finishing in under three seconds.
Because the task processing section awaits completion of the tasks in the order in which they entered the list, it appears to me that none will be reported as complete until the first one completes.
At that point, the others that finished long ago will also be reported. Is my understanding correct?
If so, what is the normal way to handle that scenario, so that you're getting completion notification for each task the instant that task finishes?
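The scenario can be reproduced with a minimal sketch. It assumes a hypothetical `fake_get_html` that sleeps instead of making an HTTP request, and short delays (0.3 s for the first task, much less for the others) in place of the minute and three seconds described above:

```python
import asyncio
import time


async def fake_get_html(n, delay):
    # Hypothetical stand-in for get_html: sleeps instead of doing HTTP.
    await asyncio.sleep(delay)
    return f"<html>thing {n}</html>"


async def report_in_list_order(delays):
    start = time.monotonic()
    # All tasks start running concurrently as soon as they are created.
    tasks = [
        asyncio.create_task(fake_get_html(n, delay))
        for n, delay in enumerate(delays)
    ]
    reported = []
    # Awaiting in list order: the loop blocks on task 0 first.
    for n, task in enumerate(tasks):
        await task
        elapsed = time.monotonic() - start
        reported.append((n, elapsed))
        print(f"Task {n} reported after {elapsed:.2f}s", flush=True)
    return reported


# The first task is the slow one; the others finish almost immediately.
reported = asyncio.run(report_in_list_order([0.3, 0.01, 0.02]))
```

In this sketch the fast tasks are reported only after the slow first task has been awaited, even though they finished much earlier.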
(a) From Michael Kennedy of "Talk Python To Me" podcast fame. The video is Demystifying Python's Async and Await Keywords if you're interested. I have no affiliation with the site other than enjoying the podcast, so heartily recommend it.
Solution
If you just need to do something after each task finishes, you can create another async function that does it, and run those concurrently:
import asyncio

async def wrapped_get_html(url):
    html = await get_html(url)
    thing = get_thing_from_html(html)
    print(f"Thing found: {thing}")

async def main():
    # shorthand for creating tasks and awaiting them all
    await asyncio.gather(*
        [wrapped_get_html(f"https://example.com/things?id={n}")
         for n in range(50)])

asyncio.run(main())
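To see the wrapper approach report in completion order, here is a runnable sketch. The names `fake_get_html` and `fetch_and_report`, the labels, and the delays are all hypothetical stand-ins; `asyncio.sleep` simulates the network wait:

```python
import asyncio


async def fake_get_html(url, delay):
    # Hypothetical stand-in for get_html: sleeps instead of doing HTTP.
    await asyncio.sleep(delay)
    return f"<html>{url}</html>"


async def fetch_and_report(url, delay, finished):
    html = await fake_get_html(url, delay)
    # This runs as soon as this particular fetch is done,
    # regardless of what the other fetches are doing.
    finished.append(url)
    print(f"Thing found: {html}", flush=True)


async def main():
    finished = []
    delays = {"slow": 0.3, "fast": 0.01, "medium": 0.1}
    await asyncio.gather(
        *(fetch_and_report(url, delay, finished) for url, delay in delays.items())
    )
    return finished


finished = asyncio.run(main())
```

Note that `asyncio.gather` itself returns its results in argument order, not completion order; the per-task reporting happens inside the wrapper, which is why each line is printed when its own fetch finishes.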
If for some reason you need your main loop to be notified, you can do that with asyncio.as_completed:
async def main():
    for next_done in asyncio.as_completed([
            get_html(f"https://example.com/things?id={n}")
            for n in range(50)]):
        html = await next_done
        thing = get_thing_from_html(html)
        print(f"Thing found: {thing}")

asyncio.run(main())
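A self-contained version of the same idea, as a sketch only: `fake_get_html` is a hypothetical stand-in that sleeps for a given delay and returns its own id, so the main coroutine can tell which request finished:

```python
import asyncio


async def fake_get_html(n, delay):
    # Hypothetical stand-in for get_html: sleeps, then returns its own id.
    await asyncio.sleep(delay)
    return n


async def main():
    delays = [0.3, 0.01, 0.1]
    order = []
    # as_completed yields awaitables in the order the work finishes,
    # not in the order the coroutines were passed in.
    for next_done in asyncio.as_completed(
        [fake_get_html(n, delay) for n, delay in enumerate(delays)]
    ):
        n = await next_done
        order.append(n)
        print(f"Request {n} finished", flush=True)
    return order


order = asyncio.run(main())
```

With these delays, request 1 is handled first, then request 2, and the slow request 0 last, so the main coroutine is notified as each one completes.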
Answered By - user4815162342