Issue
I cannot get my consumer and my producer to run at the same time. It seems that either worker() or the aiohttp server is blocking, even when both are executed together with asyncio.gather().
If I use loop.create_task(worker()) instead, the worker blocks and the server is never started.
I've tried every variation I can imagine, including the nest_asyncio module, and I can only ever get one of the two components running.
What am I doing wrong?
async def worker():
    batch_size = 30
    print("running worker")
    while True:
        if queue.qsize() > 0:
            future_map = {}
            size = min(queue.qsize(), batch_size)
            batch = []
            for _ in range(size):
                item = await queue.get()
                print("Item: "+str(item))
                future_map[item["fname"]] = item["future"]
                batch.append(item)
            print("processing", batch)
            results = await process_files(batch)
            for dic in results:
                for key, value in dic.items():
                    print(str(key)+":"+str(value))
                    future_map[key].set_result(value)
            # mark the tasks done
            for _ in batch:
                queue.task_done()

def start_worker():
    loop.create_task(worker())

def create_app():
    app = web.Application()
    routes = web.RouteTableDef()

    @routes.post("/decode")
    async def handle_post(request):
        return await decode(request)

    app.add_routes(routes)
    app.on_startup.append(start_worker())
    return app

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    queue = asyncio.Queue()
    app = create_app()
    web.run_app(app)
The above prints "running worker" and does not start the aiohttp server.
def run(loop, app, port=8001):
    handler = app.make_handler()
    f = loop.create_server(handler, '0.0.0.0', port)
    srv = loop.run_until_complete(f)
    print('serving on', srv.sockets[0].getsockname())
    try:
        loop.run_forever()
    except KeyboardInterrupt:
        pass
    finally:
        loop.run_until_complete(handler.finish_connections(1.0))
        srv.close()
        loop.run_until_complete(srv.wait_closed())
        loop.run_until_complete(app.finish())
    loop.close()

def main(app):
    asyncio.gather(run(loop, app), worker())

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    queue = asyncio.Queue()
    app = create_app()
    main(app)
The above starts the server, but not the worker.
Solution
Although await asyncio.sleep(0) fixes the immediate issue, it's not an ideal fix; in fact, it's somewhat of an anti-pattern. To understand why, let's examine why the problem occurs in more detail. The heart of the matter is the worker's while loop - as soon as the queue becomes empty, it effectively boils down to:
while True:
    pass
Sure, the part marked as pass contains a check for qsize() leading to execution of additional code if the queue is non-empty, but once qsize() first reaches 0, that check will always evaluate to false. This is because asyncio is single-threaded, and when qsize() == 0 the while loop no longer encounters a single await. Without await, it's impossible to relinquish control to a coroutine or callback that might populate the queue, and the while loop becomes infinite.
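To see this starvation in isolation, here is a minimal sketch that uses a bounded loop in place of while True so that it terminates. The names spin and producer are made up for illustration and are not part of the question's code.

```python
import asyncio

async def spin(log, iterations=100_000):
    # Bounded stand-in for the worker's loop: there is no await inside,
    # so control never returns to the event loop while it runs.
    for _ in range(iterations):
        pass
    log.append("spin finished")

async def producer(log):
    # Stands in for any coroutine that would put items on the queue.
    log.append("producer ran")

async def main():
    log = []
    task = asyncio.create_task(producer(log))  # scheduled, but not yet run
    await spin(log)  # runs to completion without ever yielding
    await task       # only now does the producer get a turn
    return log

log = asyncio.run(main())
print(log)
```

The producer is scheduled before the loop starts, yet it only runs after the loop has finished. With while True in place of the bounded loop, it would never run at all.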
This is why await asyncio.sleep(0) inside the loop helps: it forces a context switch, guaranteeing that other coroutines will get a chance to run and eventually re-populate the queue. However, it also keeps the while loop constantly running, which means that the event loop will never go to sleep, even if the queue remains empty for hours on end. The event loop will remain in a busy-waiting state for as long as the worker is active. You could alleviate the busy-wait by adjusting the sleep interval to a non-zero value, as suggested by dirn, but that will introduce latency, and will still not allow the event loop to go to sleep when there's no activity.
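The busy-wait can be made visible by counting how often a polling loop spins while the queue is idle. The following is only a rough sketch: polling_worker and the stats dictionary are hypothetical names, and the 50 ms idle period is an arbitrary choice.

```python
import asyncio

async def polling_worker(queue, stats):
    # The sleep(0) workaround: yield to the event loop, then re-check at once.
    while True:
        if queue.qsize() > 0:
            stats["items"].append(queue.get_nowait())
            queue.task_done()
            return
        stats["polls"] += 1
        await asyncio.sleep(0)

async def main():
    queue = asyncio.Queue()
    stats = {"polls": 0, "items": []}
    task = asyncio.create_task(polling_worker(queue, stats))
    await asyncio.sleep(0.05)       # the queue stays empty for 50 ms
    idle_polls = stats["polls"]     # how often the worker woke up for nothing
    await queue.put("job")
    await task
    return idle_polls, stats["items"]

idle_polls, items = asyncio.run(main())
print(f"polled {idle_polls} times while the queue was empty")
```

The exact count depends on the machine, but it is typically in the thousands for a 50 ms idle period, all of it CPU time spent checking an empty queue.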
The proper fix is to not check for qsize(), but to use queue.get() to get the next item. This will sleep as long as needed until the item appears, and immediately wake up the coroutine once it does. Don't worry that this will "block" the worker - it's precisely the point of asyncio that you can have multiple coroutines and that one being "blocked" on an await simply allows others to proceed. For example:
async def worker():
    batch_size = 30
    while True:
        # wait for an item and add it to the batch
        batch = [await queue.get()]
        # batch up more items if available
        while not queue.empty() and len(batch) < batch_size:
            batch.append(await queue.get())
        # process the batch
        future_map = {item["fname"]: item["future"] for item in batch}
        results = await process_files(batch)
        for dic in results:
            for key, value in dic.items():
                print(str(key)+":"+str(value))
                future_map[key].set_result(value)
        for _ in batch:
            queue.task_done()
In this variant we await something in every iteration of the loop, so no sleeping is needed.
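To try the worker without aiohttp, the sketch below wires it to a stubbed process_files and a small submit helper. Both are assumptions made for illustration: the real process_files is not shown in the question, and submit only approximates what a request handler like decode might do. The queue is passed as an argument rather than read from a global.

```python
import asyncio

BATCH_SIZE = 30

async def process_files(batch):
    # Hypothetical stand-in for the real process_files, which is not shown.
    await asyncio.sleep(0.01)
    return [{item["fname"]: item["fname"].upper()} for item in batch]

async def worker(queue, batch_sizes):
    while True:
        # Sleeps here until an item arrives; no polling involved.
        batch = [await queue.get()]
        while not queue.empty() and len(batch) < BATCH_SIZE:
            batch.append(await queue.get())
        batch_sizes.append(len(batch))
        future_map = {item["fname"]: item["future"] for item in batch}
        for dic in await process_files(batch):
            for key, value in dic.items():
                future_map[key].set_result(value)
        for _ in batch:
            queue.task_done()

async def submit(queue, fname):
    # Enqueue a job together with a future, then wait for the worker's result.
    future = asyncio.get_running_loop().create_future()
    await queue.put({"fname": fname, "future": future})
    return await future

async def main():
    queue = asyncio.Queue()
    batch_sizes = []
    task = asyncio.create_task(worker(queue, batch_sizes))
    results = await asyncio.gather(*(submit(queue, f"f{i}") for i in range(5)))
    task.cancel()  # the worker loops forever, so stop it explicitly
    await asyncio.gather(task, return_exceptions=True)
    return results, batch_sizes

results, batch_sizes = asyncio.run(main())
print(results, batch_sizes)
```

Each submit call gets its own result back through its future, and jobs that are already queued when the worker wakes up are grouped into a single batch.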
Answered By - user4815162342