Issue
Update: asyncio simply does what it's told and you can handle these exceptions just fine - see my follow-up answer that I've marked as the solution to this question. Original question below, with slightly modified example to clarify the issue and its solution.
I've been trying to debug a library that I'm working on that relies heavily on asyncio. While working on some example code, I realised that performing a keyboard interrupt (CTRL-C) sometimes (rarely!) triggered the dreaded...
Task exception was never retrieved
I've tried hard to make sure that all tasks that I spin off handle asyncio.CancelledError
gracefully, and after having spent way too many hours debugging this I realised that I only end up with this error message if one of the asyncio tasks is stuck on a blocking operation.
Blocking? You really shouldn't perform blocking work in tasks - that's why asyncio is kind enough to warn you about this. Run the below code...
import asyncio
from time import sleep
async def possibly_dangerous_sleep(i: int, use_blocking_sleep: bool = True):
try:
print(f"Sleep #{i}: Fine to cancel me within the next 2 seconds")
await asyncio.sleep(2)
if use_blocking_sleep:
print(
f"Sleep #{i}: Not fine to cancel me within the next 10 seconds UNLESS someone is"
" awaiting me, e.g. asyncio.gather()"
)
sleep(10)
else:
print(f"Sleep #{i}: Will sleep using asyncio.sleep(), nothing to see here")
await asyncio.sleep(10)
print(f"Sleep #{i}: Fine to cancel me now")
await asyncio.sleep(2)
except asyncio.CancelledError:
print(f"Sleep #{i}: So, I got cancelled...")
raise
def done_cb(task: asyncio.Task):
name = task.get_name()
try:
task.exception()
except asyncio.CancelledError:
print(f"Done: Task {name} was cancelled")
pass
except Exception as e:
print(f"Done: Task {name} didn't handle exception { e }")
else:
print(f"Done: Task {name} is simply done")
async def start_doing_stuff(collect_exceptions_when_gathering: bool = False):
tasks = []
for i in range(1, 7):
task = asyncio.create_task(
possibly_dangerous_sleep(i, use_blocking_sleep=True), name=str(i)
)
task.add_done_callback(done_cb)
tasks.append(task)
# await asyncio.sleep(3600)
results = await asyncio.gather(*tasks, return_exceptions=collect_exceptions_when_gathering)
if __name__ == "__main__":
try:
asyncio.run(start_doing_stuff(collect_exceptions_when_gathering=False), debug=True)
except KeyboardInterrupt:
print("User aborted through keyboard")
...and the debug console will tell you something along the lines of:
Executing <Task finished name='Task-2' coro=<possibly_dangerous_sleep() done, defined at ~/src/hej.py:5> result=None created at ~/.pyenv/versions/3.10.0/lib/python3.10/asyncio/tasks.py:337> took 10.005 seconds
Rest assured that the above call to sleep(10)
isn't the culprit in the library I'm working on, but it illustrates the issue I'm running into: if I try to interrupt the above test application within the first 2 to 12 seconds of it running, the debug console will end up with a hefty source traceback:
Fine to cancel me within the next 2 seconds
Not fine to cancel me within the next 10 seconds UNLESS someone is awaiting me, e.g. asyncio.gather()
^CDone with: <Task finished name='Task-2' coro=<possibly_dangerous_sleep() done, defined at ~/src/hej.py:5> exception=KeyboardInterrupt() created at ~/.pyenv/versions/3.10.0/lib/python3.10/asyncio/tasks.py:337>
User aborted through keyboard
Task exception was never retrieved
future: <Task finished name='Task-2' coro=<dangerous_sleep() done, defined at ~/src/hej.py:5> exception=KeyboardInterrupt() created at ~/.pyenv/versions/3.10.0/lib/python3.10/asyncio/tasks.py:337>
source_traceback: Object created at (most recent call last):
File "~/.pyenv/versions/3.10.0/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "~/.pyenv/versions/3.10.0/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "~/.vscode/extensions/ms-python.python-2021.12.1559732655/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
cli.main()
File "~/.vscode/extensions/ms-python.python-2021.12.1559732655/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
run()
File "~/.vscode/extensions/ms-python.python-2021.12.1559732655/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
File "~/.pyenv/versions/3.10.0/lib/python3.10/runpy.py", line 269, in run_path
return _run_module_code(code, init_globals, run_name,
File "~/.pyenv/versions/3.10.0/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "~/.pyenv/versions/3.10.0/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "~/src/hej.py", line 37, in <module>
asyncio.run(start_doing_stuff(), debug=True)
File "~/.pyenv/versions/3.10.0/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "~/.pyenv/versions/3.10.0/lib/python3.10/asyncio/base_events.py", line 628, in run_until_complete
self.run_forever()
File "~/.pyenv/versions/3.10.0/lib/python3.10/asyncio/base_events.py", line 595, in run_forever
self._run_once()
File "~/.pyenv/versions/3.10.0/lib/python3.10/asyncio/base_events.py", line 1873, in _run_once
handle._run()
File "~/.pyenv/versions/3.10.0/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "~/src/hej.py", line 28, in start_doing_stuff
task = asyncio.create_task(dangerous_sleep())
File "~/.pyenv/versions/3.10.0/lib/python3.10/asyncio/tasks.py", line 337, in create_task
task = loop.create_task(coro)
Traceback (most recent call last):
File "~/src/hej.py", line 37, in <module>
asyncio.run(start_doing_stuff(), debug=True)
File "~/.pyenv/versions/3.10.0/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "~/.pyenv/versions/3.10.0/lib/python3.10/asyncio/base_events.py", line 628, in run_until_complete
self.run_forever()
File "~/.pyenv/versions/3.10.0/lib/python3.10/asyncio/base_events.py", line 595, in run_forever
self._run_once()
File "~/.pyenv/versions/3.10.0/lib/python3.10/asyncio/base_events.py", line 1873, in _run_once
handle._run()
File "~/.pyenv/versions/3.10.0/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "~/src/hej.py", line 14, in dangerous_sleep
sleep(10)
KeyboardInterrupt
If I replace await asyncio.sleep(3600)
with await asyncio.gather(task)
(see the example code) and invoke CTRL-C, I instead get a very neat shutdown sequence in my debug console:
Fine to cancel me within the next 2 seconds
Not fine to cancel me within the next 10 seconds UNLESS someone is awaiting me, e.g. asyncio.gather()
^CDone with: <Task finished name='Task-2' coro=<possibly_dangerous_sleep() done, defined at ~/src/hej.py:5> exception=KeyboardInterrupt() created at ~/.pyenv/versions/3.10.0/lib/python3.10/asyncio/tasks.py:337>
User aborted through keyboard
Can someone explain to me if this is by design? I was expecting all asyncio tasks to be cancelled for me when asyncio.run() was interrupted (while cleaning up after itself).
Solution
Summary: You need to handle your exceptions, or asyncio will complain.
For background tasks (i.e. tasks that you don't explicitly wait for using gather()
)
You might think that trying to catch cancellation using except asyncio.CancelledError
(and re-raising it) within your task would handle all types of cancellation. That's not the case. If your task is performing blocking work while being cancelled, you won't be able to catch the exception (e.g. KeyboardInterrupt
) within the task itself. The safe bet here is to register a done callback using add_done_callback
on your asyncio.Task
. In this callback, check if there was an exception (see the updated example code in the question). If your task was stuck on blocking work while being cancelled, the done callback will tell you that the task was done (vs cancelled).
For a bunch of tasks that you await using gather()
If you use gather, you don't need to add done callbacks. Instead, ask it to return any exceptions and it will handle KeyboardInterrupt
just fine. If you don't do this, the first exception being raised within any of its awaitables is immediately propagated to the task that awaits on gather()
. In the case of a KeyboardInterrupt
inside a task that's stuck doing blocking work, KeyboardInterrupt
will be re-raised and you'll need to handle it. Alternatively, use try/except
to handle any exceptions raised. Please try this yourself by setting the collect_exceptions_when_gathering
variable in the example code.
Finally: the only thing I don't understand now is that I don't see any exception being raised if one calls gather()
with a single task, not asking it to return exceptions. Try to modify the example code to have its range be range(1,2)
and you won't get a messy stack trace on CTRL-C...?
Answered By - PythonInterrupt
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.