Issue
I have a module in Python 3.5+, providing a function that reads some data from a remote web API and returns it. The function relies on a wrapper function, which in turn uses the library requests
to make the HTTP call.
Here it is (omitting on purpose all data validation logic and exception handling):
# module fetcher.py
import requests
# high-level module API
def read(some_params):
resp = requests.get('http://example.com', params=some_params)
return resp.json()
# wrapper for the actual remote API call
def get_data(some_params):
return call_web_api(some_params)
The module is currently imported and used by multiple clients.
As of today, the call to get_data is inherently synchronous: this means that whoever uses the function fetcher.read()
knows that this is going to block the thread the function is executed on.
What I would love to achieve
I want to allow the fetcher.read()
to be run both in a synchronous and an asynchronous fashion (eg. via an event loop).
This is in order to keep compatibility with existing callers consuming the module and at the same time to offer the possibility
to leverage non-blocking calls to allow a better throughput for callers that do want to call the function asynchronously.
This said, my legitimate wish is to modify the original code as little as possible...
As of today, the only thing I know is that Requests does not support asynchronous operations out of the box and therefore I should switch to an asyncio-friendly HTTP client (eg. aiohttp
) in order to provide a non-blocking behaviour
How would the above code need to be modified to meet my desiderata? Which also leads me to ask: is there any best practice about enhancing sync software APIs to async contexts?
Solution
I want to allow the
fetcher.read()
to be run both in a synchronous and an asynchronous fashion (eg. via an event loop).
I don't think it is feasible for the same function to be usable via both sync and async API because the usage patterns are so different. Even if you could somehow make it work, it would be just too easy to mess things up, especially taking into account Python's dynamic-typing nature. (For example, users might accidentally forget to await
their functions in async code, and the sync code would kick in, thus blocking their event loop.)
Instead, I would recommend the actual API to be async, and to create a trivial sync wrapper that just invokes the entry points using run_until_complete
. Something along these lines:
# new module afetcher.py (or fetcher_async, or however you like it)
import aiohttp
# high-level module API
async def read(some_params):
async with aiohttp.request('GET', 'http://example.com', params=some_params) as resp:
return await resp.json()
# wrapper for the actual remote API call
async def get_data(some_params):
return call_web_api(some_params)
Yes, you switch from using requests
to aiohttp
, but the change is mechanical as the APIs are very similar in spirit.
The sync module would exist for backward compatibility and convenience, and would trivially wrap the async functionality:
# module fetcher.py
import afetcher
def read(some_params):
loop = asyncio.get_event_loop()
return loop.run_until_complete(afetcher.read(some_params))
...
This approach provides both sync and async version of the API, without code duplication because the sync version consists of trivial trampolines, whose definition can be further compressed using appropriate decorators.
The async fetcher module should have a nice short name, so that the users don't feel punished for using the async functionality. It should be easy to use, and it actually provides a lot of new features compared to the sync API, most notably low-overhead parallelization and reliable cancellation.
The route that is not recommended is using run_in_executor
or similar thread-based tool to run requests
in a thread pool under the hood. That implementation doesn't provide the actual benefits of using asyncio, but incurs all the costs. In that case it is better to continue providing the synchronous API and leave it to the users to use concurrent.futures
or similar tools for parallel execution, where they're at least aware they're using threads.
Answered By - user4815162342
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.