Issue
Today, I found a library named trio which says itself is an asynchronous API for humans. These words are a little similar with requests
'. As requests
is really a good library, I am wondering what is the advantages of trio
.
There aren't many articles about it, I just find an article discussing curio
and asyncio
. To my surprise, trio
says itself is even better than curio
(next-generation curio).
After reading half of the article, I cannot find the core difference between these two asynchronous framework. It just gives some examples that curio
's implementation is more convenient than asyncio
's. But the underlying structure is almost the same.
So could someone give me a reason I have to accept that trio
or curio
is better than asyncio
? Or explain more about why I should choose trio
instead of built-in asyncio
?
Solution
Where I'm coming from: I'm the primary author of trio. I'm also one of the top contributors to curio (and wrote the article about it that you link to), and a Python core dev who's been heavily involved in discussions about how to improve asyncio.
In trio (and curio), one of the core design principles is that you never program with callbacks; it feels more like thread-based programming than callback-based programming. I guess if you open up the hood and look at how they're implemented internally, then there are places where they use callbacks, or things that are sorta equivalent to callbacks if you squint. But that's like saying that Python and C are equivalent because the Python interpreter is implemented in C. You never use callbacks.
Anyway:
Trio vs asyncio
Asyncio is more mature
The first big difference is ecosystem maturity. At the time I'm writing this in March 2018, there are many more libraries with asyncio support than trio support. For example, right now there aren't any real HTTP servers with trio support. The Framework :: AsyncIO classifier on PyPI currently has 122 libraries in it, while the Framework :: Trio classifier only has 8. I'm hoping that this part of the answer will become out of date quickly – for example, here's Kenneth Reitz experimenting with adding trio support in the next version of requests – but right now, you should expect that if you're trio for anything complicated, then you'll run into missing pieces that you need to fill in yourself instead of grabbing a library from pypi, or that you'll need to use the trio-asyncio package that lets you use asyncio libraries in trio programs. (The trio chat channel is useful for finding out about what's available, and what other people are working on.)
Trio makes your code simpler
In terms of the actual libraries, they're also very different. The main argument for trio is that it makes writing concurrent code much, much simpler than using asyncio. Of course, when was the last time you heard someone say that their library makes things harder to use... let me give a concrete example. In this talk (slides), I use the example of implementing RFC 8305 "Happy eyeballs", which is a simple concurrent algorithm used to efficiently establish a network connection. This is something that Glyph has been thinking about for years, and his latest version for Twisted is ~600 lines long. (Asyncio would be about the same; Twisted and asyncio are very similar architecturally.) In the talk, I teach you everything you need to know to implement it in <40 lines using trio (and we fix a bug in his version while we're at it). So in this example, using trio literally makes our code an order of magnitude simpler.
You might also find these comments from users interesting: 1, 2, 3
There are many many differences in detail
Why does this happen? That's a much longer answer :-). I'm gradually working on writing up the different pieces in blog posts and talks, and I'll try to remember to update this answer with links as they become available. Basically, it comes down to Trio having a small set of carefully designed primitives that have a few fundamental differences from any other library I know of (though of course build on ideas from lots of places). Here are some random notes to give you some idea:
A very, very common problem in asyncio and related libraries is that you call some_function()
, and it returns, so you think it's done – but actually it's still running in the background. This leads to all kinds of tricky bugs, because it makes it difficult to control the order in which things happen, or know when anything has actually finished, and it can directly hide problems because if a background task crashes with an unhandled exception, asyncio will generally just print something to the console and then keep going. In trio, the way we handle task spawning via "nurseries" means that none of these things happen: when a function returns then you know it's done, and Trio's currently the only concurrency library for Python where exceptions always propagate until you catch them.
Trio's way of managing timeouts and cancellations is novel, and I think better than previous state-of-the-art systems like C# and Golang. I actually did write a whole essay on this, so I won't go into all the details here. But asyncio's cancellation system – or really, systems, it has two of them with slightly different semantics – are based on an older set of ideas than even C# and Golang, and are difficult to use correctly. (For example, it's easy for code to accidentally "escape" a cancellation by spawning a background task; see previous paragraph.)
There's a ton of redundant stuff in asyncio, which can make it hard to tell which thing to use when. You have futures, tasks, and coroutines, which are all basically used for the same purpose but you need to know the differences between them. If you want to implement a network protocol, you have to pick whether to use the protocols/transports layer or the streams layer, and they both have tricky pitfalls (this is what the first part of the essay you linked is about).
Trio's currently the only concurrency library for Python where control-C just works the way you expect (i.e., it raises KeyboardInterrupt
where-ever your code is). It's a small thing, but it makes a big difference :-). For various reasons, I don't think this is fixable in asyncio.
Summing up
If you need to ship something to production next week, then you should use asyncio (or Twisted or Tornado or gevent, which are even more mature). They have large ecosystems, other people have used them in production before you, and they're not going anywhere.
If trying to use those frameworks leaves you frustrated and confused, or if want to experiment with a different way of doing things, then definitely check out trio – we're friendly :-).
If you want to ship something to production a year from now... then I'm not sure what to tell you. Python concurrency is in flux. Trio has many advantages at the design level, but is that enough to overcome asyncio's head start? Will asyncio being in the standard library be an advantage, or a disadvantage? (Notice how these days everyone uses requests
, even though the standard library has urllib.) How many of the new ideas in trio can be added to asyncio? No-one knows. I expect that there will be a lot of interesting discussions about this at PyCon this year :-).
Answered By - Nathaniel J. Smith
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.