A lot of people have trouble working with asynchronous code and understanding what it means. This article is an intro to basic asynchronous Python, where I'll try to answer the question of why and how async makes your code go fast.
Let's start with the simplest definition: async is a style of concurrent programming. Concurrency is the most generic term here; it means doing many things at once.
How Does Python Do Multiple Things At Once?
1. Multiple Processes
The most obvious way is to use multiple processes. From the terminal you can start your script two, three, four…ten times, and all the scripts are going to run independently and at the same time. The operating system underneath will take care of sharing your CPU resources among all those instances.
With CPython, that's actually the only way to run Python code on more than one CPU core at the same time.
2. Multiple Threads
The next way to run multiple things at once is to use threads.
A thread is a line of execution, pretty much like a process, but you can have multiple threads in the context of one process, and they all share access to common resources. Because of this sharing, threaded code is difficult to write. Again, the operating system does all the heavy lifting of sharing the CPU, but the global interpreter lock (GIL) allows only one thread to run Python code at a given time, even when you have multiple threads running code. So, in CPython, the GIL prevents multi-core parallelism for Python code. Basically, you're running on a single core even though you may have two, four or more.
3. Asynchronous Programming
The third way is asynchronous programming, where the OS is not participating. As far as the OS is concerned, you're going to have one process with a single thread inside it, but you'll still be able to do multiple things at once. So, what's the trick?
As an example, take a chess exhibition where one of the best chess players competes against a lot of people. Say there are 24 games against 24 opponents. If the chess master plays them synchronously, one game after another, it'll take at least 12 hours (assuming the average game takes 30 moves, the chess master needs 5 seconds to come up with a move and the opponent approximately 55 seconds, so each pair of moves takes a minute and each game half an hour). In asynchronous mode the chess master makes a move and leaves the opponent thinking while going to the next board and making a move there. This way a move on all 24 games takes 2 minutes, and all of them can be won in just one hour.
So, this is what people mean when they say async is really fast. It's this kind of fast. The chess master doesn't play chess faster; the time is just used better and isn't wasted on waiting around. This is how it works.
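If you want to check the numbers from the chess exhibition yourself, here's a small sketch of the arithmetic. The constant names are mine; the values are the ones from the example.

```python
# Numbers taken from the chess exhibition example.
GAMES = 24
MOVES_PER_GAME = 30       # moves made by each player in an average game
MASTER_SECONDS = 5        # the master's thinking time per move
OPPONENT_SECONDS = 55     # the opponent's thinking time per move

# Synchronous: finish one game completely before starting the next.
sync_seconds = GAMES * MOVES_PER_GAME * (MASTER_SECONDS + OPPONENT_SECONDS)

# Asynchronous: the master makes one move on every board in turn.
# One round takes 24 * 5 = 120 seconds, which is more than the 55 seconds
# an opponent needs, so the master never has to wait for anybody.
round_seconds = GAMES * MASTER_SECONDS
async_seconds = MOVES_PER_GAME * round_seconds

print(sync_seconds / 3600, 'hours synchronously')
print(async_seconds / 3600, 'hours asynchronously')
```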
In this analogy the chess master is our CPU, and the idea is to make sure that the CPU doesn't wait, or waits as little as possible. It's about always finding something to do.
A practical definition of Async is that it's a style of concurrent programming in which tasks release the CPU during waiting periods, so that other tasks can use it.
How is Async Implemented?
You probably want to know a little bit more. How can you do that with one process and one thread? You need two things, basically.
The first thing you need is a function that can suspend and resume. A function that enters a waiting period is suspended, and only resumed when the wait is over. There are four ways to do this in Python without OS help:
- callback functions, but they are pretty gross, so no examples will be provided,
- generator functions - a Python feature that has been around for a long time,
- async/await - keywords that you can use in Python 3.5+,
- and a third-party package called greenlet, which implements this as a C extension for Python and which you can install with pip. We aren't gonna get into it in this article.
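To make "suspend and resume" a bit more concrete, here's a minimal sketch with a plain generator function. The countdown function is a made-up example of mine; the point is only that each yield pauses the function and next() picks it up again where it left off.

```python
def countdown(n):
    """A generator: each `yield` suspends the function until it is resumed."""
    while n > 0:
        print('suspending at', n)
        yield n              # hand control (and a value) back to the caller
        print('resumed after', n)
        n -= 1


gen = countdown(2)           # nothing runs yet; this only creates the generator
first = next(gen)            # runs up to the first yield, then suspends
second = next(gen)           # resumes after that yield and runs to the next one
print(first, second)
```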
The next thing that we need is a piece of code that can decide how the CPU is shared, which function gets the CPU next. So, we need a scheduler of sorts. And in asynchronous programming this is called an event loop.
Scheduling Asynchronous Tasks
An event loop knows all the tasks that are running or want to run. It selects one and gives control to it. That task suspends when it needs to wait for something, control goes back to the loop, the loop finds another task, and it keeps going that way. This is called cooperative multitasking.
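Here's a toy version of that idea, just to show the shape of it. This is a sketch under big assumptions: task and run_all are hypothetical names, the tasks are plain generators, and the loop only does round-robin scheduling with no real waiting or I/O. It is not how asyncio is actually built.

```python
from collections import deque


def task(name, steps, log):
    """A toy task: does `steps` pieces of work, yielding after each one."""
    for i in range(steps):
        log.append(f'{name}:{i}')
        yield                        # suspend and give control back to the loop


def run_all(tasks):
    """A toy round-robin event loop over generator-based tasks."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()    # select the next task
        try:
            next(current)            # run it until it suspends
        except StopIteration:
            continue                 # the task finished, so drop it
        ready.append(current)        # not finished: back of the queue


log = []
run_all([task('A', 2, log), task('B', 3, log)])
print(log)
```

The two tasks take turns: each one runs until its next yield, and nobody gets interrupted in the middle of a step.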
Let's try a super simple test. Say we wanna write a little script that prints 'Hello', waits 3 seconds and then prints 'World!'.
Example: Standard (synchronous) Python
    from time import sleep


    def hello():
        print('Hello')
        sleep(3)  # blocks the whole program for 3 seconds
        print('World!')


    if __name__ == '__main__':
        hello()
If we put a loop at the bottom to run hello() 10 times, for example, this is gonna run not for 3 seconds but for 30 seconds.
Examples: Asyncio
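    import asyncio

    loop = asyncio.get_event_loop()


    # Generator-based coroutine, the style used before async/await existed.
    # @asyncio.coroutine was deprecated in Python 3.8 and removed in 3.11,
    # so this version only runs on older Pythons.
    @asyncio.coroutine
    def hello():
        print('Hello')
        yield from asyncio.sleep(3)  # suspend; the loop resumes us after 3 seconds
        print('World!')


    if __name__ == '__main__':
        loop.run_until_complete(hello())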
In the first example we're using a generator function. Generators are special functions that you typically use in Python to generate sequences of items. The nice thing about them is that you don't have to pre-generate the entire sequence; you can generate its elements as the code calling the generator asks for them. You can repurpose that mechanism, using the yield or yield from keywords, to write an asynchronous function. Basically, when we reach the yield from in the example above, we are saying: "OK, loop, I'm done for now, so I give you back control. Please run this function for me [the one that follows the yield from], so asyncio sleep for 3 seconds. And when that's done I'm ready to continue". The loop takes note of that and manages everything, because it's a scheduler and that's what it does.
So, if you were to run this hello function 10 times on the loop at once, instead of running for 30 seconds you're gonna see 10 'Hello's, then a pause for 3 seconds, and then 10 'World!'s.
    import asyncio


    async def hello():
        print('Hello')
        await asyncio.sleep(3)  # suspend; the loop resumes us after 3 seconds
        print('World!')


    if __name__ == '__main__':
        # asyncio.run() (Python 3.7+) creates the event loop, runs hello() and closes the loop.
        asyncio.run(hello())
In the second example you can see an improvement that you get in recent Pythons (3.5 and later) - a much nicer syntax. Functionally these two are equivalent. The async def declaration is what you use to define an asynchronous function, and if you declare the function that way you get to use await for suspending and resuming.
One of the things that asyncio is great for is that it makes the points where the code suspends and resumes very explicit.
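To see the "10 hellos, a pause, 10 worlds" behaviour for yourself, here's a minimal sketch. It assumes Python 3.7+ for asyncio.run; the main() helper and the count and delay parameters are my additions for illustration.

```python
import asyncio
import time


async def hello(delay):
    print('Hello')
    await asyncio.sleep(delay)   # suspend here; the loop runs the other tasks
    print('World!')


async def main(count=10, delay=3):
    # gather() schedules all the coroutines on the loop and waits for all of them.
    await asyncio.gather(*(hello(delay) for _ in range(count)))


start = time.perf_counter()
asyncio.run(main())
print(f'took {time.perf_counter() - start:.1f} seconds')
```

All ten tasks wait at the same time, so the whole run should take roughly one delay, not ten of them.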
Async Pitfalls
1. What happens if you have an asynchronous program with one or more tasks that need to do some heavy CPU calculation?
So, the problem is that if your function uses the CPU for, say, one minute, then during that minute nothing else will happen, because this is a single thread. All tasks need to be nice to the remaining tasks and release the CPU often, so you have to call sleep every once in a while in your function, as often as you can. If you're really greedy and don't want to give up the time you've got, the best you can do is sleep for zero seconds. This basically tells the loop that you want control back as soon as possible. If your calculation has a loop, which is pretty common, then you stick a sleep(0) inside that loop. This way, once per iteration, you allow other tasks to continue running.
Example: await asyncio.sleep(0)
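Here's what that could look like in a slightly fuller sketch. The crunch and ticker functions are made-up stand-ins: one does a (tiny) calculation in a loop, the other just records ticks, and the second one only gets to run because the first one yields on every iteration.

```python
import asyncio


async def crunch(n, log):
    """CPU-style loop that releases the CPU once per iteration."""
    total = 0
    for i in range(n):
        total += i * i
        await asyncio.sleep(0)   # give other tasks a chance to run
    log.append('crunch done')
    return total


async def ticker(n, log):
    """A second task that only makes progress when crunch() yields."""
    for i in range(n):
        log.append(f'tick {i}')
        await asyncio.sleep(0)


async def main():
    log = []
    total, _ = await asyncio.gather(crunch(5, log), ticker(3, log))
    return total, log


total, log = asyncio.run(main())
print(total, log)
```

If you take the sleep(0) out of crunch(), the ticks only show up after the calculation has finished.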
2. The second problem is that the blocking library functions are incompatible with async frameworks.
So, there's a bunch of things in the Python standard library that are blocking functions: socket.*, select.*, subprocess.*, os.waitpid, threading.*, multiprocessing.*, time.sleep. Everything that has to do with networking, processes and threads - you cannot use it. This is true for every async framework. If you use these functions, the whole thing is gonna hang until they return. So, don't use them. It's very unfortunate.
All async frameworks provide replacements for these functions. Sometimes that kind of sucks, because you have to learn a different way to do things that you already know how to do - all these common things that you do with processes, threads and networking. Unfortunate, but true. It's true for asyncio, Eventlet, Gevent, Twisted, Curio, all of them. They all provide alternative ways to do these blocking things.
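A quick way to see the difference is to time the blocking time.sleep against its asyncio replacement, asyncio.sleep. This is only a sketch: blocking_task, friendly_task and timed are hypothetical helper names, and the 0.2 second delay is arbitrary.

```python
import asyncio
import time


async def blocking_task(delay):
    time.sleep(delay)            # blocks the whole thread, event loop included


async def friendly_task(delay):
    await asyncio.sleep(delay)   # suspends only this task


async def timed(task_fn, count, delay):
    """Run `count` copies of task_fn on the loop and return the elapsed time."""
    start = time.perf_counter()
    await asyncio.gather(*(task_fn(delay) for _ in range(count)))
    return time.perf_counter() - start


blocking_time = asyncio.run(timed(blocking_task, 3, 0.2))
friendly_time = asyncio.run(timed(friendly_task, 3, 0.2))
print(f'time.sleep:    {blocking_time:.2f}s')
print(f'asyncio.sleep: {friendly_time:.2f}s')
```

With time.sleep the three tasks end up running one after another, because nothing else can happen while the loop's only thread is blocked. With asyncio.sleep they all wait at the same time.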
Conclusion
Let's summarise this with a little table that compares processes, threads and async on a number of categories.
Processes vs. Threads vs. Async
| | Processes | Threads | Async |
|---|---|---|---|
| Optimize waiting periods | Yes (preemptive) | Yes (preemptive) | Yes (cooperative) |
| Use all CPU cores | Yes | No | No |
| Scalability | Low (ones/tens) | Medium (hundreds) | High (thousands+) |
| Use blocking std library functions | Yes | Yes | No |
| GIL interference | No | Some | No |
You might think that the super cool, non-blocking "doing something else while a task waits" is exclusive to async, and that is not true. Processes and threads can do that pretty well too. So, there's no winner here.
There is a slight difference, though: in the processes and threads case it's the OS doing it, not Python, and in the async case it's your async framework (asyncio, gevent and so on) doing it. That's cooperative for async and preemptive (meaning the OS yanks the CPU away without you knowing it) for processes and threads.
If you want to make the most of the multiple cores in your computer, then the only way is processes. Many times people combine processes with one of the other two: they write a multithreaded or an async program and then run it as many times as they have cores.
Scalability is an interesting one. If you're running multiple processes, each process has its own copy of the Python interpreter and all the resources it uses, plus a copy of your code, your application and all the resources that you use, so all of that is duplicated. If you go crazy starting new instances, pretty soon you're probably gonna be out of memory. You cannot run a lot of Python processes on a normal computer. So, the scalability is pretty low with processes: in the ones or the tens, but no more than that. Threads are a little more lightweight than processes, so you can start many more threads than processes. This way you can scale a little better, say, into the hundreds. With async, everything is done in Python space and no extra OS-level resources are used per task, so tasks are extremely lightweight. This is a clear winner. Async can go into thousands or even tens of thousands. So, this can be a good reason to go async.
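As a rough illustration of how lightweight async tasks are, here's a sketch that starts ten thousand of them in one process and one thread. The worker function and the 0.1 second sleep are stand-ins of mine for "waiting on a connection"; a real server would obviously do more per task.

```python
import asyncio


async def worker(i):
    await asyncio.sleep(0.1)     # stand-in for waiting on a network connection
    return i


async def main(count=10_000):
    # All the workers wait concurrently inside a single thread.
    results = await asyncio.gather(*(worker(i) for i in range(count)))
    return len(results)


finished = asyncio.run(main())
print(finished, 'tasks finished')
```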
Now the bad news: the blocking functions in the Python standard library. Processes and threads can use them with no problem, because the OS knows how to deal with those. But when we lose the support of the OS in async, we cannot use those functions and we need replacements. And the last one is the global interpreter lock. It causes some trouble with threads, but it's not that bad for the types of applications that are also good candidates for async, the ones that are heavy on I/O. If a thread is blocked on I/O, it doesn't hold the GIL. So, when a thread goes to wait, the OS can give the CPU to another thread without any problems.
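You can check the point about waiting threads with a small sketch. Here time.sleep stands in for blocking I/O (it releases the GIL while it waits), and wait_for_io and run_threads are hypothetical helper names.

```python
import threading
import time


def wait_for_io(delay):
    time.sleep(delay)            # stand-in for blocking I/O; the GIL is released while waiting


def run_threads(count=10, delay=0.2):
    """Start `count` waiting threads and return how long they took in total."""
    threads = [threading.Thread(target=wait_for_io, args=(delay,))
               for _ in range(count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = run_threads()
print(f'10 threads waited for {elapsed:.2f}s in total')
```

Ten threads that each wait 0.2 seconds finish in far less than the 2 seconds it would take to do the waits one after another, GIL or no GIL.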
So, really, it's not that great. There aren't that many things that are better with async. Basically, the best argument to go async is when you really need massive scaling. These would be servers that are going to be very busy. Async can go into thousands or tens of thousands of connections and it's like nothing, not a problem. In any other category it's not really clear that you should go async, unless you like it.
Appendix
This article was inspired by Miguel Grinberg's talk at PyCon 2017.