Async For Dummies

An Introduction to Asynchronous Programming using Python’s asyncio

Asynchronous programming solves a VERY specific problem: if you have a program that sits there waiting for some other routine to complete, but (crucially) could be doing other things while it waits, then you might want to use asynchronous programming.

(This is a Jupyter Notebook!  Follow along with the most recent version on github)

What might a routine look like that waits around? Perhaps an i/o routine!? Yes! That’s where the python package name comes from: asyncio provides a way to call i/o routines and then do something else while you wait for the i/o operation to finish. Another big use of asynchronous programming is user interfaces, where you are waiting for input most of the time but then need to quickly process that input and go back to waiting for more.

[Figure: a diagram of the difference between synchronous and asynchronous programming]
Figure 1: Proof of why it’s a good thing I’m not a front-end developer, and a diagram of why asynchronous programming can be faster.

You might rightfully be asking how/why this is different from using threads. Well, it’s similar but different. You could do everything here by spawning multiple threads, but threads aren’t free as far as resources go: you may be in a situation where you need to use only one thread. More importantly, a thread that has to wait for something must actively block until that something completes, which looks like the synchronous example in Figure 1 above, unless you’re using asynchronous programming.

Enough talking, let’s get started with some code! The syntax for asyncio got much cleaner and easier to follow/read as of version 3.6, so I need to make sure we’re using version 3.6 or later:

In [1]:
import sys
assert sys.version_info >= (3, 6)  # tuple comparison handles future major versions correctly

import asyncio
import time

The key object here which allows asynchronous programming to work is called the event loop:

In [2]:
loop = asyncio.get_event_loop()
# loop.set_debug(True)  # this provides MUCH more helpful error messages; always enable it when debugging/developing
# (not set here, to keep the notebook output cleaner)

You place objects called coroutines into the loop. A coroutine is an object which knows how to cede control back to the event loop and then let the event loop know when it is ready to resume control. The simplest coroutine is one that just sleeps by ceding control to the event loop and then prints something:

In [3]:
async def simple_sleeping_coroutine(label):
    waiting_time_start = time.time()
    await asyncio.sleep(1.0)
    print(f"routine {label} waited {time.time() - waiting_time_start} s")

There are a couple of new things here for a lot of programmers. First off, the keyword async tells the interpreter that we are writing a coroutine and thus that it is an object which can be put into the event loop.

Then the keyword await (expression) is where the magic happens! It tells the interpreter: “OK! I’m done until the expression is done computing, then I need to get control back eventually”. Crucially, that expression must itself be awaitable (for example, another coroutine) and thus can talk to the event loop. True to the spirit of the python syntax which came before it, await reads like plain language: await this result to be finished. We now have a coroutine; let’s put this guy into the event loop.

In [4]:
loop.run_until_complete(simple_sleeping_coroutine("a"))
routine a waited 1.0013272762298584 s

This is a rather simple example: we put only one coroutine on the event loop and we ran it until it completed. The real power comes from placing multiple coroutines on the event loop which we can do using asyncio.gather.

In [5]:
loop.run_until_complete(asyncio.gather(simple_sleeping_coroutine("a"), simple_sleeping_coroutine("b")))
routine b waited 1.0013608932495117 s
routine a waited 1.0015013217926025 s
[None, None]

You will note that the above call output [None, None]; you can indeed use return statements inside of coroutines, like so:

In [6]:
async def less_simple_sleeping_coroutine(label):
    waiting_time_start = time.time()
    await asyncio.sleep(1.0)
    sleeping_time_s = time.time() - waiting_time_start
    print(f"routine {label} waited {sleeping_time_s} s")
    return sleeping_time_s
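To see where that returned value actually ends up, here is a self-contained sketch (the coroutine name is mine, and the sleep is shortened to 0.1 s so it runs quickly): run_until_complete hands the coroutine’s return value straight back to the caller.

```python
import asyncio
import time

async def timed_sleeper(label):
    # same shape as less_simple_sleeping_coroutine above, just with a shorter sleep
    waiting_time_start = time.time()
    await asyncio.sleep(0.1)
    return time.time() - waiting_time_start

loop = asyncio.new_event_loop()
elapsed = loop.run_until_complete(timed_sleeper("c"))  # the coroutine's return value comes back here
loop.close()
print(f"slept for about {elapsed:.1f} s")
```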

You can also use a generator expression to make multiple coroutines: a coroutine is just an object and doesn’t run until the event loop runs it, so there’s no reason you can’t.

In [7]:
loop.run_until_complete(asyncio.gather(*[less_simple_sleeping_coroutine(i) for i in range(5)]))
routine 4 waited 1.0014550685882568 s
routine 2 waited 1.0015904903411865 s
routine 0 waited 1.002209186553955 s
routine 1 waited 1.0022671222686768 s
routine 3 waited 1.0026037693023682 s

It’s also worth noting that because only one coroutine is ever running at a time, you can use a simple collection like a list instead of the thread-safe containers you would need with a threading or multiprocessing solution. Let me reiterate, since this is really cool and does a lot to make asynchronous programming attractive as a paradigm: you can use ordinary data structures when programming asynchronously. Here’s an example:

In [8]:
collection_list = []
async def collection_example_sleeping_coroutine(label):
    waiting_time_start = time.time()
    await asyncio.sleep(1.0)
    sleeping_time_s = time.time() - waiting_time_start
    collection_list.append((label, sleeping_time_s))  # a plain list is safe: no other coroutine runs concurrently
    print(f"routine {label} waited {sleeping_time_s} s")
loop.run_until_complete(asyncio.gather(*[collection_example_sleeping_coroutine(i) for i in range(5)]))
routine 0 waited 1.0014450550079346 s
routine 1 waited 1.0015833377838135 s
routine 2 waited 1.002211570739746 s
routine 3 waited 1.0024409294128418 s
routine 4 waited 1.0026600360870361 s

You might be wondering why the routines produced output in the order they did. The short answer is that the event loop schedules as it sees fit: it tries its best to schedule efficiently, but unless you program it to do so, you have no guarantee that one routine will run before another.
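If you do need one coroutine to run before another, the simplest guarantee is to await them sequentially inside a wrapper coroutine. A minimal sketch (the names and intervals are mine): even though "first" sleeps longer, the sequential awaits force the ordering.

```python
import asyncio

order = []

async def step(label, wait_s):
    await asyncio.sleep(wait_s)
    order.append(label)

async def in_sequence():
    # awaiting one coroutine after another guarantees the order,
    # regardless of how long each one sleeps
    await step("first", 0.02)
    await step("second", 0.01)

loop = asyncio.new_event_loop()
loop.run_until_complete(in_sequence())
loop.close()
print(order)  # ['first', 'second']
```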

The End of Contrived Examples: a webcrawler

OK, well, that’s all been well and cool, but these have still been completely contrived examples, since all the coroutines did was sleep; no i/o was done… until now! Let’s examine something this would be very useful for: web crawling. Say you have a list of websites you want to grab data from:

In [9]:
url_test_list = [
    # ... the real URLs have been elided ...
    "http://fakey.fakefake",  # an unreachable URL, to exercise the error handling
]
These websites will all take a different amount of time to load based on a bunch of factors: how much data they serve, how the code that delivers that data is written, and many other things. That alone isn’t necessarily enough to justify asynchronous programming, but say we want to do something with the data once we’ve fetched it: perhaps count all instances of < and > in the returned html. That is indeed something we can do while waiting for another website to load, so this is a PERFECT instance of how asyncio should be used!

So with that in mind, let’s write our fetcher coroutine. We’re going to need a connection library whose calls are themselves coroutines, however. Asyncio has network connection primitives, but in the interest of conciseness I’m going to use aiohttp, which provides everything we need for this example in a simple package. (You can also easily write servers with that package, but we’re just focusing on a client for now.) OK!

In [10]:
import aiohttp
import async_timeout

# this is basically just taken from the example on the aiohttp website
async def async_fetch_text(url):
    try:
        # these are asynchronous context managers: pretty cool!
        async with aiohttp.ClientSession() as session:  # cede control to the event loop while the session is set up
            async with async_timeout.timeout(10):
                async with session.get(url) as response:
                    return await response.text()
    except aiohttp.ClientConnectorError:
        print(f"cannot connect to {url}")
        return ""

# this doesn't really need to be async because it needs the entire thread's attention to calculate
def count_angle_brakets(text_input):
    left_count = 0
    right_count = 0
    for char in text_input:
        if char == '<':
            left_count += 1
        if char == '>':
            right_count += 1
    return left_count, right_count

async def async_worker(url):
    start_time = time.time()
    url_text = await async_fetch_text(url)
    print(f"fetching {url} took {time.time() - start_time} s")
    count =  count_angle_brakets(url_text)
    return url, count

%time loop.run_until_complete(asyncio.gather(*[async_worker(url) for url in url_test_list]))
cannot connect to http://fakey.fakefake
fetching http://fakey.fakefake took 0.01189732551574707 s
fetching took 0.044196367263793945 s
fetching took 0.1682143211364746 s
fetching took 0.16768479347229004 s
fetching took 0.1816091537475586 s
fetching took 0.28900766372680664 s
fetching took 0.35688257217407227 s
fetching took 0.40779542922973633 s
fetching took 0.6077656745910645 s
fetching took 1.211148977279663 s
CPU times: user 240 ms, sys: 24 ms, total: 264 ms
Wall time: 1.25 s
[('', (1133, 1133)),
 ('', (161, 160)),
 ('', (4087, 4084)),
 ('', (1488, 1488)),
 ('', (454, 459)),
 ('', (15, 15)),
 ('', (1906, 1906)),
 ('', (4133, 4078)),
 ('', (909, 909)),
 ('http://fakey.fakefake', (0, 0))]
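As an aside, count_angle_brakets above could lean on str.count, which scans the string in C and is much faster than a character-by-character Python loop. A drop-in sketch (the function name here is mine):

```python
def count_angle_brackets_fast(text_input):
    # str.count walks the string in C, so this beats the explicit Python loop
    return text_input.count('<'), text_input.count('>')

print(count_angle_brackets_fast("<html><body>hi</body></html>"))  # (4, 4)
```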

Just to convince you that this is indeed faster than doing it synchronously, let’s quickly do it the old-fashioned synchronous way!

In [11]:
import requests
def sync_fetch_text(url):
    try:
        r = requests.get(url)
        return r.text
    except requests.ConnectionError:
        print(f"Cannot connect to {url}")
        return ""

def sync_worker(url):
    start_time = time.time()
    url_text = sync_fetch_text(url)
    print(f"fetching {url} took {time.time() - start_time} s")
    count =  count_angle_brakets(url_text)
    return url, count

%time [sync_worker(url) for url in url_test_list]
fetching took 0.6322219371795654 s
fetching took 0.38666296005249023 s
fetching took 0.30480074882507324 s
fetching took 0.14882588386535645 s
fetching took 0.20450830459594727 s
fetching took 0.16111159324645996 s
fetching took 0.046398162841796875 s
fetching took 0.11020135879516602 s
fetching took 0.29704856872558594 s
Cannot connect to http://fakey.fakefake
fetching http://fakey.fakefake took 0.003855466842651367 s
CPU times: user 260 ms, sys: 4 ms, total: 264 ms
Wall time: 2.33 s
[('', (1118, 1118)),
 ('', (161, 160)),
 ('', (23, 23)),
 ('', (1488, 1488)),
 ('', (454, 459)),
 ('', (15, 15)),
 ('', (1906, 1906)),
 ('', (119, 118)),
 ('', (909, 909)),
 ('http://fakey.fakefake', (0, 0))]

Beautiful! The asynchronous solution is nearly twice as quick (1.25 s vs. 2.33 s of wall time)!

Various Tips

There are a few more things worth covering in a basic introduction:

Another way to start the event loop running.

We used gather and run_until_complete, but the loop also has a run_forever method. To use it, you first register tasks with the loop, like so:

In [12]:
loop.create_task(less_simple_sleeping_coroutine("a"))
loop.create_task(less_simple_sleeping_coroutine("b"))
<Task pending coro=<less_simple_sleeping_coroutine() running at <ipython-input-6-396f216a1440>:1>>

But if we run the loop forever, we can’t run anything more in the python interpreter! So I find it useful to define a coroutine which stops the loop after a while, to avoid running forever.

In [13]:
async def time_limiter(s_to_wait):
    await asyncio.sleep(s_to_wait)
    loop.stop()  # without this, the loop really would run forever
loop.create_task(time_limiter(3))
%time loop.run_forever()
routine a waited 1.0014173984527588 s
routine b waited 1.0015285015106201 s
CPU times: user 0 ns, sys: 4 ms, total: 4 ms
Wall time: 3 s

This pattern also makes it possible to use a coroutine with a non-terminating loop to keep sending requests off to somewhere, for some reason or another: very powerful!
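A minimal sketch of that pattern (the names and intervals are my own): a producer loops forever doing its work, while a limiter coroutine stops the loop after a fixed time so run_forever can return.

```python
import asyncio

ticks = []

async def periodic_producer(interval_s):
    # a deliberately non-terminating loop: do some work, sleep, repeat
    while True:
        ticks.append("tick")
        await asyncio.sleep(interval_s)

async def time_limiter(s_to_wait, loop):
    await asyncio.sleep(s_to_wait)
    loop.stop()  # run_forever returns once the loop is stopped

loop = asyncio.new_event_loop()
loop.create_task(periodic_producer(0.05))
loop.create_task(time_limiter(0.18, loop))
loop.run_forever()
loop.close()
print(f"produced {len(ticks)} ticks before the limiter fired")
```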

Firing off a New Thread

If you have a routine you want to fire off in its own thread or process to run in parallel, you can do that using loop.run_in_executor. You might do this because you simply need to run a legacy blocking function alongside your fancy modern asynchronous code, or because you have a heavy task you want to send off to its own process.

loop.run_in_executor needs an Executor object to tell it how to run, a function handle, and then the arguments to that function. It is super useful for anything that needs its own thread or process, and for wrapping legacy code that you don’t have the time to re-write; it provides a nice bridge between asynchronous programming and parallel programming. But how do you use it?

In [14]:
# Thread execution:
def fibonacci(n):
    if n <= 1:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)

async def async_fib_worker_thread(i):
    start_time = time.time()
    result = await loop.run_in_executor(None, fibonacci, i)
    print(f"Fib_{i}={result} calculated in {time.time() - start_time}")
    return result
%time fib_numbers = loop.run_until_complete(asyncio.gather(*[async_fib_worker_thread(i) for i in range(20,34)]))
Fib_20=10946 calculated in 0.5288066864013672
Fib_21=17711 calculated in 0.5657856464385986
Fib_22=28657 calculated in 0.5089752674102783
Fib_25=121393 calculated in 0.6511735916137695
Fib_23=46368 calculated in 0.5038025379180908
Fib_24=75025 calculated in 0.7797455787658691
Fib_26=196418 calculated in 1.229043960571289
Fib_27=317811 calculated in 1.7452843189239502
Fib_28=514229 calculated in 2.2597830295562744
Fib_29=832040 calculated in 3.533963918685913
Fib_30=1346269 calculated in 4.4870829582214355
Fib_31=2178309 calculated in 5.071220874786377
Fib_32=3524578 calculated in 7.060687780380249
Fib_33=5702887 calculated in 7.627435922622681
CPU times: user 7.9 s, sys: 4 ms, total: 7.9 s
Wall time: 7.71 s
[10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811, 514229, 832040, 1346269, 2178309, 3524578, 5702887]

Note, in case you weren’t certain this would happen, that the output of run_until_complete is in the order the coroutines were passed to gather, despite the fact that they were not calculated in that order. This is one ordering you can rely on: gather returns its results in the order of its input awaitables.

Now let’s spin up new processes:

In [15]:
# Process Execution
import concurrent.futures
process_executor = concurrent.futures.ProcessPoolExecutor()
async def async_fib_worker_proc(i):
    start_time = time.time()
    result = await loop.run_in_executor(process_executor, fibonacci, i)
    print(f"Fib_{i}={result} calculated in {time.time() - start_time}")
    return result
%time fib_numbers = loop.run_until_complete(asyncio.gather(*[async_fib_worker_proc(i) for i in range(20,34)]))
Fib_20=10946 calculated in 0.02011728286743164
Fib_21=17711 calculated in 0.017861366271972656
Fib_22=28657 calculated in 0.02513408660888672
Fib_23=46368 calculated in 0.03589582443237305
Fib_24=75025 calculated in 0.04911947250366211
Fib_26=196418 calculated in 0.3784801959991455
Fib_25=121393 calculated in 0.13418269157409668
Fib_27=317811 calculated in 0.1956641674041748
Fib_28=514229 calculated in 0.21662235260009766
Fib_29=832040 calculated in 0.31270551681518555
Fib_30=1346269 calculated in 0.5519745349884033
Fib_31=2178309 calculated in 0.7180044651031494
Fib_32=3524578 calculated in 1.2151618003845215
Fib_33=5702887 calculated in 1.8036613464355469
CPU times: user 76 ms, sys: 252 ms, total: 328 ms
Wall time: 2.08 s
[10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811, 514229, 832040, 1346269, 2178309, 3524578, 5702887]

The process-based version should be much quicker, at the expense of using more system resources: each coroutine got its own process and could run truly in parallel with the master event loop. The thread-based version’s workers, by contrast, all share one interpreter (and its GIL), so the CPU-bound calculations had to take turns.

A Note on the Event Loop

In order to save resources, we really ought to close the event loop when we’re done with it. Don’t worry! You can get a new one if you need it, but the old one is gone forever:

In [16]:
loop.close()
# to get it back:
loop = asyncio.new_event_loop()

Anyone can develop their own event loop, though, and some people have gone and done exactly that using the same event loop which powers node.js. It’s called uvloop, and it claims to be 2-4x faster than the core python event loop for certain workloads.

(Before you ask: the python core (cpython) developers don’t make it the default loop because it would introduce extra dependencies, and they try to keep the number of dependencies in core python as low as possible.)

Let’s see if uvloop is indeed faster for our web crawling example!

In [17]:
%time loop.run_until_complete(asyncio.gather(*[async_worker(url) for url in url_test_list]))
cannot connect to http://fakey.fakefake
fetching http://fakey.fakefake took 0.012985706329345703 s
fetching took 0.03833460807800293 s
fetching took 0.1383969783782959 s
fetching took 0.15240740776062012 s
fetching took 0.18699407577514648 s
fetching took 0.30530810356140137 s
fetching took 0.3951401710510254 s
fetching took 0.4036407470703125 s
fetching took 0.49355626106262207 s
fetching took 0.59002685546875 s
CPU times: user 200 ms, sys: 36 ms, total: 236 ms
Wall time: 644 ms
[('', (1124, 1124)),
 ('', (161, 160)),
 ('', (4091, 4088)),
 ('', (1488, 1488)),
 ('', (454, 459)),
 ('', (15, 15)),
 ('', (1906, 1906)),
 ('', (4229, 4174)),
 ('', (909, 909)),
 ('http://fakey.fakefake', (0, 0))]
In [18]:
loop.close()  # since we're done with it!

# from the uvloop docs
import uvloop
# Tell asyncio to use uvloop to create new event loops
asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)  # always register the new event loop with asyncio, or it can get confused and give you odd errors when it tries to use a loop other than the one you want
In [19]:
%time loop.run_until_complete(asyncio.gather(*[async_worker(url) for url in url_test_list]))
cannot connect to http://fakey.fakefake
fetching http://fakey.fakefake took 0.006371974945068359 s
fetching took 0.025042295455932617 s
fetching took 0.14052820205688477 s
fetching took 0.14242053031921387 s
fetching took 0.16412138938903809 s
fetching took 0.3040812015533447 s
fetching took 0.3699519634246826 s
fetching took 0.3802640438079834 s
fetching took 0.5155801773071289 s
fetching took 0.6419394016265869 s
CPU times: user 196 ms, sys: 12 ms, total: 208 ms
Wall time: 697 ms
[('', (1118, 1118)),
 ('', (161, 160)),
 ('', (4087, 4084)),
 ('', (1488, 1488)),
 ('', (454, 459)),
 ('', (15, 15)),
 ('', (1906, 1906)),
 ('', (4221, 4166)),
 ('', (909, 909)),
 ('http://fakey.fakefake', (0, 0))]

Hmm: not actually faster in this run (697 ms vs. 644 ms of wall time)! This workload is dominated by waiting on the network, so the event loop’s own overhead barely matters; uvloop’s gains show up in loop-heavy workloads. Still very cool!

Further Reading

Even the core developers of python/asyncio will admit the current documentation is pretty terrible. But if you followed everything I wrote, I hope the documentation will now make a lot more sense. What you can do with asyncio is very, very rich: this guide only scratched the surface. The video linked above is a great resource, and here are some other good resources to learn more:


I want to thank Yury Selivanov, because this document is essentially lecture notes on his videos that I wrote while trying to wrap my head around asynchronous programming. I also want to thank my employer, ZipRecruiter, for requiring that I understand asynchronous programming for a recent project and for letting me write and post this.

And thank you for reading!
