This article seeks to help you build a sturdy mental model of how asyncio
fundamentally works: something that will help you understand the how and why
behind the recommended patterns.
The final section, :ref:`which_concurrency_do_I_want`, zooms out a bit and
compares the common approaches to concurrency -- multiprocessing,
multithreading & asyncio -- and describes where each is most useful.

During my own asyncio learning process, a few aspects particularly drove my
curiosity (read: drove me nuts).
of this article.
- How would I go about writing my own asynchronous variant of some operation?
  Something like an async sleep, database request, etc.

The first two sections feature some examples but are generally focused on theory
and explaining concepts.
The next two sections are centered around examples, focused on further
illustrating and reinforcing ideas practically.

.. contents:: Sections
    :depth: 1
    :local:

---------------------------------------------
A conceptual overview part 1: the high-level
---------------------------------------------
For reference, you could implement it without futures, like so::

            else:
                await YieldToEventLoop()

.. _anaylzing-control-flow-example:

----------------------------------------------
Analyzing an example program's control flow
----------------------------------------------

We'll walk through, step by step, a simple asynchronous program, following along
in the key methods of Task & Future that are leveraged when asyncio is
orchestrating the show.

===============
Task.step
===============

The actual method that invokes a Task's coroutine,
``asyncio.tasks.Task.__step_run_and_handle_result``, is about 80 lines long.
For the sake of clarity, I've removed all of the edge-case error handling,
simplified some aspects and renamed it, but the core logic remains unchanged.

::

    1  class Task(Future):
    2      ...
    3      def step(self):
    4          try:
    5              awaited_task = self.coro.send(None)
    6          except StopIteration as e:
    7              super().set_result(e.value)
    8          else:
    9              awaited_task.add_done_callback(self.step)
    10         ...

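The ``coro.send(None)`` call on line 5 is what actually starts or resumes the
coroutine, and the ``StopIteration`` it eventually raises is what carries the
return value. You can verify both by hand with a standalone sketch, outside of
asyncio entirely:

```python
async def add_one(x):
    return x + 1

coro = add_one(41)
try:
    # Start the coroutine, exactly as Task.step does.
    coro.send(None)
except StopIteration as e:
    # The coroutine's return value rides on the exception.
    result = e.value

print(result)  # 42
```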

======================
Example program
======================

::

    # Filename: program.py
    1  async def triple(val: int):
    2      return val * 3
    3
    4  async def main():
    5      triple_task = asyncio.Task(coro=triple(val=5))
    6      tripled_val = await triple_task
    7      return tripled_val + 2
    8
    9  loop = asyncio.new_event_loop()
    10 main_task = asyncio.Task(main(), loop=loop)
    11 loop.run_forever()

=====================
Control flow
=====================

At a high-level, this is how control flows:

.. code-block:: none

    1  program
    2      event-loop
    3          main_task.step
    4              main()
    5                  triple_task.__await__
    6              main()
    7          main_task.step
    8      event-loop
    9          triple_task.step
    10             triple()
    11         triple_task.step
    12     event-loop
    13         main_task.step
    14             triple_task.__await__
    15                 main()
    16         main_task.step
    17     event-loop

1. Control begins in ``program.py``
        Line 9 creates an event-loop, line 10 creates ``main_task`` and adds it to
        the event-loop, line 11 indefinitely passes control to the event-loop.
2. Control is now in the event-loop
        The event-loop pops ``main_task`` off its queue, then invokes it by calling
        ``main_task.step()``.
3. Control is now in ``main_task.step``
        We enter the try-block on line 4, then begin the coroutine ``main()`` on
        line 5.
4. Control is now in the coroutine: ``main()``
        The Task ``triple_task`` is created on line 5, which adds it to the
        event-loop's queue. Line 6 ``await``\ s ``triple_task``.
        Remember, that calls ``Task.__await__`` then percolates any ``yield``\ s.
5. Control is now in ``triple_task.__await__``
        ``triple_task`` is not done, given it was just created, so we enter
        the first if-block on line 5 and ``yield`` the thing we'll be waiting
        for -- ``triple_task``.
6. Control is now in the coroutine: ``main()``
        ``await`` percolates the ``yield`` and the yielded value -- ``triple_task``.
7. Control is now in ``main_task.step``
        The variable ``awaited_task`` is ``triple_task``.
        No ``StopIteration`` was raised, so the else-clause of the try-block, on
        line 8, executes.
        A done-callback, ``main_task.step``, is added to ``triple_task``.
        The step method ends and returns to the event-loop.
8. Control is now in the event-loop
        The event-loop cycles to the next task in its queue.
        It pops ``triple_task`` from its queue and invokes it by calling
        ``triple_task.step()``.
9. Control is now in ``triple_task.step``
        We enter the try-block on line 4, then begin the coroutine ``triple()``
        via line 5.
10. Control is now in the coroutine: ``triple()``
        It computes 3 times 5, then finishes and raises a ``StopIteration``
        exception.
11. Control is now in ``triple_task.step``
        The ``StopIteration`` exception is caught, so we go to line 7.
        The return value of the coroutine ``triple()`` is embedded in the value
        attribute of that exception.
        ``Future.set_result()`` saves the result, marks the task as done and adds
        the done-callbacks of ``triple_task`` to the event-loop's queue.
        The step method ends and returns control to the event-loop.
12. Control is now in the event-loop
        The event-loop cycles to the next task in its queue.
        It pops ``main_task`` and resumes it by calling
        ``main_task.step()``.
13. Control is now in ``main_task.step``
        We enter the try-block on line 4, then resume the coroutine ``main``,
        which will pick up again from where it ``yield``-ed.
        Recall, it ``yield``-ed not in the coroutine, but in
        ``triple_task.__await__`` on line 6.
14. Control is now in ``triple_task.__await__``
        We evaluate the if-statement on line 8, which ensures that ``triple_task``
        was completed.
        Then, it returns the result of ``triple_task``, which was saved earlier.
        Finally, that result is returned to the caller
        (i.e. ``... = await triple_task``).
15. Control is now in the coroutine: ``main()``
        ``tripled_val`` is 15. The coroutine finishes and raises a
        ``StopIteration`` exception with the return value of 17 attached.
16. Control is now in ``main_task.step``
        The ``StopIteration`` exception is caught; ``main_task`` is marked
        as done and its result is saved.
        The step method ends and returns control to the event-loop.
17. Control is now in the event-loop
        There's nothing in the queue.
        The event-loop cycles aimlessly onwards.

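For comparison, here's the same program written against the public asyncio API
rather than the simplified machinery above (a sketch of mine, not from the
repository): ``asyncio.run`` stands in for the hand-built event-loop and
``asyncio.ensure_future`` for the direct ``Task`` construction.

```python
import asyncio

async def triple(val: int):
    return val * 3

async def main():
    # ensure_future wraps the coroutine in a real asyncio.Task and
    # schedules it on the running event-loop.
    triple_task = asyncio.ensure_future(triple(val=5))
    tripled_val = await triple_task
    return tripled_val + 2

result = asyncio.run(main())
print(result)  # 17
```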
----------------------------------------------
Barebones network I/O example
----------------------------------------------

Here we'll see a simple but thorough example showing how asyncio can offer an
advantage over serial programs.
The example doesn't rely on any asyncio operators (besides the event-loop).
It's all non-blocking sockets & custom awaitables that help you see what's
actually happening under the hood and how you could do something similar.

Performing a database request across a network might take half a second or so,
but that's ages in computer-time.
Your processor could have done millions or even billions of things.
The same is true for, say, requesting a website, downloading a car, loading a
file from disk into memory, etc.
The general theme is those are all input/output (I/O) actions.

Consider performing two tasks: requesting some information from a server and
doing some computation locally.
A serial approach would look like: ping the server, idle while waiting for a
response, receive the response, perform the local computation.
An asynchronous approach would look like: ping the server, do some of the
local computation while waiting for a response, check if the server is ready
yet, do a bit more of the local computation, check again, etc.
Basically, we're freeing up the CPU to do other activities instead of scratching
its belly button.

This example has a server (a separate, local process) compute the sum of many
samples from a Gaussian (i.e. normal) distribution.
And the local computation finds the sum of many samples from a uniform
distribution.
As you'll see, the asynchronous approach runs notably faster, since progress
can be made on summing the uniform samples while waiting for the server to
calculate and respond.

=====================
Serial output
=====================

.. code-block:: none

    $ python serial_approach.py
    Beginning server_request.
    ====== Done server_request. total: -2869.04. Ran for: 2.77s. ======
    Beginning uniform_sum.
    ====== Done uniform_sum. total: 60001676.02. Ran for: 4.77s. ======
    Total time elapsed: 7.54s.

=====================
Asynchronous output
=====================

.. code-block:: none

    $ python async_approach.py
    Beginning uniform_sum.
    Pausing uniform_sum at sample_num: 26,999,999. time_elapsed: 1.01s.

    Beginning server_request.
    Pausing server_request. time_elapsed: 0.00s.

    Resuming uniform_sum.
    Pausing uniform_sum at sample_num: 53,999,999. time_elapsed: 1.05s.

    Resuming server_request.
    Pausing server_request. time_elapsed: 0.00s.

    Resuming uniform_sum.
    Pausing uniform_sum at sample_num: 80,999,999. time_elapsed: 1.05s.

    Resuming server_request.
    Pausing server_request. time_elapsed: 0.00s.

    Resuming uniform_sum.
    Pausing uniform_sum at sample_num: 107,999,999. time_elapsed: 1.04s.

    Resuming server_request.
    ====== Done server_request. total: -2722.46. ======

    Resuming uniform_sum.
    ====== Done uniform_sum. total: 59999087.62 ======

    Total time elapsed: 4.60s.

======================
Code
======================

Now, we'll explore some of the most important snippets.

Below is the portion of the asynchronous approach responsible for checking if
the server's done yet and, if not, yielding control back to the event-loop
instead of idly waiting.
I'd like to draw your attention to a specific part of this snippet.
Setting a socket to non-blocking mode means the ``recv()`` call won't idle while
waiting for a response.
Instead, if there's no data to be read, it'll immediately raise a
``BlockingIOError``.
If there is data available, the ``recv()`` will proceed as normal.

.. code-block:: python

    class YieldToEventLoop:
        def __await__(self):
            yield

    ...

    async def get_server_data():
        client = socket.socket()
        client.connect(server.SERVER_ADDRESS)
        client.setblocking(False)

        while True:
            try:
                # For reference, the first argument to recv() is the maximum number
                # of bytes to attempt to read. Setting it to 4096 means we could get 2
                # bytes or 4 bytes, or even 4091 bytes, but not 4097 bytes back.
                # However, if there are no bytes available to be read, this recv()
                # will raise a BlockingIOError since the socket was set to
                # non-blocking mode.
                response = client.recv(4096)
                break
            except BlockingIOError:
                await YieldToEventLoop()
        return response
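
You can watch that non-blocking behavior in isolation with a
``socket.socketpair()`` (a standalone sketch of mine, not taken from the
example programs): ``recv()`` raises ``BlockingIOError`` when no data is
ready, then succeeds once data arrives.

```python
import select
import socket

# A connected pair of sockets lets us play both client and server locally.
reader, writer = socket.socketpair()
reader.setblocking(False)

# No data has been sent yet, so a non-blocking recv() raises immediately
# instead of idling.
got_blocking_error = False
try:
    reader.recv(4096)
except BlockingIOError:
    got_blocking_error = True

writer.sendall(b"hello")
# select() waits until the socket is actually readable -- this readiness
# check is essentially what an event-loop does on our behalf.
select.select([reader], [], [])
data = reader.recv(4096)  # now succeeds without raising
```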

Next up is the snippet responsible for computing the sum of the uniform samples.
It's designed to allow for working through the sum a portion at a time.
The ``time_allotment`` argument to the coroutine function decides how long the
sum function will iterate -- in other words, synchronously hog control -- before
ceding back to the event-loop.

.. code-block:: python

    async def uniform_sum(n_samples: int, time_allotment: float) -> float:

        start_time = time.time()

        total = 0.0
        for _ in range(n_samples):
            total += random.random()

            time_elapsed = time.time() - start_time
            if time_elapsed > time_allotment:
                await YieldToEventLoop()
                start_time = time.time()

        return total

As written, though, this snippet would actually lose to the serial version:
calling ``time.time()`` and evaluating an if-condition on every iteration for
many, many iterations (in this case roughly a hundred million) more than eats
up the runtime savings associated with the asynchronous approach.
The actual implementation involves chunking the iteration, so you only perform
the check every few million iterations.
With that change, the asynchronous approach wins in a landslide.
This is important to keep in mind.
Too much checking or constantly jumping between tasks can ultimately cause more
harm than good!
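Here's one way that chunking might look (a sketch, not the repository's exact
implementation; the chunk size is an assumption):

```python
import random
import time

class YieldToEventLoop:
    def __await__(self):
        yield  # hand control back to the event-loop

# Assumption: check the clock once per million samples, not once per sample.
CHUNK_SIZE = 1_000_000

async def uniform_sum_chunked(n_samples: int, time_allotment: float) -> float:
    start_time = time.time()
    total = 0.0
    for chunk_start in range(0, n_samples, CHUNK_SIZE):
        # Work through one chunk with no per-iteration bookkeeping.
        for _ in range(min(CHUNK_SIZE, n_samples - chunk_start)):
            total += random.random()
        # Only now consider ceding control.
        if time.time() - start_time > time_allotment:
            await YieldToEventLoop()
            start_time = time.time()
    return total
```

A bare ``yield`` of ``None`` is something asyncio's Task machinery accepts (it
simply reschedules the task), so this coroutine can also be driven by, e.g.,
``asyncio.run(uniform_sum_chunked(10_000_000, 1.0))``.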

The server, async & serial programs are available in full here:
https://github.com/anordin95/a-conceptual-overview-of-asyncio/tree/main/barebones-network-io-example.

.. _which_concurrency_do_I_want:

------------------------------
Which concurrency do I want
------------------------------

===========================
multiprocessing
===========================

For any computationally-bound work in Python, you likely want to use
multiprocessing.
Otherwise, the Global Interpreter Lock (GIL) will generally get in your way!
For those who don't know, the GIL is a lock which ensures only one Python
instruction is executed at a time.
Of course, since processes are generally entirely independent from one another,
the GIL in one process won't impede the GIL in another process.
Granted, I believe there are also ways to get around the GIL in a single process
by leveraging C extensions or via subinterpreters.

===========================
multithreading & asyncio
===========================

Multithreading and asyncio are much more similar in where they're useful with
Python: not at all for computationally-bound work, and crucially for I/O-bound
work.
For applications that need to manage absolutely tons of distinct I/O connections
or chunks-of-work, asyncio is a must.
For example, a web server handling thousands of requests "simultaneously"
(in quotes, because, as we saw, the frequent handoffs of control only create
the illusion of simultaneous execution).
Otherwise, I think the choice between the two is somewhat down to taste.

Multithreading maintains an OS-managed thread for each chunk of work, whereas
asyncio uses Tasks for each work-chunk and manages them via the event-loop's
queue.
I believe the marginal overhead of one more chunk of work is a fair bit lower
for asyncio than for threads, which matters a lot for applications that need to
manage many, many chunks of work.

There are some other benefits associated with using asyncio.
One is clearer visibility into when and where interleaving occurs: the code
between two ``await``\ s is certainly synchronous.
Another is simpler debugging, since it's easier to attach and follow a trace and
reason about code execution.
With threading, the interleaving is more of a black-box.
One benefit of multithreading is not really having to worry about greedy threads
hogging execution, something that could happen with asyncio where a greedy
coroutine never awaits and effectively stalls the event-loop.
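
To make that last caveat concrete, here's a small sketch of mine (not from the
article's example programs) where a coroutine that never awaits holds up every
other task on the loop:

```python
import asyncio
import time

order = []

async def greedy():
    order.append("greedy-start")
    time.sleep(0.05)  # a blocking call: control never returns to the loop
    order.append("greedy-end")

async def polite():
    order.append("polite")

async def main():
    # Both tasks get scheduled, but polite() cannot run until greedy()
    # cedes control -- which it only does by finishing entirely.
    await asyncio.gather(greedy(), polite())

asyncio.run(main())
print(order)  # ['greedy-start', 'greedy-end', 'polite']
```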