@@ -595,6 +595,178 @@ with C code to initialize global variables.
595 595 The actual ``lib.*()`` function calls should be obvious: it's like C.
596 596
597 597
598+ .. _thread-safety:
599+
600+ Thread Safety
601+ -------------
602+
603+ Multithreading can be a powerful but tricky way to exploit the many cores on
604+ modern CPUs. Combining CFFI with the Python `threading` module is a convenient
605+ way to use multithreaded parallelism with a C library.
606+
607+ On the GIL-enabled build, CFFI will release the GIL before calling into a C
608+ library. That means that it is possible to get multithreaded speedups using CFFI
609+ on both the free-threaded and GIL-enabled builds of Python. However, that also
610+ means that the GIL does not protect multithreaded shared use of C data
611+ structures exposed via FFI.
612+
613+ If the C library you are wrapping is not thread-safe, then it is not thread-safe
614+ to use the library via Python without adding some kind of locking. If the
615+ library *is* thread-safe, then no additional locking is necessary to ensure the
616+ thread safety of CFFI itself.
617+
618+ Let's make that concrete by wrapping some code that is not thread-safe due to
619+ use of a C global variable:
620+
621+ .. code-block:: python
622+
623+     from cffi import FFI
624+     ffibuilder = FFI()
625+
626+     ffibuilder.set_source("_thread_safety_example",
627+         r"""
628+         #include <stdint.h>
629+
630+         static int64_t value = 0;
631+         static int64_t increment(void) {
632+             value++;
633+             return value;
634+         }
635+         """,
636+         libraries=[]
637+     )
638+
639+     ffibuilder.cdef(r"""
640+         int64_t increment(void);
641+         """
642+     )
643+
644+     if __name__ == "__main__":
645+         ffibuilder.compile(verbose=True)
646+
647+ The way that ``increment`` uses the ``value`` global variable is not
648+ thread-safe. `Data races
649+ <https://en.wikipedia.org/wiki/Race_condition#Data_race>`_ are possible if two
650+ threads simultaneously call ``increment``. We can engineer that situation with a
651+ Python script that calls into the wrapper like so:
652+
653+ .. code-block:: python
654+
655+     import sys
656+
657+     from concurrent.futures import ThreadPoolExecutor, wait
658+     import threading
659+
660+     from _thread_safety_example import ffi, lib
661+
662+     # Make races more likely by switching threads more often
663+     # on the GIL-enabled build. This has no effect on the
664+     # free-threaded build.
665+     sys.setswitchinterval(.0000001)
666+
667+     N_WORKERS = 4
668+
669+     l = threading.Lock()
670+
671+     def work():
672+         lib.increment()
673+
674+     def run_thread_pool():
675+         with ThreadPoolExecutor(max_workers=N_WORKERS) as tpe:
676+             try:
677+                 futures = [tpe.submit(work) for _ in range(100000)]
678+                 # block until all work finishes
679+                 wait(futures)
680+             finally:
681+                 # check for exceptions in worker threads
682+                 [f.result() for f in futures]
683+
684+
685+     run_thread_pool()
686+
687+     print(lib.increment())
688+
689+ On the author's system, this script prints a different value on each run,
690+ with values ranging from 99960 to 99980. A race-free run would print 100001
691+ (100000 calls from the pool plus the final call in ``print``), so races lose a
692+ few dozen increments per run. The results you get will depend on your hardware,
693+ system configuration, and Python interpreter version.
694+
695+ Note that races are relatively rare. The CFFI bindings and Python interpreter
696+ add enough overhead that it is not very likely for two threads to simultaneously
697+ increment the static integer. This can make code *appear* to be sequentially
698+ consistent for small sample sizes, when it is in fact not consistent. See `this
699+ tutorial
700+ <https://github.com/facebookincubator/ft_utils/blob/main/docs/fine_grained_synchronization.md#understanding-the-gil>`_
701+ for more examples of how the GIL and Python overhead can mask thread safety
702+ issues that only manifest under production load.
703+
704+ We can make the above example script thread-safe by using a lock:
705+
706+ .. code-block:: python
707+
708+     l = threading.Lock()
709+
710+     def work():
711+         l.acquire()
712+         lib.increment()
713+         l.release()
714+
715+ The `threading.Lock` ensures only one thread can call into the wrapped C library
716+ at a time. Any thread that calls ``l.acquire()`` while another thread has
717+ already acquired the lock will block until the lock is released.
718+
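As a minimal sketch, the same locking pattern can be written with a ``with``
statement, which releases the lock even if the wrapped call raises. Here
``unsafe_increment`` and ``_value`` are hypothetical pure-Python stand-ins for
``lib.increment()`` and its C global, used only so the snippet runs without the
compiled extension; ``run`` is likewise an illustrative helper.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for lib.increment(): a read-modify-write on
# shared global state with no synchronization of its own.
_value = 0

def unsafe_increment():
    global _value
    current = _value
    _value = current + 1
    return _value

l = threading.Lock()

def work():
    # "with l:" acquires the lock on entry and always releases it on
    # exit, including when the call inside raises an exception.
    with l:
        return unsafe_increment()

def run(n_tasks=10000, n_workers=4):
    with ThreadPoolExecutor(max_workers=n_workers) as tpe:
        futures = [tpe.submit(work) for _ in range(n_tasks)]
        # result() blocks until the task finishes and re-raises any
        # exception from the worker thread.
        for f in futures:
            f.result()
    return _value
```

With the lock held around every call, no increments are lost, so ``run``
returns exactly the number of submitted tasks on its first call.
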
719+ Using a global lock like this is necessary if it is not safe for more than one
720+ thread to simultaneously call into any part of the library. This is the case if
721+ the library relies on global state that does not have any explicit
722+ synchronization. Libraries like this are not `re-entrant
723+ <https://en.wikipedia.org/wiki/Reentrancy_(computing)>`_.
724+
725+ Libraries that are re-entrant but not thread-safe are usually structured such
726+ that two threads can simultaneously use the library so long as the threads do
727+ not simultaneously mutate shared references to an object. For libraries like
728+ this you will want to use a per-object lock instead of a global lock. Keep in
729+ mind that any program that uses more than one lock can
730+ `deadlock <https://en.wikipedia.org/wiki/Deadlock_(computer_science)>`_, so
731+ take care to avoid situations where two threads wait on each other's locks.
732+
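One possible way to combine per-object locks with deadlock avoidance is
sketched below. It is an illustration under stated assumptions, not part of
CFFI: ``LockedObject`` and ``call_with_both`` are hypothetical names, and
``handle`` stands for whatever object (for example a ``cdata`` pointer) the
wrapped library operates on. The sketch always acquires the two locks in a
fixed order, so two threads that pass the same pair of objects in opposite
argument order cannot deadlock.

```python
import threading

class LockedObject:
    """Pairs a (hypothetical) library object handle with its own lock."""

    def __init__(self, handle):
        self.handle = handle
        self.lock = threading.Lock()

def call_with_both(src, dst, func):
    """Call func(src.handle, dst.handle) while holding both locks."""
    if src is dst:
        # threading.Lock is not re-entrant, so only acquire it once.
        with src.lock:
            return func(src.handle, dst.handle)
    # Acquire the locks in a fixed global order (here: by id()) no
    # matter which order the arguments were passed in.
    first, second = sorted((src, dst), key=id)
    with first.lock, second.lock:
        return func(src.handle, dst.handle)
```

Threads that work on unrelated objects never contend with each other, which is
the advantage of this approach over a single global lock.
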
733+ If it is a programming error for two threads to simultaneously share an object,
734+ you might acquire a `threading.Lock` object named ``l`` like this:
735+
736+ .. code-block:: python
737+
738+     if not l.acquire(blocking=False):
739+         raise RuntimeError("Multithreaded use is not supported")
740+
741+     # call into the unsafe library or use an unsafe object
742+
743+     l.release()
744+
745+ This prevents deadlocks, since ``l.acquire(blocking=False)`` returns ``False``
746+ immediately if the lock is already acquired by another thread.
747+
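The non-blocking pattern can also be packaged as a context manager so that the
lock is released even when the guarded code raises. This is a minimal sketch;
``exclusive`` is a hypothetical helper name, not something provided by CFFI or
the standard library.

```python
import threading
from contextlib import contextmanager

@contextmanager
def exclusive(lock):
    """Fail fast instead of blocking if another thread holds the lock."""
    if not lock.acquire(blocking=False):
        raise RuntimeError("Multithreaded use is not supported")
    try:
        # The body of the "with" block runs here, with the lock held.
        yield
    finally:
        # Runs whether the body returned normally or raised.
        lock.release()
```

A caller would then write ``with exclusive(l):`` around each call into the
unsafe library or each use of an unsafe object.
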
748+ If you know that the C library you are wrapping is thread-safe, no additional
749+ locking is necessary to make the CFFI bindings thread-safe. Please report thread
750+ safety bugs that you suspect are due to issues in the generated CFFI bindings.
751+
752+ If you publish CFFI bindings for a library, you should document the thread
753+ safety guarantees of your bindings. It may make sense to add locking to the
754+ bindings. Alternatively, you can clearly document that the bindings are not
755+ thread-safe and that users must ensure appropriate synchronization or
756+ exclusive access if they use the bindings from multiple threads.
757+
758+ See the Python free-threading guide page on `improving the thread safety of
759+ Python code
760+ <https://py-free-threading.github.io/porting/#thread-safety-of-pure-python-code>`_
761+ for more information about updating a Python library with thread safety in mind.
762+
763+ You can validate the thread safety of your library by running multithreaded
764+ tests using `Thread Sanitizer
765+ <https://clang.llvm.org/docs/ThreadSanitizer.html>`_. See the Python
766+ free-threading guide page on `using Thread Sanitizer to detect thread safety
767+ issues <https://py-free-threading.github.io/thread_sanitizer/>`_ for more
768+ details.
769+
598 770 .. _abi-versus-api:
599 771
600 772 ABI versus API