Commit e94a7b6

ngoldbaum and mattip authored
Add CFFI thread safety docs (#188)
* Add CFFI thread safety docs
* Delete incorrect statements
* Add more links, examples, and suggestions about TSan
* fix indentation in code example
* Update doc/source/overview.rst

Co-authored-by: Matti Picus <[email protected]>
1 parent 24e42cb commit e94a7b6

1 file changed: +172 -0 lines changed

doc/source/overview.rst

Lines changed: 172 additions & 0 deletions
@@ -595,6 +595,178 @@ with C code to initialize global variables.
The actual ``lib.*()`` function calls should be obvious: it's like C.

.. _thread-safety:

Thread Safety
-------------

Multithreading can be a powerful but tricky way to exploit the many cores on
modern CPUs. Combining CFFI with the Python ``threading`` module is a convenient
way to use multithreaded parallelism with a C library.

On the GIL-enabled build, CFFI will release the GIL before calling into a C
library. That means that it is possible to get multithreaded speedups using CFFI
on both the free-threaded and GIL-enabled builds of Python. However, that also
means that the GIL does not protect multithreaded shared use of C data
structures exposed via FFI.

If the C library you are wrapping is not thread-safe, then it is not thread-safe
to use the library via Python without adding some kind of locking. If the
library *is* thread-safe, then no additional locking is necessary to ensure the
thread safety of CFFI itself.

Let's make that concrete by wrapping some code that is not thread-safe due to
use of a C global variable:

.. code-block:: python

    from cffi import FFI
    ffibuilder = FFI()

    ffibuilder.set_source("_thread_safety_example",
        r"""
        #include <stdint.h>

        static int64_t value = 0;
        static int64_t increment(void) {
            value++;
            return value;
        }
        """,
        libraries=[]
    )

    ffibuilder.cdef(r"""
        int64_t increment(void);
        """
    )

    if __name__ == "__main__":
        ffibuilder.compile(verbose=True)

The way that ``increment`` uses the ``value`` global variable is not
thread-safe. `Data races
<https://en.wikipedia.org/wiki/Race_condition#Data_race>`_ are possible if two
threads simultaneously call ``increment``. We can engineer that situation with a
Python script that calls into the wrapper like so:

.. code-block:: python

    import sys

    from concurrent.futures import ThreadPoolExecutor, wait
    import threading

    from _thread_safety_example import ffi, lib

    # Make races more likely by switching threads more often
    # on the GIL-enabled build. This has no effect on the
    # free-threaded build.
    sys.setswitchinterval(.0000001)

    N_WORKERS = 4

    l = threading.Lock()

    def work():
        lib.increment()

    def run_thread_pool():
        with ThreadPoolExecutor(max_workers=N_WORKERS) as tpe:
            try:
                futures = [tpe.submit(work) for _ in range(100000)]
                # block until all work finishes
                wait(futures)
            finally:
                # check for exceptions in worker threads
                [f.result() for f in futures]


    run_thread_pool()

    print(lib.increment())

On the author's system, this script prints a different result on each run, with
values ranging from 99960 to 99980, indicating that races happen a few dozen
times over the hundred thousand calls. The results you get will depend on your
hardware, system configuration, and Python interpreter version.

Note that races are relatively rare. The CFFI bindings and Python interpreter
add enough overhead that it is not very likely for two threads to simultaneously
increment the static integer. This can make code *appear* to be sequentially
consistent for small sample sizes, when it is in fact not consistent. See `this
tutorial
<https://github.com/facebookincubator/ft_utils/blob/main/docs/fine_grained_synchronization.md#understanding-the-gil>`_
for more examples of how the GIL and Python overhead can mask thread safety
issues that only manifest under production load.

We can make the above example script thread-safe by using a lock:

.. code-block:: python

    l = threading.Lock()

    def work():
        l.acquire()
        try:
            lib.increment()
        finally:
            # release the lock even if the call raises
            l.release()

The ``threading.Lock`` ensures only one thread can call into the wrapped C
library at a time. Any thread that calls ``l.acquire()`` while another thread
has already acquired the lock will block until the lock is released.

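The same pattern can also be written with the lock as a context manager. The
sketch below is a pure-Python stand-in and not the compiled CFFI module from
above: ``Counter`` and ``run`` are hypothetical names, and ``Counter.increment``
only imitates the unsynchronized read-modify-write done by the C code, so that
the block runs without building an extension.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class Counter:
    """Hypothetical pure-Python stand-in for the C global ``value``."""

    def __init__(self):
        self.value = 0

    def increment(self):
        # Separate read and write steps, like ``value++`` in C.
        current = self.value
        self.value = current + 1
        return self.value

def run(n_calls=10_000, n_workers=4):
    counter = Counter()
    lock = threading.Lock()

    def work():
        # ``with lock`` acquires on entry and releases on exit,
        # even if the body raises.
        with lock:
            counter.increment()

    with ThreadPoolExecutor(max_workers=n_workers) as tpe:
        futures = [tpe.submit(work) for _ in range(n_calls)]
        for f in futures:
            f.result()  # re-raise any exception from a worker
    return counter.value
```

Because every call to ``increment`` happens while the lock is held, ``run()``
returns exactly ``n_calls``.
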
Using a global lock like this is necessary if it is not safe for more than one
thread to simultaneously call into any part of the library. This is the case if
the library relies on global state that does not have any explicit
synchronization. Libraries like this are not `re-entrant
<https://en.wikipedia.org/wiki/Reentrancy_(computing)>`_.

Libraries that are re-entrant but not thread-safe are usually structured such
that two threads can simultaneously use the library so long as the threads do
not simultaneously mutate shared references to an object. For libraries like
this you will want to use a per-object lock instead of a global lock. Keep in
mind in this case that any program with more than one lock can lead to a
`deadlock <https://en.wikipedia.org/wiki/Deadlock_(computer_science)>`_ and care
must be taken to avoid situations where two threads can deadlock.

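As a rough illustration of per-object locking, the sketch below gives each
wrapper object its own lock. ``Account`` and ``transfer`` are hypothetical
names, and the ``balance`` attribute stands in for state that would live on the
C side. Acquiring the locks in a fixed order is one common way to avoid
deadlocks when an operation needs two objects at once. It is shown here as an
assumption about how such bindings might be structured, not as something CFFI
provides.

```python
import threading

class Account:
    """Hypothetical wrapper around one object owned by a C library."""

    def __init__(self, balance):
        self.balance = balance        # stand-in for C-side state
        self.lock = threading.Lock()  # one lock per object

    def deposit(self, amount):
        with self.lock:
            self.balance += amount

def transfer(src, dst, amount):
    if src is dst:
        # Nothing to do, and taking the same non-reentrant lock
        # twice would block forever.
        return
    # Always take the two locks in the same order (here: by id()),
    # so opposite-direction transfers cannot deadlock each other.
    first, second = sorted((src, dst), key=id)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount
```

Threads that work on unrelated ``Account`` objects never contend for the same
lock, which is the advantage over a single global lock.
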
If it is a programming error for two threads to simultaneously share an object,
you might acquire a ``threading.Lock`` object named ``l`` like this:

.. code-block:: python

    if not l.acquire(blocking=False):
        raise RuntimeError("Multithreaded use is not supported")

    try:
        # call into the unsafe library or use an unsafe object
        pass
    finally:
        l.release()

This prevents deadlocks, since ``l.acquire(blocking=False)`` returns ``False``
immediately, rather than blocking, if the lock is already held by another
thread.

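If this check is needed in many places, it could be wrapped in a small helper.
The sketch below is one possible shape for that: ``exclusive`` and
``use_library`` are hypothetical names that are not part of CFFI, and the
``return`` value is a placeholder for a real call into the wrapped library.

```python
import threading
from contextlib import contextmanager

@contextmanager
def exclusive(lock):
    """Hold ``lock`` for the block, or raise if it is already held."""
    if not lock.acquire(blocking=False):
        raise RuntimeError("Multithreaded use is not supported")
    try:
        yield
    finally:
        # Runs whether or not the block raised.
        lock.release()

l = threading.Lock()

def use_library():
    with exclusive(l):
        # call into the unsafe library or use an unsafe object
        return "done"
```

The lock is only released by the helper when it was actually acquired, so a
failed ``acquire`` never releases a lock held by another thread.
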
If you know that the C library you are wrapping is thread-safe, no additional
locking is necessary to make the CFFI bindings thread-safe. Please report thread
safety bugs that you suspect are due to issues in the generated CFFI bindings.

If you publish CFFI bindings for a library, you should document the thread
safety guarantees of your bindings. It may make sense to add locking into the
bindings, but it might also make sense to clearly document that the bindings are
not thread-safe and that it is up to users to ensure appropriate synchronization
or exclusive access if they want to use the bindings in a thread pool.

See the Python free-threading guide page on `improving the thread safety of
Python code
<https://py-free-threading.github.io/porting/#thread-safety-of-pure-python-code>`_
for more information about updating a Python library with thread safety in mind.

You can validate the thread safety of your library by running multithreaded
tests using `Thread Sanitizer
<https://clang.llvm.org/docs/ThreadSanitizer.html>`_. See the Python
free-threading guide page on `using Thread Sanitizer to detect thread safety
issues <https://py-free-threading.github.io/thread_sanitizer/>`_ for more
details.

.. _abi-versus-api:

ABI versus API
