os.path.isfile returns False for a file that exists in a multithreaded workflow #140054

@ngoldbaum

Consider the following test script, which I've adapted from a NumPy test. We are working on running the NumPy unit tests under pytest-run-parallel to find thread safety issues.

from concurrent.futures import ThreadPoolExecutor
import threading
import os
from pathlib import Path


def run_threaded(
    func,
    num_threads=8,
    pass_count=False,
    pass_barrier=False,
    outer_iterations=1,
    prepare_args=None,
):
    """Runs a function many times in parallel"""
    for _ in range(outer_iterations):
        with ThreadPoolExecutor(max_workers=num_threads) as tpe:
            if prepare_args is None:
                args = []
            else:
                args = prepare_args()
            if pass_barrier:
                barrier = threading.Barrier(num_threads)
                args.append(barrier)
            if pass_count:
                all_args = [(func, i, *args) for i in range(num_threads)]
            else:
                all_args = [(func, *args) for i in range(num_threads)]
            try:
                futures = []
                for arg in all_args:
                    futures.append(tpe.submit(*arg))
            finally:
                if len(futures) < num_threads and pass_barrier:
                    barrier.abort()
            for f in futures:
                f.result()

ROOT = Path(__file__).parents[0]
FILES = [
    ROOT / "py.typed",
    ROOT / "__init__.pyi",
    ROOT / "ctypeslib" / "__init__.pyi",
    ROOT / "_core" / "__init__.pyi",
    ROOT / "f2py" / "__init__.pyi",
    ROOT / "fft" / "__init__.pyi",
    ROOT / "lib" / "__init__.pyi",
    ROOT / "linalg" / "__init__.pyi",
    ROOT / "ma" / "__init__.pyi",
    ROOT / "matrixlib" / "__init__.pyi",
    ROOT / "polynomial" / "__init__.pyi",
    ROOT / "random" / "__init__.pyi",
    ROOT / "testing" / "__init__.pyi",
]

def closure(b):
    """Test if all ``.pyi`` files are properly installed."""
    b.wait()
    for file in FILES:
        assert(os.path.isfile(file)), file

run_threaded(closure, num_threads=4, pass_barrier=True)

You can set up the necessary directory structure either by unzipping the attached tarball or inserting the following code before the run_threaded call at the end of the script:

for file in FILES:
    file.parent.mkdir(exist_ok=True, parents=True)
    file.touch()

On my Arm64 M3 MacBook Pro, running free-threaded CPython 3.14.0, the script sometimes fails with the following assertion error:

Traceback (most recent call last):
  File "/Users/goldbaum/Documents/test/cpython_bug/test.py", line 62, in <module>
    run_threaded(closure, num_threads=4, pass_barrier=True)
    ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/goldbaum/Documents/test/cpython_bug/test.py", line 37, in run_threaded
    f.result()
    ~~~~~~~~^^
  File "/Users/goldbaum/.pyenv/versions/3.14.0t/lib/python3.14t/concurrent/futures/_base.py", line 443, in result
    return self.__get_result()
           ~~~~~~~~~~~~~~~~~^^
  File "/Users/goldbaum/.pyenv/versions/3.14.0t/lib/python3.14t/concurrent/futures/_base.py", line 395, in __get_result
    raise self._exception
  File "/Users/goldbaum/.pyenv/versions/3.14.0t/lib/python3.14t/concurrent/futures/thread.py", line 86, in run
    result = ctx.run(self.task)
  File "/Users/goldbaum/.pyenv/versions/3.14.0t/lib/python3.14t/concurrent/futures/thread.py", line 73, in run
    return fn(*args, **kwargs)
  File "/Users/goldbaum/Documents/test/cpython_bug/test.py", line 60, in closure
    assert(os.path.isfile(file)), file
           ~~~~~~~~~~~~~~^^^^^^
AssertionError: /Users/goldbaum/Documents/test/cpython_bug/ma/__init__.pyi

Sometimes the path reported by the AssertionError is . rather than one of the paths in the FILES list.

The error doesn't happen if I call run_threaded with num_threads=1. I can also avoid the error by adding print(FILES) before the call to run_threaded.
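For anyone who needs a stopgap, here is a minimal sketch of a more explicit version of the print(FILES) workaround. It assumes, without confirmation, that printing helps because it converts every Path to a string on the main thread before the workers start. The temporary directory layout and the names check_all, warmed, and missing are hypothetical and not part of the original script.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def check_all(paths):
    """Return the entries for which os.path.isfile() is False."""
    return [p for p in paths if not os.path.isfile(p)]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical stand-in for the FILES list in the report.
    files = [root / name / "__init__.pyi" for name in ("fft", "lib", "ma")]
    for f in files:
        f.parent.mkdir(parents=True, exist_ok=True)
        f.touch()

    # Convert each Path to a plain string once, on the main thread,
    # so the worker threads never share a Path object.
    warmed = [os.fspath(f) for f in files]

    with ThreadPoolExecutor(max_workers=4) as tpe:
        results = list(tpe.map(check_all, [warmed] * 4))

# Flatten the per-thread results; an empty list means every file was found.
missing = [p for per_thread in results for p in per_thread]
```

This only sidesteps the symptom by keeping shared Path objects out of the threads; it says nothing about where the underlying race is.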

Maybe this is a thread-safety bug in pathlib's caching?
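A rough way to probe that guess without touching the filesystem is to call str() on freshly built, shared Path objects from several threads at once and record any value that differs from the expected string. This is only a sketch: stress_str, the some_root directory name, and the round count are hypothetical, and an empty result does not rule out a race.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def stress_str(num_threads=4, rounds=200):
    """Call str() on shared Path objects from several threads at once.

    Returns a list of (expected, observed) pairs where the two differ.
    """
    mismatches = []
    lock = threading.Lock()
    names = ("fft", "lib", "ma")
    for _ in range(rounds):
        # Build fresh objects each round so no string form is cached yet.
        root = Path("some_root")  # hypothetical; never created on disk
        paths = [root / name / "__init__.pyi" for name in names]
        expected = [os.path.join("some_root", name, "__init__.pyi") for name in names]
        barrier = threading.Barrier(num_threads)

        def worker():
            # Release all threads together to maximize contention.
            barrier.wait()
            for path, want in zip(paths, expected):
                got = str(path)
                if got != want:
                    with lock:
                        mismatches.append((want, got))

        with ThreadPoolExecutor(max_workers=num_threads) as tpe:
            futures = [tpe.submit(worker) for _ in range(num_threads)]
            for f in futures:
                f.result()
    return mismatches


if __name__ == "__main__":
    print(stress_str())
```

If this ever reports a mismatch on a free-threaded build, that would point at pathlib rather than at os.path.isfile itself.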

cpython_bug.tar.gz

Labels: stdlib (Standard Library Python modules in the Lib/ directory), type-bug (An unexpected behavior, bug, or error)