Parallel import of modules causes a ModuleNotFoundError #130094

@eendebakpt

Bug report

Bug description:

With the free-threading build, importing modules in parallel from multiple threads causes a ModuleNotFoundError. The module reported as missing differs from run to run. A minimal reproducer:

from threading import Thread, Barrier
from importlib import import_module
import time


# Setup threads
t0 = time.perf_counter()
number_of_threads = 8
barrier = Barrier(number_of_threads)

pmods = []


def work(ii):
    # Worker: wait until all threads are ready, then import modules
    # from the shared list until it is empty.
    barrier.wait()
    while True:
        try:
            m = pmods.pop()
        except IndexError:
            return
        import_module(m)
        # print(f'  {ii}: {m} done')


worker_threads = []
for ii in range(number_of_threads):
    worker_threads.append(Thread(target=work, args=[ii]))

dt = time.perf_counter() - t0
print(f"setup threads {dt*1e3:.2f} ms")


def parallel_import(modules: list[str]):
    global pmods
    pmods += modules
    for t in worker_threads:
        t.start()
    for t in worker_threads:
        t.join()


def seq_import(modules: list[str]):
    # Sequential baseline: import everything from the main thread
    for m in modules:
        import_module(m)


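# Select the import strategy: the second assignment overrides the first.
# Comment out the parallel_import line to run the sequential baseline.
import_method = seq_import
import_method = parallel_import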

mods = [
    "abc",
    "functools",
    "weakref",
    "linecache",
    "glob",
    "annotationlib",
    "argparse",
    "fnmatch",
    "itertools",
    "operator",
    "string",
    "re",
    "collections",
    "sqlite3",
    "pathlib",
    "urllib",
    "typing",
    "csv",
    "uuid",
]

mods += [
    "setuptools",
    "sympy",
    "django",
    "boto3",
]  # these packages need to be installed; without them, reproducing the issue is much harder

t0 = time.perf_counter()
import_method(mods)
dt = time.perf_counter() - t0
print(f"{import_method.__name__} {dt*1e3:.2f} ms")

Note: this script fails on my system about 1 in 4 times. Changing the modules imported, their order, or the number of threads can affect how often it fails.
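Because the failure is intermittent, it can help to run the reproducer many times and count the failures. Each run needs a fresh interpreter, since a module that has already been imported is served from `sys.modules`. The following is a minimal sketch, not part of the original report: `count_failures` is a hypothetical helper, and it assumes the reproducer is available as a source string. An uncaught exception in a worker thread does not change the process exit status, so the sketch also looks for `ModuleNotFoundError` in stderr.

```python
import subprocess
import sys


def count_failures(script_source, runs=20):
    """Run script_source in `runs` fresh interpreters and count failing runs.

    Hypothetical helper for illustration. A run counts as failed when the
    process exits non-zero or when a ModuleNotFoundError traceback appears
    on stderr (worker-thread exceptions leave the exit status at 0).
    """
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            [sys.executable, "-c", script_source],
            capture_output=True,
            text=True,
        )
        if result.returncode != 0 or "ModuleNotFoundError" in result.stderr:
            failures += 1
    return failures
```

For example, reading the reproducer from a file and calling `count_failures(source, runs=40)` gives a rough failure rate to compare across thread counts or module lists.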

Modules that have failed to import: collections.abc, glob, sympy, django.utils.regex_helper. This suggests the issue is general and not specific to a particular package.
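To see which module fails in a given run, the worker can catch the import error and record it instead of letting the thread die. This is a sketch under stated assumptions, not the code from the report: `parallel_import_collect` and its parameters are hypothetical names, and it uses the same barrier-and-shared-list pattern as the reproducer.

```python
from importlib import import_module
from threading import Barrier, Lock, Thread


def parallel_import_collect(modules, number_of_threads=8):
    """Import `modules` from several threads.

    Returns a list of (module_name, exception) pairs for imports that failed.
    Hypothetical helper for illustration only.
    """
    pending = list(modules)      # private copy shared by the workers
    failures = []
    lock = Lock()                # guards `failures`
    barrier = Barrier(number_of_threads)

    def work():
        barrier.wait()           # release all workers at the same time
        while True:
            try:
                name = pending.pop()
            except IndexError:
                return           # nothing left to import
            try:
                import_module(name)
            except ImportError as exc:  # ModuleNotFoundError is a subclass
                with lock:
                    failures.append((name, exc))

    threads = [Thread(target=work) for _ in range(number_of_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return failures


if __name__ == "__main__":
    for name, exc in parallel_import_collect(["abc", "glob", "csv"]):
        print(f"{name}: {exc!r}")
```

Note that the name recorded is the module requested by the worker, while the exception message names the module that could not be found, which may be a submodule or a dependency.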

CPython versions tested on:

CPython main branch

Operating systems tested on:

Windows
