Multiprocessing leads to Module Not Found error if absolute imports are used in modules #353

I have a GitHub [repo](https://github.com/kitzeslab/bioacoustics-model-zoo) where I define some models and a hubconf.py file for access via the torch.hub.load() API. The models work fine without multiprocessing (i.e. num_workers=0 for the DataLoader), but fail with an error about pickling and Module Not Found when two conditions are both true: (1) num_workers>0 and (2) the module in which the model class is defined uses an absolute import of another module in my repo.

For example, with the structure:
```
my_repo/
    utils.py
    a/
        model_a.py # contains class ModelA
    ...
```

If model_a.py contains `from my_repo import utils`, and I import ModelA and try to use num_workers>0 in a DataLoader, the error looks like this:
```
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File ".../python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File ".../python3.10/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
ModuleNotFoundError: No module named 'my_repo'
```
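This makes some sense, because my_repo was never installed as a package when ModelA was loaded. The traceback points at `reduction.pickle.load` in the spawned worker, so it seems that multiprocessing re-imports modules while unpickling objects in the child process and cannot find my_repo there.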
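To illustrate what I think is going on, here is a minimal sketch that reproduces the same kind of failure without torch or multiprocessing. It relies on the fact that pickle stores an instance's class by module name, so unpickling has to import that module again. The package name `fake_repo_demo` and the helper names are hypothetical stand-ins for my_repo, and I am assuming (not having verified) that the spawned DataLoader worker ends up in an equivalent situation.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

PKG = "fake_repo_demo"  # hypothetical stand-in for my_repo


def _forget(pkg):
    """Drop the package from sys.modules, mimicking a fresh interpreter."""
    for name in list(sys.modules):
        if name == pkg or name.startswith(pkg + "."):
            del sys.modules[name]
    importlib.invalidate_caches()


def demo_unpickle_needs_import():
    """Return (error message without the path, class name with the path)."""
    with tempfile.TemporaryDirectory() as tmp:
        # Build a tiny package on disk that is not installed anywhere.
        pkg_dir = Path(tmp) / PKG
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("")
        (pkg_dir / "model_a.py").write_text("class ModelA:\n    pass\n")

        # "Parent": the package is importable only while tmp is on sys.path.
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        module = importlib.import_module(PKG + ".model_a")
        payload = pickle.dumps(module.ModelA())
        sys.path.remove(tmp)
        _forget(PKG)

        # "Child" without the path: unpickling must import the module and fails.
        try:
            pickle.loads(payload)
            error = None
        except ModuleNotFoundError as exc:
            error = str(exc)

        # "Child" with the path restored: the same bytes unpickle fine.
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            class_name = type(pickle.loads(payload)).__name__
        finally:
            sys.path.remove(tmp)
            _forget(PKG)
        return error, class_name


if __name__ == "__main__":
    print(demo_unpickle_needs_import())
```

If this is really the mechanism, then whether the worker can find the package should depend only on whether the repo's parent directory is on the import path in the child process, not on how the import statement is written.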

Edit: I thought relative imports might be a workaround, but they don't fix the issue.

Is there a solution to this? Thanks! 
