Open
Labels: bug (Something isn't working)
Description
When you use conv2d from PyTorch with parallel_save, it hangs. This was first seen in CI for PR #42. That PR did not cause it: I had fixed the tests there, because the "parallel" test case for conv2d was calling the save method rather than parallel_save, and the fix exposed a bug that was already present in the original implementation.
The tests pass on macOS but fail on Linux. If we interrupt the hung run, we see the following message:
This process (pid=366448) is multi-threaded, use of fork() may lead to deadlocks in the child.
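This is consistent with the hang that we see. My assumption is that PyTorch also uses parallelism (threads) under the hood, and forking a multi-threaded process is a known hazard: the child gets a copy of only the thread that called fork(), so any lock held by another thread at that moment stays locked forever in the child. It would also fit the platform difference, since Python's multiprocessing defaults to the spawn start method on macOS and has traditionally defaulted to fork on Linux.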
I'll revert the "fix" to the tests in PR #42, so this issue tracks fixing the underlying hang. Options:
- Change the start method for multiprocessing (e.g. from fork to spawn)?
- Use PyTorch multiprocessing?
- Detect if conv2d is used in an expression, and if so revert to non-parallel save/sum?
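
To make the first option concrete, here is a minimal sketch of requesting the spawn start method through a dedicated context, using only the standard library. The helper names `get_spawn_context` and `map_in_workers` are hypothetical and stand in for whatever pool setup parallel_save actually does, which this issue doesn't show.

```python
import multiprocessing as mp


def get_spawn_context():
    """Return a multiprocessing context that starts workers with "spawn".

    "spawn" launches a fresh interpreter for each worker instead of calling
    fork(), so the child does not inherit locks held by other threads of a
    multi-threaded parent.
    """
    return mp.get_context("spawn")


def map_in_workers(func, items, workers=2):
    """Hypothetical stand-in for the pool setup inside parallel_save.

    func must be defined at module level so it can be pickled and
    re-imported by the spawned workers.
    """
    ctx = get_spawn_context()
    with ctx.Pool(processes=workers) as pool:
        return pool.map(func, items)
```

Using a context rather than `multiprocessing.set_start_method` keeps the choice local to this code path and does not change the global default for callers. The trade-offs are that spawned workers start more slowly than forked ones, everything passed to them must be picklable, and scripts that call `map_in_workers` need the usual `if __name__ == "__main__":` guard.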