Closed
### Describe the issue:
I get failures near the beginning of sampling when running models on Linux. They are coming from the multiprocessing library. I can usually work around them by simply restarting the sampler, or using a different seed, but they are happening with increasing frequency in v5.16.
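For reference, the seed-changing workaround can be sketched as a small retry helper. This is only a minimal sketch and not something PyMC provides: `sample_with_retries`, `sample_fn`, and `seeds` are hypothetical names introduced here, and the helper assumes the failure surfaces as a `ConnectionResetError` in the parent process, as in the traceback in this report. It hides the symptom rather than fixing the underlying cause.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def sample_with_retries(sample_fn: Callable[[int], T], seeds: Iterable[int]) -> T:
    """Call sample_fn(seed) with each seed in turn until one call succeeds.

    Only ConnectionResetError triggers a retry; any other exception propagates.
    """
    last_error = None
    for seed in seeds:
        try:
            return sample_fn(seed)
        except ConnectionResetError as err:
            # The pipe to a worker process was reset; try the next seed.
            last_error = err
    if last_error is None:
        raise ValueError("seeds must contain at least one value")
    raise RuntimeError("sampling failed for every seed") from last_error


# Hypothetical usage with the pm.sample call from this report
# (requires PyMC plus the `model` and `SEED` objects from the notebook):
#
# with model:
#     trace = sample_with_retries(
#         lambda seed: pm.sample(random_seed=seed, step=pm.Metropolis(), tune=5000),
#         seeds=[SEED, SEED + 1, SEED + 2],
#     )
```

Passing the sampler as a callable keeps the helper independent of PyMC, so the retry logic can be checked with a stand-in function before it is used on a real model.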
### Reproducible code example:
The model in the following notebook fails every time on my Linux laptop:
https://gist.github.com/fonnesbeck/d4b8da1f74a1a790892d774b7484ecfa

### Error message:
---------------------------------------------------------------------------
ConnectionResetError                      Traceback (most recent call last)
Cell In[13], line 2
      1 with model:
----> 2     trace = pm.sample(random_seed=SEED, step=pm.Metropolis(), tune=5000)

File ~/repos/pymc-examples/.pixi/envs/default/lib/python3.12/site-packages/pymc/sampling/mcmc.py:846, in sample(draws, tune, chains, cores, random_seed, progressbar, progressbar_theme, step, var_names, nuts_sampler, initvals, init, jitter_max_retries, n_init, trace, discard_tuned_samples, compute_convergence_checks, keep_warning_stat, return_inferencedata, idata_kwargs, nuts_sampler_kwargs, callback, mp_ctx, blas_cores, model, **kwargs)
    844 _print_step_hierarchy(step)
    845 try:
--> 846     _mp_sample(**sample_args, **parallel_args)
    847 except pickle.PickleError:
    848     _log.warning("Could not pickle model, sampling singlethreaded.")

File ~/repos/pymc-examples/.pixi/envs/default/lib/python3.12/site-packages/pymc/sampling/mcmc.py:1259, in _mp_sample(draws, tune, step, chains, cores, random_seed, start, progressbar, progressbar_theme, traces, model, callback, blas_cores, mp_ctx, **kwargs)
   1257 try:
   1258     with sampler:
-> 1259         for draw in sampler:
   1260             strace = traces[draw.chain]
   1261             strace.record(draw.point, draw.stats)

File ~/repos/pymc-examples/.pixi/envs/default/lib/python3.12/site-packages/pymc/sampling/parallel.py:471, in ParallelSampler.__iter__(self)
    464 task = progress.add_task(
    465     self._desc.format(self),
    466     completed=self._completed_draws,
    467     total=self._total_draws,
    468 )
    470 while self._active:
--> 471     draw = ProcessAdapter.recv_draw(self._active)
    472     proc, is_last, draw, tuning, stats = draw
    473     self._completed_draws += 1

File ~/repos/pymc-examples/.pixi/envs/default/lib/python3.12/site-packages/pymc/sampling/parallel.py:328, in ProcessAdapter.recv_draw(processes, timeout)
    326 idxs = {id(proc._msg_pipe): proc for proc in processes}
    327 proc = idxs[id(ready[0])]
--> 328 msg = ready[0].recv()
    330 if msg[0] == "error":
    331     old_error = msg[1]

File ~/repos/pymc-examples/.pixi/envs/default/lib/python3.12/multiprocessing/connection.py:250, in _ConnectionBase.recv(self)
    248 self._check_closed()
    249 self._check_readable()
--> 250 buf = self._recv_bytes()
    251 return _ForkingPickler.loads(buf.getbuffer())

File ~/repos/pymc-examples/.pixi/envs/default/lib/python3.12/multiprocessing/connection.py:430, in Connection._recv_bytes(self, maxsize)
    429 def _recv_bytes(self, maxsize=None):
--> 430     buf = self._recv(4)
    431     size, = struct.unpack("!i", buf.getvalue())
    432     if size == -1:

File ~/repos/pymc-examples/.pixi/envs/default/lib/python3.12/multiprocessing/connection.py:395, in Connection._recv(self, size, read)
    393 remaining = size
    394 while remaining > 0:
--> 395     chunk = read(handle, remaining)
    396     n = len(chunk)
    397     if n == 0:

ConnectionResetError: [Errno 104] Connection reset by peer

### PyMC version information:
PyMC 5.16.2
PyTensor 2.25.4
Python 3.12.5
OS Fedora 40