Worker processes crash on Manager proxy access after hub completes full exploration pass

Hi!

It seems there may be a bug in `FuzzWorkerHub.start()` (`hypofuzz.py`) that causes worker processes to crash with an unhandled exception when HypoFuzz finishes a full exploration pass, i.e., when it prints "Found a failing input for every test!" and exits.

**What is observed**

When all tests have been exhausted, HypoFuzz exits with code 1 and the worker subprocess (Process-2) prints an unhandled traceback. The exception class differs by Python version:

**Python 3.13.8 — BrokenPipeError:**


```
  File "…/hypofuzz/hypofuzz.py", line 694, in start
    worker_state["expected_lifetime"] = None
  File "…/multiprocessing/managers.py", line 830, in _callmethod
    conn.send((self._id, methodname, args, kwds))
  …
BrokenPipeError: [Errno 32] Broken pipe
Found a failing input for every test!
```

**Python 3.14.3 — FileNotFoundError:**


```
  File "…/hypofuzz/hypofuzz.py", line 610, in start
    self._update_targets(self.shared_state["hub_state"]["nodeids"])
  File "…/multiprocessing/managers.py", line 832, in _callmethod
    kind, result = conn.recv()
  …
  File "…/multiprocessing/managers.py", line 863, in _incref
    conn = self._Client(self._token.address, authkey=self._authkey)
  …
FileNotFoundError: [Errno 2] No such file or directory
Found a failing input for every test!
```

The crash reproduces on both versions. In both cases the traceback originates in `FuzzWorker.start()` (called from `_start_worker`) while it is accessing a Manager proxy object.

**Why this might be happening**

Looking at `FuzzWorkerHub.start()`, the poll loop breaks when all workers report empty `valid_nodeids`, and control then leaves the `with Manager() as manager:` block. Our reading of the code suggests that `Manager.__exit__()` fires at that point, closing the IPC socket, while the worker processes may still be running and attempting to access shared state through proxy objects. That timing gap is what we suspect is behind the crash, though we may be missing something and invite the maintainers to look more carefully.

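The difference in exception class between Python versions (write failure on 3.13 vs. connect failure on 3.14) may reflect a change in how `Manager.__exit__()` cleans up its socket between the two releases, but we are not certain of that either. The two tracebacks also fail at different call sites (line 694 vs. line 610 of `start`; a `conn.send()` on an existing connection vs. a new connection opened in `_incref`), so the difference could equally come down to which proxy access happens to run first after shutdown.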
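To make the suspected mechanism concrete, here is a minimal sketch that is independent of HypoFuzz: it touches a `Manager` dict proxy after the `with` block has exited. It runs in a single process rather than in a separate worker, so it only illustrates the proxy behaviour, not the exact race. The function name `access_proxy_after_shutdown` and the `warm_connection` flag are made up for illustration, and the `fork` start method is an assumption chosen only so the snippet runs without a `__main__` guard (POSIX-only).

```python
import multiprocessing
from typing import Optional


def access_proxy_after_shutdown(warm_connection: bool) -> Optional[Exception]:
    """Return the exception raised by a proxy access made after Manager exit."""
    # Assumption: "fork" is requested only to keep this sketch self-contained.
    ctx = multiprocessing.get_context("fork")
    with ctx.Manager() as manager:
        shared = manager.dict()
        if warm_connection:
            # Use the proxy once so this thread already holds an open
            # connection to the manager server before it shuts down.
            shared["expected_lifetime"] = 0
    # Manager.__exit__() has run here: the server process has been shut down.
    try:
        shared["expected_lifetime"] = None
    except (OSError, EOFError) as exc:
        # BrokenPipeError and FileNotFoundError are both OSError subclasses.
        return exc
    return None


if __name__ == "__main__":
    for warm in (True, False):
        exc = access_proxy_after_shutdown(warm)
        print(f"warm_connection={warm}: {type(exc).__name__}")
```

The exact exception class may depend on whether the proxy already had an open connection and on the platform, which is why the sketch reports the class name instead of asserting a specific one.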

**Reproduction**

Reproduces in ~65 seconds with a single always-failing test. Attached is a minimal two-file reproduction:

- `test_repro.py`: a Hypothesis test that always raises, so HypoFuzz exhausts `valid_nodeids` on the first hub check cycle (after its 60-second sleep)
- `run_repro.sh`: runs HypoFuzz against the test on both Python versions and reports the exception class observed on each

**Environment**

- HypoFuzz: 25.11.01
- Hypothesis: 6.148.7 (Python 3.13) / 6.151.9 (Python 3.14)
- Python: 3.13.8 and 3.14.3
- OS: macOS (Apple Silicon)

[test_repro.py](https://github.com/user-attachments/files/26112033/test_repro.py)

[run_repro.sh](https://github.com/user-attachments/files/26113636/run_repro.sh)

Worker processes crash on Manager proxy access after hub completes full exploration pass #246

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Worker processes crash on Manager proxy access after hub completes full exploration pass #246

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions