mDNS discovery: nodes visible but fail to establish connections #1305

## Description

On a 4-node M3 Ultra Mac Studio cluster, the nodes discover each other via mDNS but fail to establish the connections needed for distributed inference.

## Environment
- **OS:** macOS Sequoia 15.x
- **Hardware:** 4x M3 Ultra Mac Studios (192GB RAM each)
- **Network:** Same LAN, mDNS enabled
- **Node names:** cluster-1, cluster-3, cluster-4, cluster-6

## Observed Behavior

1. Start EXO on all 4 nodes
2. Nodes appear in each other's peer lists (mDNS discovery works)
3. When distributed inference is initiated, connections between nodes fail or time out
4. Work is not distributed; only the local node processes the request

## Expected Behavior

Once nodes discover each other, they should successfully establish connections and distribute inference workload.

## Debug Observations

When investigating the discovery mechanism in `exo/networking/discovery.rs`, we noticed the TTL configuration:

```rust
Duration::from_secs(2_500)  // 2,500 seconds, i.e. ~41.7 minutes
```

We suspected this might be a typo for `from_millis(2_500)` (2.5 seconds), which seemed more typical of an mDNS refresh interval. After making this change locally, the nodes connected successfully.
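To make the size of the difference concrete, here is a minimal standalone sketch comparing the two constructors. It uses only `std::time::Duration`; the helper `as_minutes` is a name introduced here for illustration and is not part of the EXO codebase.

```rust
use std::time::Duration;

/// Express a duration in minutes for easier reading.
/// (Illustrative helper, not taken from the EXO source.)
fn as_minutes(d: Duration) -> f64 {
    d.as_secs_f64() / 60.0
}

fn main() {
    // Value as it appears in the source we looked at.
    let as_written = Duration::from_secs(2_500);
    // Value we tried locally, assuming the unit was a typo.
    let suspected = Duration::from_millis(2_500);

    println!("from_secs(2_500)   = {:.1} minutes", as_minutes(as_written));
    println!("from_millis(2_500) = {:.1} seconds", suspected.as_secs_f64());
    // The two values differ by exactly the seconds-to-milliseconds factor.
    println!(
        "ratio              = {}x",
        as_written.as_millis() / suspected.as_millis()
    );
}
```

The two values differ by a factor of 1000, so any expiry or refresh logic driven by this constant would behave very differently depending on which unit was intended.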

## Questions

1. Is the 2500-second TTL intentional? It seems very long for dynamic peer discovery.
2. Could the long TTL leave stale peer information in place and break connection establishment?
3. Are there recommended network/firewall settings we should verify?
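For question 3, one check we can run ourselves is a plain TCP reachability probe between nodes, independent of mDNS. The sketch below is a rough diagnostic only. It assumes the peer connection runs over TCP, which we have not confirmed for EXO, and it takes the port and hostnames from the command line because we do not know which port EXO uses for peer traffic. `can_connect` is a name made up for this example.

```rust
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Return true if a TCP connection to `addr` ("host:port") succeeds
/// within `timeout` on any of the addresses the name resolves to.
/// (Hypothetical helper for this issue, not an EXO API.)
fn can_connect(addr: &str, timeout: Duration) -> bool {
    match addr.to_socket_addrs() {
        Ok(mut addrs) => addrs.any(|a| TcpStream::connect_timeout(&a, timeout).is_ok()),
        // Name resolution failed or the address was malformed.
        Err(_) => false,
    }
}

fn main() {
    // Usage: probe <port> <host>...
    // The port is whatever the peer connection actually listens on (unknown to us).
    let mut args = std::env::args().skip(1);
    let port: u16 = match args.next().and_then(|p| p.parse().ok()) {
        Some(p) => p,
        None => {
            eprintln!("usage: probe <port> <host>...");
            return;
        }
    };

    for host in args {
        let ok = can_connect(&format!("{host}:{port}"), Duration::from_secs(2));
        println!("{host}:{port} -> {}", if ok { "reachable" } else { "unreachable" });
    }
}
```

If every node is reachable from every other node on the relevant port, a firewall problem becomes less likely and the TTL theory more plausible. If the transport is not TCP, this probe tells us nothing.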

## Workaround

Building from source with the modified TTL value appeared to resolve the issue, but we want to understand whether this is the actual root cause or whether something else is happening.

## Logs

We are happy to provide debug logs if you can point us to how to enable verbose mDNS/networking logging.

---

Related to closed PR #1297. Opened as an issue per maintainer request to investigate further.
