Summary
The translator proxy fails with a JobIdNotFound error when multiple SV1 miners connect simultaneously while it is running with aggregate_channels = true. The translator then disconnects from the upstream pool, fails over, and eventually shuts down.
Root Cause
In aggregated mode, when SetNewPrevHash is processed, the translator iterates over all channels in extended_channels and calls on_set_new_prev_hash() on each one:
// miner-apps/translator/src/lib/sv2/channel_manager/mining_message_handler.rs
for (_, channel) in channel_manager_data.extended_channels.iter() {
    let mut channel = channel.write()...;
    channel.on_set_new_prev_hash(m_static.clone())...; // Called on ALL channels
}

The on_set_new_prev_hash() method in channels_sv2::client::extended::ExtendedChannel removes the job from future_jobs and then clears all remaining future jobs:
// stratum/protocols/v2/channels-sv2/src/client/extended.rs
match self.future_jobs.remove(&set_new_prev_hash.job_id) {
    Some(mut activated_job) => { ... }
    None => { return Err(ExtendedChannelError::JobIdNotFound); }
}
self.future_jobs.clear(); // Clears ALL future jobs

The problem occurs with the following sequence:
- First OpenExtendedMiningChannelSuccess(channel_id=2) creates extended_channels[2]
- First NewExtendedMiningJob(job_id=1) stores job in extended_channels[2].future_jobs
- First SetNewPrevHash(job_id=1) activates job on channel_id=2, clearing future_jobs
- Second OpenExtendedMiningChannelSuccess(channel_id=3) creates extended_channels[3]
- Second NewExtendedMiningJob(job_id=1) stores job in ALL channels (2 and 3)
- Second SetNewPrevHash(job_id=1) iterates ALL channels:
  - Tries to activate job_id=1 on channel_id=2 → FAILS (job was just re-added but gets processed first)
  - Never gets to channel_id=3
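
The failure mode can be reproduced on a toy model. The sketch below is not the real channels_sv2 code: ToyChannel, ChannelError, broadcast_set_new_prev_hash and scenario are hypothetical names, jobs are plain strings, and a BTreeMap is used only to make the iteration order deterministic (channel 2 before channel 3). It also sets aside the question of exactly which channels receive the second job and simply starts from one channel that no longer holds job_id=1 and one that does.

```rust
use std::collections::{BTreeMap, HashMap};

#[derive(Debug, PartialEq)]
enum ChannelError {
    JobIdNotFound,
}

/// Toy stand-in for a client-side extended channel (hypothetical type).
#[derive(Default)]
struct ToyChannel {
    future_jobs: HashMap<u32, &'static str>,
    active_job: Option<&'static str>,
}

impl ToyChannel {
    /// Same shape as the quoted logic: activate the referenced future job,
    /// then clear all remaining future jobs; error if the job is unknown.
    fn on_set_new_prev_hash(&mut self, job_id: u32) -> Result<(), ChannelError> {
        match self.future_jobs.remove(&job_id) {
            Some(job) => self.active_job = Some(job),
            None => return Err(ChannelError::JobIdNotFound),
        }
        self.future_jobs.clear();
        Ok(())
    }
}

/// Aggregated-mode loop without any guard: the first error aborts the loop.
fn broadcast_set_new_prev_hash(
    channels: &mut BTreeMap<u32, ToyChannel>,
    job_id: u32,
) -> Result<(), ChannelError> {
    for channel in channels.values_mut() {
        channel.on_set_new_prev_hash(job_id)?;
    }
    Ok(())
}

/// Channel 2 has already activated job 1; channel 3 still holds it as a future job.
fn scenario() -> BTreeMap<u32, ToyChannel> {
    let mut ch2 = ToyChannel::default();
    ch2.future_jobs.insert(1, "job 1");
    ch2.on_set_new_prev_hash(1).expect("first activation succeeds");

    let mut ch3 = ToyChannel::default();
    ch3.future_jobs.insert(1, "job 1");

    let mut channels = BTreeMap::new();
    channels.insert(2, ch2);
    channels.insert(3, ch3);
    channels
}

fn main() {
    let mut channels = scenario();
    let result = broadcast_set_new_prev_hash(&mut channels, 1);
    println!("broadcast result: {:?}", result);
    println!("channel 3 active job: {:?}", channels[&3].active_job);
}
```

In this model the loop stops at channel 2, so channel 3 keeps its future job and never gets an active one, which matches the "never gets to channel_id=3" step above.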
Steps to Reproduce
- Configure translator with aggregate_channels = true
- Start pool with Template Provider
- Start translator connecting to pool
- Connect multiple SV1 miners simultaneously (e.g., 5-10 miners connecting within ~100ms)
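
If real miners are not at hand, the last step can be approximated with a small load script. The following is only a sketch under several assumptions: the translator's downstream listener address (127.0.0.1:34255 here) must be adjusted to the local config, the helper names subscribe_line and connect_burst are made up for illustration, and whether a bare mining.subscribe is enough to make the translator open an upstream channel has not been verified.

```rust
use std::io::Write;
use std::net::TcpStream;
use std::sync::{Arc, Barrier};
use std::thread;
use std::time::Duration;

/// Build one newline-terminated SV1 `mining.subscribe` request.
fn subscribe_line(id: u32) -> String {
    format!(
        "{{\"id\": {}, \"method\": \"mining.subscribe\", \"params\": []}}\n",
        id
    )
}

/// Open `miners` TCP connections to `addr` as close together as possible.
/// Returns how many connections were established and sent a subscribe.
fn connect_burst(addr: &str, miners: usize) -> usize {
    let barrier = Arc::new(Barrier::new(miners));
    let handles: Vec<_> = (0..miners)
        .map(|i| {
            let barrier = Arc::clone(&barrier);
            let addr = addr.to_string();
            thread::spawn(move || {
                // Release all threads at once so the connects land together.
                barrier.wait();
                let mut stream = match TcpStream::connect(&addr) {
                    Ok(stream) => stream,
                    Err(_) => return false,
                };
                let sent = stream
                    .write_all(subscribe_line(i as u32 + 1).as_bytes())
                    .is_ok();
                // Keep the socket open briefly so the translator can react.
                thread::sleep(Duration::from_secs(2));
                sent
            })
        })
        .collect();
    handles
        .into_iter()
        .filter_map(|handle| handle.join().ok())
        .filter(|sent| *sent)
        .count()
}

fn main() {
    // Assumed downstream address of the translator; change as needed.
    let addr = "127.0.0.1:34255";
    let connected = connect_burst(addr, 10);
    println!("{} of 10 simulated miners connected to {}", connected, addr);
}
```

The Barrier is what keeps the connection attempts inside a narrow window; without it, thread start-up jitter spreads them out.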
Expected Behavior
All miners should connect successfully and receive mining jobs.
Actual Behavior
2026-01-29T16:25:19.528350Z INFO Received: SetNewPrevHash(channel_id=3, job_id=1, ...)
2026-01-29T16:25:19.528422Z ERROR Failed to set new prev hash: JobIdNotFound
2026-01-29T16:25:19.528514Z WARN Upstream connection dropped: FailedToProcessSetNewPrevHash
The translator disconnects and eventually shuts down after exhausting retry attempts.
Proposed Fix
Before calling on_set_new_prev_hash() on each channel, check if the channel actually has the referenced job in its future_jobs. Skip channels that don't have the job:
for (_, channel) in channel_manager_data.extended_channels.iter() {
    let mut channel = channel.write()...;
    // Skip channels that don't have this job as a future job
    if !channel.get_future_jobs().contains_key(&m_static.job_id) {
        continue;
    }
    channel.on_set_new_prev_hash(m_static.clone())...;
}

Environment
- sv2-apps: main branch
- stratum-core: main branch
- Configuration: aggregate_channels = true
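
As a sanity check of the guard from the Proposed Fix section, the same idea can be run against a toy model. Everything here is hypothetical and simplified: ToyChannel, ChannelError, broadcast_with_guard and scenario are illustration-only names, jobs are plain strings, the future-job map is accessed as a field rather than through get_future_jobs(), and a BTreeMap stands in for the channel map to keep the iteration order fixed.

```rust
use std::collections::{BTreeMap, HashMap};

#[derive(Debug, PartialEq)]
enum ChannelError {
    JobIdNotFound,
}

/// Toy stand-in for a client-side extended channel (hypothetical type).
#[derive(Default)]
struct ToyChannel {
    future_jobs: HashMap<u32, &'static str>,
    active_job: Option<&'static str>,
}

impl ToyChannel {
    /// Activate the referenced future job, then clear the remaining ones.
    fn on_set_new_prev_hash(&mut self, job_id: u32) -> Result<(), ChannelError> {
        let job = self
            .future_jobs
            .remove(&job_id)
            .ok_or(ChannelError::JobIdNotFound)?;
        self.active_job = Some(job);
        self.future_jobs.clear();
        Ok(())
    }
}

/// Aggregated-mode loop with the guard: channels that do not hold the
/// job as a future job are skipped. Returns how many channels activated it.
fn broadcast_with_guard(
    channels: &mut BTreeMap<u32, ToyChannel>,
    job_id: u32,
) -> Result<usize, ChannelError> {
    let mut activated = 0;
    for channel in channels.values_mut() {
        if !channel.future_jobs.contains_key(&job_id) {
            continue;
        }
        channel.on_set_new_prev_hash(job_id)?;
        activated += 1;
    }
    Ok(activated)
}

/// Channel 2 already activated its job earlier; channel 3 holds job 1 as a future job.
fn scenario() -> BTreeMap<u32, ToyChannel> {
    let mut ch2 = ToyChannel::default();
    ch2.active_job = Some("job 1 (first activation)");

    let mut ch3 = ToyChannel::default();
    ch3.future_jobs.insert(1, "job 1");

    let mut channels = BTreeMap::new();
    channels.insert(2, ch2);
    channels.insert(3, ch3);
    channels
}

fn main() {
    let mut channels = scenario();
    let result = broadcast_with_guard(&mut channels, 1);
    println!("broadcast result: {:?}", result);
    println!("channel 2 active job: {:?}", channels[&2].active_job);
    println!("channel 3 active job: {:?}", channels[&3].active_job);
}
```

In the toy model the guard lets the loop reach channel 3, while channel 2 is left untouched. Whether a skipped channel also needs to learn about the new prev hash in the real translator is a separate question that this sketch does not answer.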
Logs
Log 1: Multiple miners connecting simultaneously (10 miners)
2026-01-29T16:25:17.108993Z INFO translator_sv2: Starting Translator Proxy...
2026-01-29T16:25:18.111103Z INFO translator_sv2::sv2::upstream::upstream: Connected to upstream at 127.0.0.1:3336
2026-01-29T16:25:19.522888Z INFO translator_sv2::sv1::sv1_server::sv1_server: New SV1 downstream connection from 10.30.76.55:38518
2026-01-29T16:25:19.523029Z INFO translator_sv2::sv1::sv1_server::sv1_server: Downstream 1 registered successfully
2026-01-29T16:25:19.523109Z INFO translator_sv2::sv1::sv1_server::sv1_server: New SV1 downstream connection from 10.30.207.44:57182
2026-01-29T16:25:19.523181Z INFO translator_sv2::sv1::sv1_server::sv1_server: Downstream 2 registered successfully
...
2026-01-29T16:25:19.526184Z INFO Received: OpenExtendedMiningChannelSuccess(request_id: 1, channel_id: 2, ...)
2026-01-29T16:25:19.527239Z INFO Received: NewExtendedMiningJob(channel_id: 2, job_id: 1, ...)
2026-01-29T16:25:19.527519Z INFO Received: SetNewPrevHash(channel_id=2, job_id=1, ...)
2026-01-29T16:25:19.527655Z INFO Received: OpenExtendedMiningChannelSuccess(request_id: 2, channel_id: 3, ...)
2026-01-29T16:25:19.528017Z INFO Received: NewExtendedMiningJob(channel_id: 3, job_id: 1, ...)
2026-01-29T16:25:19.528350Z INFO Received: SetNewPrevHash(channel_id=3, job_id=1, ...)
2026-01-29T16:25:19.528422Z ERROR Failed to set new prev hash: JobIdNotFound
2026-01-29T16:25:19.528514Z WARN Upstream connection dropped: FailedToProcessSetNewPrevHash
2026-01-29T16:25:19.529520Z ERROR All upstreams failed after 3 retries each
Log 2: Similar failure with 6 miners
2026-01-29T16:06:52.129312Z INFO translator_sv2: Starting Translator Proxy...
2026-01-29T16:06:53.259420Z INFO Connected to upstream at 127.0.0.1:3336
2026-01-29T16:06:56.492618Z INFO New SV1 downstream connection from 10.30.76.59:33732
2026-01-29T16:06:56.492774Z INFO Downstream 1 registered successfully
...
2026-01-29T16:06:56.495542Z INFO Received: OpenExtendedMiningChannelSuccess(request_id: 1, channel_id: 2, ...)
2026-01-29T16:06:56.495674Z INFO Received: NewExtendedMiningJob(channel_id: 2, job_id: 1, ...)
2026-01-29T16:06:56.496138Z INFO Received: SetNewPrevHash(channel_id=2, job_id=1, ...)
2026-01-29T16:06:56.496334Z ERROR Failed to set new prev hash: JobIdNotFound
2026-01-29T16:06:56.496479Z WARN Upstream connection dropped: FailedToProcessSetNewPrevHash
2026-01-29T16:06:56.497513Z ERROR All upstreams failed after 3 retries each