go/worker/storage/committee: Fix teardown #6444

Merged
martintomazic merged 1 commit into master from martin/fix/storage-commitee-worker-teardown
Feb 10, 2026
@martintomazic martintomazic commented Jan 18, 2026

https://github.com/oasisprotocol/internal-ops/issues/1317#issuecomment-3765815368 showed that, when storage is corrupted, the storage worker's teardown can get stuck.

How to test

Start your node with paratime configured and return a dummy error here.

Prior to this change, the indexer would continue, while the storage worker would get stuck at teardown.

It would be nice to have a test for this, but we would first need to completely refactor the storage worker. Mainly, the state DB, P2P, and other dependencies should be passed as parameters, so that errors can be mocked in the "integration" tests.

netlify bot commented Jan 18, 2026

Deploy Preview for oasisprotocol-oasis-core canceled.

Name Link
🔨 Latest commit 9663948
🔍 Latest deploy log https://app.netlify.com/projects/oasisprotocol-oasis-core/deploys/698a7591c04d1200083ffb2e

@martintomazic martintomazic force-pushed the martin/fix/storage-commitee-worker-teardown branch from c5b803a to 1eeb529 Compare January 18, 2026 22:52
@martintomazic martintomazic marked this pull request as ready for review January 19, 2026 08:58
@peternose
Collaborator

> Start your node with paratime configured and return a dummy error here.

Yes, any error returned in the last for loop can cause this problem.

@martintomazic martintomazic force-pushed the martin/fix/storage-commitee-worker-teardown branch from 1eeb529 to 3de9287 Compare January 28, 2026 13:48
@martintomazic martintomazic force-pushed the martin/fix/storage-commitee-worker-teardown branch from 3de9287 to d6d81b2 Compare February 9, 2026 10:11
Previously the fetch pool was closed first, which caused doneCh
to never be closed, which in turn caused wg.Wait to never return.

There is no need for the doneCh, as fetchPool.Stop already ensures
that all worker threads have stopped and that the queue of pending
tasks is emptied. In other words, we are guaranteed no side effects.
@martintomazic martintomazic force-pushed the martin/fix/storage-commitee-worker-teardown branch from d6d81b2 to 9663948 Compare February 10, 2026 00:02
@martintomazic martintomazic merged commit 081b777 into master Feb 10, 2026
5 checks passed
@martintomazic martintomazic deleted the martin/fix/storage-commitee-worker-teardown branch February 10, 2026 08:57