otelcol: make Shutdown block until Run completes cleanup#14596
otelcol: make Shutdown block until Run completes cleanup#14596RoryGlenn wants to merge 1 commit intoopen-telemetry:mainfrom
Conversation
Make Collector.Shutdown block until Run has finished all cleanup, following the http.Server pattern described in issue open-telemetry#4947. The three required behaviors are now implemented: 1. Shutdown is safe to call at any point (already handled by sync.Once) 2. If Shutdown is called before Run, it returns immediately 3. If Shutdown is called while Run is active, it blocks until Run has confirmed shutdown and the state reaches StateClosed Implementation uses a sync.WaitGroup incremented at Run entry and decremented via defer when Run exits. Shutdown waits on the WaitGroup after closing the shutdown channel. Fixes open-telemetry#4947 Assisted-by: Claude Opus 4.6
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a908109aa2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // Consecutive calls to Run are not allowed, Run shouldn't be called once a collector is shut down. | ||
| // Sets up the control logic for config reloading and shutdown. | ||
| func (col *Collector) Run(ctx context.Context) error { | ||
| col.runWG.Add(1) |
There was a problem hiding this comment.
Register Run with wait group before callers can Wait
Shutdown can miss an in-flight startup when Run is launched asynchronously (for example go col.Run(ctx) followed immediately by col.Shutdown()), because runWG.Add(1) is executed inside Run after scheduling. In that interleaving, Shutdown observes a zero counter and returns before Run finishes startup/shutdown cleanup, so the new “Shutdown blocks until Run completes” contract is not guaranteed for callers that start Run in a goroutine.
Useful? React with 👍 / 👎.
Description
Make
Collector.Shutdownblock untilRunhas finished all cleanup, following thehttp.Serverpattern.Link to tracking Issue: #4947
Changes
The three required behaviors from the issue are now implemented:
sync.Onceon the shutdown channel (unchanged)sync.WaitGroupcounter is 0, soWait()returns immediatelyShutdownblocks untilRuncompletes all cleanup and the state reachesStateClosedImplementation
Added a
sync.WaitGroup(runWG) to theCollectorstruct:Run()callsrunWG.Add(1)at entry anddefer runWG.Done()to signal completionShutdown()callsrunWG.Wait()after closing the shutdown channel, blocking untilRunfinishesTesting
TestCollectorShutdownBlocksUntilRunComplete— verifiesShutdown()blocks and state isStateClosedwhen it returnsTestCollectorShutdownBeforeRun— verifiesShutdown()returns immediately when called beforeRunTestCollectorShutdownCalledTwiceBlocks— verifies callingShutdown()twice works correctlyTestCollectorStateAfterConfigChange— callsShutdown()in a goroutine to avoid deadlock with the new blocking behaviorAssisted-by: Claude Opus 4.6