Skip to content

fix(orchestrator): ensure input tempfile cleanup on all exit paths#3888

Closed
chengyixu wants to merge 1 commit intolivepeer:masterfrom
chengyixu:fix/tempfile-cleanup-on-panic
Closed

fix(orchestrator): ensure input tempfile cleanup on all exit paths#3888
chengyixu wants to merge 1 commit intolivepeer:masterfrom
chengyixu:fix/tempfile-cleanup-on-panic

Conversation

@chengyixu
Copy link
Copy Markdown

Fixes #1716

/claim #1716

Problem

When Transcode() returns an UnrecoverableError, transcodeSeg panics (line 749). This panic bypasses the explicit os.Remove(fname) call at line 799, leaving *.tempfile input files on disk. Over time, this fills up the orchestrator's data directory.

The terr() error helper also cleaned up the file, but it is never reached on the panic path.

Solution

Move the input tempfile cleanup into a deferred function at the top of transcodeSeg so it executes on all exit paths:

  • Normal return (successful transcode)
  • Error return via terr()
  • Panic from UnrecoverableError

The existing keepInput flag (for preserving input when extremely large outputs are detected) is hoisted to function scope so the defer can check it.

Changes

  • core/orchestrator.go: Add defer cleanup for fnamep; remove redundant cleanup from terr() and the explicit os.Remove block
  • core/livepeernode_test.go: Add two regression tests:
    • TestTranscodeSegCleansTempfileOnUnrecoverableError — triggers panic path, asserts no leftover tempfile
    • TestTranscodeSegCleansTempfileOnNormalError — triggers normal error path, asserts no leftover tempfile

Testing

  • Added regression tests covering both panic and normal error cleanup paths
  • Existing TestTranscodeAndBroadcast continues to pass (deferred cleanup handles the normal path)
  • CI workflows should validate full test suite

Move the input *.tempfile cleanup into a deferred function so it runs on
all exit paths including panics from UnrecoverableError. Previously, when
Transcode() returned an UnrecoverableError the panic at line 749 bypassed
the explicit os.Remove call, causing tempfiles to accumulate in the
orchestrator's work directory over time.

Changes:
- Add deferred cleanup for fnamep that runs on normal return, error,
  and panic paths
- Preserve existing keepInput behavior for extremely large outputs
- Remove now-redundant cleanup from terr() helper and explicit os.Remove
- Add regression tests for both panic and normal error cleanup paths

Fixes livepeer#1716
@github-actions github-actions bot added the go Pull requests that update Go code label Mar 29, 2026
@j0sh
Copy link
Copy Markdown
Collaborator

j0sh commented Apr 1, 2026

We want to keep the tempfile on error for later inspection.

@j0sh j0sh closed this Apr 1, 2026
@github-project-automation github-project-automation bot moved this from Triage to Done in Engineering Roadmap Apr 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

go Pull requests that update Go code

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

.tempfile files not removed if there is a transcoding error

2 participants