dialmaster commented Feb 21, 2026
- Add pre-flight health check (write+delete probe with timeout) before each download to catch unresponsive or permission-denied output dirs before yt-dlp contaminates the archive
- Replace bare fs.move/ensureDir in post-processor with retry-enabled variants (exponential backoff) to ride out transient NFS errors
- Remove failed videos from yt-dlp archive so they are retried on the next scheduled run instead of appearing permanently "already downloaded"
- Replace synchronous statSync diagnostic with timeout-guarded async stat to avoid hanging on stale NFS mounts during error logging
- Add NFS mount configuration guidance to CONFIG.md
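The pre-flight probe and the timeout-guarded stat described in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `withTimeout`, `checkOutputDirHealth`, `statWithTimeout`, the probe file name, and the 5000 ms default are all hypothetical names and values chosen for the example.

```javascript
const fs = require('node:fs/promises');
const path = require('node:path');
const crypto = require('node:crypto');

// Reject if `promise` has not settled within `ms` milliseconds.
// This does not cancel the underlying I/O; it only stops waiting for it.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Clear the timer either way so it cannot keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical pre-flight probe: write a small file, then delete it.
// Rejects on permission errors, missing directories, or an unresponsive mount.
async function checkOutputDirHealth(outputDir, timeoutMs = 5000) {
  const probe = path.join(outputDir, `.healthcheck-${crypto.randomUUID()}`);
  const writeThenDelete = (async () => {
    await fs.writeFile(probe, 'healthcheck');
    await fs.unlink(probe);
  })();
  await withTimeout(writeThenDelete, timeoutMs, 'health check');
}

// Hypothetical async replacement for a blocking statSync diagnostic.
function statWithTimeout(filePath, timeoutMs = 5000) {
  return withTimeout(fs.stat(filePath), timeoutMs, 'stat');
}
```

A caller would run `await checkOutputDirHealth(dir)` before starting a download and skip the download if it rejects. Because the timeout only abandons the wait, a hung NFS operation may still occupy a worker thread in the background.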
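Removing failed videos from the yt-dlp archive might look something like the sketch below. It assumes the archive is a plain text file with one `<extractor> <videoId>` entry per line (for example `youtube abc123`); `filterArchive` and `removeFailedFromArchive` are hypothetical helper names, not functions from this PR.

```javascript
const fs = require('node:fs/promises');

// Pure helper: drop archive lines whose video ID is in `failedIds`.
// Assumes each line has the form "<extractor> <videoId>".
function filterArchive(archiveText, failedIds) {
  const failed = new Set(failedIds);
  return archiveText
    .split('\n')
    .filter((line) => {
      const id = line.trim().split(/\s+/)[1];
      // Blank or malformed lines have no ID and are kept untouched.
      return !failed.has(id);
    })
    .join('\n');
}

// Rewrite the archive only when something actually changed.
async function removeFailedFromArchive(archivePath, failedIds) {
  if (failedIds.length === 0) return;
  const original = await fs.readFile(archivePath, 'utf8');
  const updated = filterArchive(original, failedIds);
  if (updated !== original) {
    await fs.writeFile(archivePath, updated);
  }
}
```

With the failed IDs gone from the archive, the next scheduled run would no longer treat those videos as already downloaded.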
📊 Test Coverage Report
Backend Coverage
Frontend Coverage
Coverage Requirements
Coverage report generated for commit 8d6e6a1
PR Review: fix/stale-NFS-mounts

This is a well-motivated fix for a real data-integrity bug. The core approach (pre-flight health check + retry wrappers + archive cleanup) addresses the root causes systematically. Below are my findings, ordered by severity.

Issues

1. Health check file may be orphaned on NFS timeout (medium)

In

    await Promise.race([
      (async () => {
        await fsPromises.writeFile(testFile, 'healthcheck'); // succeeds
        await fsPromises.unlink(testFile); // hangs → timeout fires → orphan
      })(),
      timeoutPromise
    ]);

If the NFS mount is slow-but-not-dead, the background unlink will eventually complete. If it's permanently hung, the

Additionally,

2. Unreachable
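One possible way to address the orphaned-file concern raised in the review above is a best-effort cleanup after the timeout fires. The sketch below is only an illustration under assumptions: `probeWithCleanup` is a hypothetical name, and the `io` parameter is an invented seam that lets the file operations be swapped out for testing.

```javascript
const fsPromises = require('node:fs/promises');
const path = require('node:path');
const crypto = require('node:crypto');

// Hypothetical probe that tracks how far it got, so a timeout between
// writeFile and unlink can trigger a best-effort cleanup.
async function probeWithCleanup(outputDir, timeoutMs = 5000, io = fsPromises) {
  const testFile = path.join(outputDir, `.healthcheck-${crypto.randomUUID()}`);
  let written = false;
  let removed = false;
  let timer;
  const timeoutPromise = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('health check timed out')), timeoutMs);
  });
  try {
    await Promise.race([
      (async () => {
        await io.writeFile(testFile, 'healthcheck');
        written = true;
        await io.unlink(testFile);
        removed = true;
      })(),
      timeoutPromise,
    ]);
  } finally {
    clearTimeout(timer);
    if (written && !removed) {
      // Fire and forget. The first unlink may still be pending, so this
      // second call can fail with ENOENT; that error is deliberately ignored.
      io.unlink(testFile).catch(() => {});
    }
  }
}
```

On a permanently hung mount the cleanup call will hang as well, so this only helps when the mount is slow rather than dead. The random UUID in the file name keeps concurrent probes from colliding.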
- Health check: add best-effort cleanup for orphaned test files when unlink fails after writeFile succeeds (e.g. NFS timeout between the two operations). Use crypto.randomUUID() instead of Date.now() to avoid filename collisions.
- Health check: add guard for undefined outputDir to throw a clear error instead of a cryptic TypeError from path.join(undefined, ...).
- Retry loops: replace while+lastError pattern with for-loop in both ensureDirWithRetries and moveWithRetries to eliminate unreachable dead code (throw lastError after exhaustive loop).
- Archive cleanup: only remove videos from the yt-dlp archive when they were explicitly marked as failed during download, not when they merely lack file size (stat/waitForFile failure). Prevents spurious re-downloads when NFS lag causes stat to fail on files that actually exist on disk.

https://claude.ai/code/session_01XKqGTES8nh9U7xerGXHxjW
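The for-loop retry shape mentioned in the commit message could look roughly like this. It is a generic sketch rather than the actual ensureDirWithRetries or moveWithRetries code: `withRetries`, `retries`, and `baseDelayMs` are hypothetical names, and the defaults are arbitrary.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run `operation` up to `retries + 1` times with exponential backoff.
// The loop either returns a result or rethrows the final error from
// inside the catch block, so no statement after the loop is needed.
async function withRetries(operation, { retries = 3, baseDelayMs = 200 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (attempt >= retries) throw err;
      // Delay doubles each time: baseDelayMs, 2x, 4x, ...
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

Assuming fs-extra is in use, as the fs.move and ensureDir calls in the PR description suggest, a move could be wrapped as `await withRetries(() => fs.move(src, dest))`.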
…mtvr Fix issues raised in PR #440 review
Code Review

Overall this is solid work that addresses a real problem. The design decisions are sensible and the test coverage is good. A few issues worth discussing before merging:

Potential Bug: Double-unlink in