Fix deadlock in ResumeTransfers, mmap safety in ScheduleTransfers #3410

Open
pbanakar-microsoft wants to merge 2 commits into mover/c2c-stage from users/pbanakar/resumeJobTest
pbanakar-microsoft (Collaborator) commented Mar 16, 2026

Bug fixes for the resume path in the azcopy engine:

  • Deadlock fix: ResumeTransfers now takes a read lock (via buildmode.IsMover) instead of a write lock while iterating jobPartMgrs to queue parts. The write lock caused a deadlock once partsChannel filled up: ResumeTransfers holds the write lock while blocked on the full channel -> reportJobPartDoneHandler waits for a read lock -> scheduleJobParts blocks sending on the unbuffered jobPartProgress.
  • Mmap safety: ScheduleTransfers now caches isFinalPart and the transfer count (cachedNumTransfers) before the transfer loop, so the loop no longer reads memory-mapped plan data that may be unmapped during iteration.
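
For reviewers who want to see the lock cycle in isolation, here is a minimal sketch. It is not the azcopy code: `jobMgr`, `resume`, `worker`, and `completes` are hypothetical stand-ins, the chain through reportJobPartDoneHandler and jobPartProgress is collapsed into a single consumer that needs the read lock after taking a part, and a plain bool replaces the buildmode.IsMover check.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// jobMgr is a hypothetical stand-in for the job manager, not azcopy's type.
type jobMgr struct {
	mu    sync.RWMutex
	parts []int    // stand-in for jobPartMgrs
	queue chan int // stand-in for the bounded partsChannel
}

// resume queues every part while holding either the write or the read lock.
func (j *jobMgr) resume(useWriteLock bool) {
	if useWriteLock {
		j.mu.Lock()
		defer j.mu.Unlock()
	} else {
		j.mu.RLock()
		defer j.mu.RUnlock()
	}
	for _, p := range j.parts {
		j.queue <- p // blocks once the queue is full
	}
}

// worker takes n parts off the queue; finishing each one needs the read
// lock, which models the done-handler in the PR description.
func (j *jobMgr) worker(n int, done chan<- struct{}) {
	for i := 0; i < n; i++ {
		<-j.queue
		j.mu.RLock()
		_ = len(j.parts) // read shared state under the read lock
		j.mu.RUnlock()
	}
	close(done)
}

// completes reports whether all parts were processed before the timeout.
// In the stalled case the two goroutines stay blocked; acceptable in a demo.
func completes(useWriteLock bool) bool {
	j := &jobMgr{parts: []int{1, 2, 3, 4}, queue: make(chan int, 1)}
	done := make(chan struct{})
	go j.resume(useWriteLock)
	go j.worker(len(j.parts), done)
	select {
	case <-done:
		return true
	case <-time.After(200 * time.Millisecond):
		return false
	}
}

func main() {
	fmt.Println("write lock completes:", completes(true))
	fmt.Println("read lock completes:", completes(false))
}
```

With the write lock, the worker takes the first part and then waits for a read lock that the producer cannot release, because the producer is itself stuck on the full queue. With the read lock both sides hold the lock in shared mode, so the queue keeps draining.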
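
The caching pattern from the second bullet can be sketched the same way. Everything here is assumed for illustration: `planView`, its 5-byte layout, `unmap`, and `schedule` are hypothetical, and a nil slice stands in for an unmapped region (a real unmapped mmap would fault rather than panic on a bounds check).

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// planView is a hypothetical view over memory-mapped plan bytes.
// Assumed layout: bytes 0..3 = transfer count (little endian),
// byte 4 = 1 if this is the final part.
type planView struct {
	data []byte // nil once "unmapped"
}

func (p *planView) numTransfers() uint32 { return binary.LittleEndian.Uint32(p.data[0:4]) }
func (p *planView) isFinalPart() bool    { return p.data[4] == 1 }
func (p *planView) unmap()               { p.data = nil }

// schedule reads everything it needs from the mapped region once, up front,
// so the loop never touches p.data even if a callback unmaps it.
func schedule(p *planView, onTransfer func(i uint32)) (scheduled uint32, final bool) {
	cachedNumTransfers := p.numTransfers()
	isFinalPart := p.isFinalPart()
	for i := uint32(0); i < cachedNumTransfers; i++ {
		onTransfer(i) // may unmap the plan as a side effect
		scheduled++
	}
	return scheduled, isFinalPart
}

func main() {
	p := &planView{data: []byte{3, 0, 0, 0, 1}}
	n, final := schedule(p, func(i uint32) {
		if i == 0 {
			p.unmap() // simulate the plan being unmapped mid-iteration
		}
	})
	fmt.Println(n, final)
}
```

Had the loop condition called `p.numTransfers()` on every iteration, the second iteration would have read from the released region.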

pbanakar-microsoft force-pushed the users/pbanakar/resumeJobTest branch from f665449 to 48fed89 on March 25, 2026 21:44
pbanakar-microsoft changed the title from "Fix deadlock in ResumeTransfers, mmap safety in ScheduleTransfers, resume logging" to "Fix deadlock in ResumeTransfers, mmap safety in ScheduleTransfers" on Mar 25, 2026
