Skip to content

Reduce memory footprint for allreduce8 and allgather6#644

Merged
Binyang2014 merged 2 commits intomainfrom
binyli/nccl-fix
Oct 3, 2025
Merged

Reduce memory footprint for allreduce8 and allgather6#644
Binyang2014 merged 2 commits intomainfrom
binyli/nccl-fix

Conversation

@Binyang2014
Copy link
Contributor

Reduce memory footprint for allreduce8 and allgather6
Remove libLoadSucceed check for group start

@Binyang2014 Binyang2014 marked this pull request as ready for review October 3, 2025 20:59
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR aims to reduce memory footprint for allreduce8 and allgather6 algorithms by optimizing semaphore allocation patterns. The changes move from creating new semaphores for each operation to reusing pre-allocated semaphores.

  • Separates input and output semaphores in Allreduce8 to avoid unnecessary allocations
  • Caches memory semaphores in AllgatherAlgo6 to prevent repeated allocations
  • Removes error checking for library loading in ncclGroupStart to simplify the flow

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
apps/nccl/src/nccl.cu Simplifies ncclGroupStart by removing library load error checking
apps/nccl/src/allreduce.hpp Splits deviceSemaphores into separate input/output semaphore vectors
apps/nccl/src/allreduce.cu Updates semaphore usage to use separate input/output semaphores and reuse output semaphores
apps/nccl/src/allgather.hpp Adds cached memory semaphores and constant for channels per connection
apps/nccl/src/allgather.cu Implements semaphore caching to avoid repeated allocations

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Contributor

@caiomcbr caiomcbr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Binyang2014 Binyang2014 merged commit da84c6e into main Oct 3, 2025
14 checks passed
@Binyang2014 Binyang2014 deleted the binyli/nccl-fix branch October 3, 2025 22:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants