Contributing to NCCL

Thank you for your interest in contributing to NCCL! We appreciate the time and effort you're putting into helping improve the library. This document will help guide you through the contribution process.

Before You Start

To help ensure your contribution can be accepted smoothly:

Open an issue first for significant changes or new features. This allows us to discuss the approach before you invest significant time.
Check existing issues to see if someone else is already working on something similar.
Ask questions if you're unsure about anything. We're happy to help!

Getting Started

Fork the Repository: Fork https://github.com/NVIDIA/nccl and clone your fork locally.
Create a Branch: Create a feature branch for your work.
```
git checkout -b <my-feature-branch>
```

Commit your Changes: Commit and push into feature branch

git add <files-to-commit>
git commit --message <descriptive message, see below>
git push origin <my-feature-branch> --set-upstream

What We're Looking For

We welcome contributions in the following areas:

Bug fixes: Reproducible bugs with clear fixes are always appreciated
Performance improvements: Targeted optimizations, guarded by appropriate checks
Platform support: Extending NCCL to new platforms, network fabrics, or PCI bridges
Documentation: Improvements to README, comments, or usage examples

Making Your Contribution Successful

To help us review and integrate your contribution efficiently:

Scope and Focus

Keep changes focused on a single issue or feature
Break large changes into smaller, reviewable pieces when possible
Avoid mixing unrelated changes (e.g., bug fix + formatting changes) in one PR

Quality Standards

Ensure your code compiles without warnings
Test thoroughly on relevant hardware configurations
Performance claims should be backed by measurements
Public API changes require discussion (see below)

Common Challenges

Some types of contributions require extra discussion before we can accept them:

Public API changes: NCCL's API stability is important to users. If your contribution changes public APIs, please open an issue to discuss the design first. We need to ensure backward compatibility and consistency.
Architecture-specific code: NCCL supports specific GPU architectures and network configurations. Contributions targeting unsupported architectures may not be accepted unless there's a clear plan for ongoing maintenance.
Workarounds for external issues: If a change works around a problem in another component (drivers, libraries, frameworks), we may need to address it differently. Let's discuss the root cause first.
Incomplete implementations: Partial features or fixes that don't fully address the problem are difficult to maintain. If you need help completing an implementation, let us know - we're happy to assist!

New Features

If you're proposing a substantial new feature (e.g., new collective operations, transport mechanisms, algorithms, or significant architectural changes), we follow a more structured process:

Open an issue describing the feature and use case
Discussion phase: We'll discuss whether it fits NCCL's direction
Document Design: Document the proposed architecture, a testing and validation plan as well as any known performance impact.
Review: The design will be reviewed by NCCL maintainers
Implementation: We recommend that you wait with the implementation until the review has concluded. Once there's a mutual agreement on the approach, proceed with implementation
Pull request: Submit your PR referencing the original issue and design doc

Code Style

NCCL follows these coding conventions:

General Guidelines

Use 2 spaces for indentation (no tabs)
Maximum line length: 100 characters
Follow K&R brace style for C/CUDA code
Use clear, descriptive variable names

Naming Conventions

Functions and variables: ncclCamelCase
Macros and constants: UPPER_CASE_WITH_UNDERSCORES
Struct/type names: ncclFooBar

CUDA-Specific

Minimize register usage in performance-critical kernels
Avoid warp divergence where possible
Document kernel launch configurations and occupancy considerations
Prefer cudaLaunchKernel over <<<>>> syntax

Return Codes

NCCL functions should return a ncclResult_t
All function calls should be guarded by NCCLCHECK or similar macro
All external function calls should be guarded with CUDACHECK, SYSCHECK, PTHREADCHECK, etc.

Memory Management

Always allocate memory through NCCL alloc functions, e.g. ncclCalloc()
Use appropriate memory fences for synchronization
Free resources in reverse order of allocation

Comments and Documentation

Write clear comments for complex algorithms
Document assumptions and invariants
Explain "why" not just "what" for non-obvious code

Code Formatting

Avoid trailing white-spaces
Try to match the existing style in files you're modifying

Pull Request Guidelines

Before Submitting

Rebase your branch on the latest master branch
Run make to ensure clean build with no warnings

Commit Messages

Use imperative mood: "Add feature" not "Added feature"
First line: brief summary (50 chars or less)
Second line blank
Detailed explanation including the Problem, Solution, and any Limitations
Reference issue numbers: "Fixes #123"

An example commit message is:

Comment          | Commit message
-----------------|--------------------------------------------------------
Title            | Fix crash in proxy on systems with more than 2 GPUs
Blank line       | 
Problem          | When a system has more than 2 GPUs, the table we use to
                 | store addresses overflows.
Solution         | This fix increases the size of the table to the maximum
                 | number of GPUs we can have within a node.
Limitations and  | We may want to make the table size dynamic to save
Caveats          | memory, but since it is a part of a struct, it is easier
                 | for now to keep the code simple.

Signed Commits

All commits must be signed off to certify you have the right to submit the code:

git commit -s -m "Your commit message"

This adds: Signed-off-by: Your Name <your@email.com>

This certifies your agreement with the Developer Certificate of Origin (DCO).

Code Review Process

After you submit a PR:

Maintainer review: NCCL engineers will review your code
Discussion: We may ask questions or request changes
Iteration: Address feedback and update your PR
Approval: Once approved, we'll merge your contribution

Please be patient during review. We aim to provide initial feedback within a week. Feel free to ping us if we take much longer than that.

Communication

GitHub Issues: For bug reports and feature requests
Pull Requests: For code contributions and discussion

Getting Help

If you need help at any stage:

Comment on your issue or PR
Ask questions in GitHub Issues
Refer to NCCL documentation and examples

We're committed to helping contributors succeed!

Thank You

We truly appreciate your time and effort in contributing to NCCL.

Happy coding!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Contributing to NCCL

Before You Start

Getting Started

What We're Looking For

Making Your Contribution Successful

Scope and Focus

Quality Standards

Common Challenges

New Features

Code Style

General Guidelines

Naming Conventions

CUDA-Specific

Return Codes

Memory Management

Comments and Documentation

Code Formatting

Pull Request Guidelines

Before Submitting

Commit Messages

Signed Commits

Code Review Process

Communication

Getting Help

Thank You

FilesExpand file tree

CONTRIBUTING.md

Latest commit

History

CONTRIBUTING.md

File metadata and controls

Contributing to NCCL

Before You Start

Getting Started

What We're Looking For

Making Your Contribution Successful

Scope and Focus

Quality Standards

Common Challenges

New Features

Code Style

General Guidelines

Naming Conventions

CUDA-Specific

Return Codes

Memory Management

Comments and Documentation

Code Formatting

Pull Request Guidelines

Before Submitting

Commit Messages

Signed Commits

Code Review Process

Communication

Getting Help

Thank You