@sbhavani sbhavani commented Jan 9, 2026

Description

The nvidia package installed by the PyPI CUDA wheels (e.g., nvidia-nccl-cu12/cu13) is a PEP 420 namespace package, so it has no usable __file__ attribute. This caused get_cuda_include_dirs() to fail with a TypeError when building with only NCCL installed from PyPI.

Use nvidia.__path__[0] instead, which works for both namespace packages and regular packages.
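
A minimal sketch of the idea, assuming a hypothetical helper named `package_root` (it is not part of the codebase) and a throwaway package called `demo_ns_pkg` created only for illustration. It resolves a package directory from `__path__` and raises the same kind of error the PR adds when nothing can be located:

```python
import importlib
import sys
import tempfile
from pathlib import Path


def package_root(module) -> Path:
    """Return the first directory on a package's __path__ (hypothetical helper)."""
    # list() only requires that __path__ is iterable, which holds for both
    # regular packages and namespace packages.
    search_path = list(getattr(module, "__path__", None) or [])
    if not search_path:
        raise RuntimeError(f"Could not locate {module.__name__} package directory.")
    return Path(search_path[0])


# Build a throwaway namespace package: a directory tree with no __init__.py.
with tempfile.TemporaryDirectory() as site_dir:
    (Path(site_dir) / "demo_ns_pkg" / "nccl" / "include").mkdir(parents=True)
    sys.path.insert(0, site_dir)
    importlib.invalidate_caches()
    try:
        demo = importlib.import_module("demo_ns_pkg")
        # A namespace package has no usable __file__ (it is None or missing).
        has_file = getattr(demo, "__file__", None) is not None
        root_name = package_root(demo).name
    finally:
        sys.path.remove(site_dir)
        sys.modules.pop("demo_ns_pkg", None)

print(has_file, root_name)
```

Reading `__file__` through `getattr(..., None)` covers both the missing-attribute case and the `None` case without branching on the Python version.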

Fixes: #2331

Type of change

  • Documentation change (change only to the documentation, either a fix or a new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Changes

Please list the changes introduced in this PR:

  • Fixed get_cuda_include_dirs() to handle namespace packages (which lack __file__)
  • Added clear error message when nvidia package directory cannot be located

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@sbhavani sbhavani force-pushed the fix/nccl-pypi-detection branch from 0f4e496 to 7237bea Compare January 9, 2026 00:51
The nvidia package from PyPI (e.g., nvidia-nccl-cu12/cu13) is a
PEP 420 namespace package that doesn't have __file__ attribute.
This caused get_cuda_include_dirs() to fail with TypeError when
building with only NCCL from PyPI.

Use nvidia.__path__[0] which works for both namespace packages
and regular packages.

Fixes: NVIDIA#2331
Signed-off-by: Santosh Bhavani <[email protected]>

@greptile-apps greptile-apps bot left a comment


Greptile Overview

Greptile Summary

This PR fixes a bug where get_cuda_include_dirs() would crash when the nvidia package is a PEP 420 namespace package (which doesn't have a __file__ attribute). This occurs when users install NCCL and other CUDA packages from PyPI.

Key Changes:

  • Replaced Path(nvidia.__file__).parent with Path(list(nvidia.__path__)[0])
  • Added proper error handling when the nvidia package directory cannot be located
  • Solution works for both regular packages (with __init__.py) and namespace packages (without __init__.py)

How it works:

  • All Python packages (both regular and namespace) have a __path__ attribute containing their directory locations
  • Regular packages also have __file__, but namespace packages do not
  • By using __path__[0], the code handles both package types uniformly
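
The difference between the two package types can be checked with a small experiment. This is only a sketch: `demo_regular` and `demo_namespace` are made-up package names created in a temporary directory, not anything shipped by NVIDIA.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as site_dir:
    site = Path(site_dir)
    # Regular package: a directory that contains __init__.py.
    (site / "demo_regular").mkdir()
    (site / "demo_regular" / "__init__.py").write_text("")
    # Namespace package: a directory without __init__.py.
    (site / "demo_namespace").mkdir()

    sys.path.insert(0, site_dir)
    importlib.invalidate_caches()
    try:
        summary = {}
        for name in ("demo_regular", "demo_namespace"):
            module = importlib.import_module(name)
            summary[name] = {
                # True only when __file__ exists and is not None.
                "file_usable": getattr(module, "__file__", None) is not None,
                # __path__ is available for both kinds of package.
                "root_dir": Path(list(module.__path__)[0]).name,
            }
    finally:
        sys.path.remove(site_dir)
        for name in ("demo_regular", "demo_namespace"):
            sys.modules.pop(name, None)

for name, info in summary.items():
    print(name, info)
```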

The fix is minimal, targeted, and maintains backward compatibility with existing installations.

Confidence Score: 5/5

  • Safe to merge - fixes a real bug without introducing new issues
  • The change correctly handles both namespace packages (PEP 420) and regular packages by using __path__[0] instead of __file__.parent. The truthiness check on __path__ prevents an IndexError from an empty list. The logic is sound and maintains backward compatibility.
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| build_tools/utils.py | 4/5 | Fixed namespace package handling by using __path__[0] instead of __file__.parent; handles both PEP 420 namespace packages and regular packages correctly |

Sequence Diagram

sequenceDiagram
    participant Builder as Build System
    participant Utils as get_cuda_include_dirs()
    participant Python as Python Import System
    participant FS as File System

    Builder->>Utils: Call to get CUDA headers
    Utils->>Utils: Check cuda_toolkit_include_path()
    alt CUDA toolkit found
        Utils-->>Builder: Return toolkit include paths
    else No toolkit, use PyPI packages
        Utils->>Python: import nvidia
        alt Import fails
            Python-->>Utils: ModuleNotFoundError
            Utils-->>Builder: RuntimeError("CUDA not found")
        else Import succeeds
            Python-->>Utils: nvidia module
            Utils->>Utils: Check hasattr(nvidia, "__path__")
            alt Has __path__ and not empty
                Utils->>Python: Access nvidia.__path__[0]
                Python-->>Utils: First path in __path__
                Utils->>Utils: cuda_root = Path(__path__[0])
                Utils->>FS: cuda_root.iterdir()
                FS-->>Utils: List of subdirectories
                Utils->>Utils: Filter for dirs with /include
                Utils-->>Builder: Return list of include paths
            else No __path__ or empty
                Utils-->>Builder: RuntimeError("Could not locate nvidia package directory")
            end
        end
    end
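
The flow in the diagram above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code in build_tools/utils.py: `find_include_dirs` and its parameters are hypothetical, the toolkit lookup is reduced to an optional argument, and `demo_cuda_wheels` is a fake package tree built just for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from typing import List, Optional


def find_include_dirs(toolkit_include: Optional[Path],
                      package_name: str = "nvidia") -> List[Path]:
    """Hypothetical stand-in for the diagrammed flow, not the project's code."""
    # 1. Prefer an installed CUDA toolkit when one was found.
    if toolkit_include is not None:
        return [toolkit_include]

    # 2. Otherwise fall back to headers shipped in PyPI wheels.
    try:
        package = importlib.import_module(package_name)
    except ModuleNotFoundError as exc:
        raise RuntimeError("CUDA not found.") from exc

    # 3. Resolve the package directory from __path__, never from __file__.
    search_path = list(getattr(package, "__path__", None) or [])
    if not search_path:
        raise RuntimeError(f"Could not locate {package_name} package directory.")
    package_root = Path(search_path[0])

    # 4. Keep each sub-package's include/ directory, if it has one.
    return sorted(
        child / "include"
        for child in package_root.iterdir()
        if (child / "include").is_dir()
    )


# Demonstration against a fake wheel layout (assumed, for illustration only).
with tempfile.TemporaryDirectory() as site_dir:
    root = Path(site_dir) / "demo_cuda_wheels"
    (root / "nccl" / "include").mkdir(parents=True)
    (root / "cudnn" / "include").mkdir(parents=True)
    (root / "no_headers").mkdir()
    sys.path.insert(0, site_dir)
    importlib.invalidate_caches()
    try:
        found = [p.parent.name for p in find_include_dirs(None, "demo_cuda_wheels")]
    finally:
        sys.path.remove(site_dir)
        sys.modules.pop("demo_cuda_wheels", None)

toolkit_result = find_include_dirs(Path("/opt/cuda/include"))
print(found, toolkit_result)
```

Passing the package name as a parameter is a choice made here so the sketch can be exercised without any NVIDIA wheels installed.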

@greptile-apps greptile-apps bot left a comment


Greptile Overview

Greptile Summary

This PR fixes a TypeError when building with PyPI CUDA packages (like nvidia-nccl-cu12/cu13) by handling PEP 420 namespace packages that lack the __file__ attribute. The fix correctly switches from using nvidia.__file__ to nvidia.__path__[0] in get_cuda_include_dirs().

Key Changes

  • Modified get_cuda_include_dirs() in build_tools/utils.py to use nvidia.__path__[0] instead of nvidia.__file__
  • Added proper attribute checking with hasattr(nvidia, "__path__") before accessing
  • Added clear error message when nvidia package directory cannot be located

Issues Found

  • Incomplete fix: transformer_engine/common/__init__.py at lines 249-252 has the same nvidia.__file__ pattern that will fail with namespace packages. Line 249 checks if nvidia.__file__ is None, but namespace packages raise AttributeError (attribute doesn't exist), not return None. Consider extending this fix to that location.
  • Minor optimization: list(nvidia.__path__)[0] can be simplified to nvidia.__path__[0] since __path__ is already indexable.

Confidence Score: 3/5

  • This PR is safe to merge with moderate confidence - it fixes the stated issue but leaves similar issues unaddressed
  • The fix is technically correct and solves the immediate problem with namespace packages in get_cuda_include_dirs(). However, the same pattern exists in transformer_engine/common/__init__.py which will still fail with namespace packages. Score of 3 reflects that this is a partial fix - the change works but doesn't address all instances of the problem in the codebase.
  • Consider reviewing transformer_engine/common/__init__.py lines 249-252 for the same namespace package issue

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| build_tools/utils.py | 3/5 | Fixed get_cuda_include_dirs() to handle PEP 420 namespace packages using __path__, but missed fixing a similar issue in transformer_engine/common/__init__.py and has a minor code optimization opportunity |

Sequence Diagram

sequenceDiagram
    participant Build as Build System
    participant Utils as build_tools/utils.py
    participant Nvidia as nvidia package
    participant FS as File System

    Build->>Utils: get_cuda_include_dirs()
    
    alt CUDA Toolkit Installed
        Utils->>Utils: cuda_toolkit_include_path()
        Utils-->>Build: [toolkit_include_path]
    else PyPI CUDA Packages
        Utils->>Nvidia: import nvidia
        alt ModuleNotFoundError
            Nvidia-->>Utils: Error
            Utils-->>Build: RuntimeError("CUDA not found.")
        else Success
            Nvidia-->>Utils: nvidia module
            
            alt Has __path__ attribute (NEW)
                Utils->>Nvidia: check hasattr(__path__) && __path__
                alt __path__ exists and not empty
                    Utils->>Nvidia: nvidia.__path__[0]
                    Nvidia-->>Utils: First package path
                    Utils->>FS: iterate cuda_root subdirs
                    FS-->>Utils: subdirectories
                    Utils->>Utils: filter dirs with include/
                    Utils-->>Build: [subdir/include, ...]
                else No __path__ or empty
                    Utils-->>Build: RuntimeError("Could not locate nvidia package directory.")
                end
            end
        end
    end

# Handle namespace packages (PEP 420) which don't have __file__
# The nvidia package from PyPI CUDA packages is a namespace package
if hasattr(nvidia, "__path__") and nvidia.__path__:
    cuda_root = Path(list(nvidia.__path__)[0])

Redundant list() conversion - nvidia.__path__ is already indexable, can use nvidia.__path__[0] directly

Suggested change
- cuda_root = Path(list(nvidia.__path__)[0])
+ cuda_root = Path(nvidia.__path__[0])
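
One caveat worth checking before dropping `list()`: the `__path__` of a namespace package is an importlib-internal path object rather than a plain list, so whether `[0]` works should be confirmed on the oldest Python version the build supports. The sketch below compares the access styles on a throwaway namespace package (`demo_index_ns` is a made-up name) and tolerates an interpreter where direct indexing is unavailable:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as site_dir:
    (Path(site_dir) / "demo_index_ns").mkdir()  # no __init__.py: namespace package
    sys.path.insert(0, site_dir)
    importlib.invalidate_caches()
    try:
        module = importlib.import_module("demo_index_ns")
        via_list = list(module.__path__)[0]      # needs only iteration
        via_iter = next(iter(module.__path__))   # same result, no intermediate list
        try:
            via_index = module.__path__[0]       # needs subscripting support
        except TypeError:
            via_index = None                     # path object is not subscriptable here
    finally:
        sys.path.remove(site_dir)
        sys.modules.pop("demo_index_ns", None)

print(via_list == via_iter, via_index in (None, via_list))
```

If all supported interpreters allow subscripting, the shorter form is equivalent; otherwise `list(...)[0]` or `next(iter(...))` is the safer spelling.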

Comment on lines +242 to +247
# Handle namespace packages (PEP 420) which don't have __file__
# The nvidia package from PyPI CUDA packages is a namespace package
if hasattr(nvidia, "__path__") and nvidia.__path__:
    cuda_root = Path(list(nvidia.__path__)[0])
else:
    raise RuntimeError("Could not locate nvidia package directory.")

This fix correctly handles namespace packages here, but transformer_engine/common/__init__.py lines 249-252 have the same issue with nvidia.__file__:

  • Line 249 checks if nvidia.__file__ is None, but namespace packages will raise AttributeError when accessing __file__ (the attribute doesn't exist, it doesn't return None)
  • Line 252 uses Path(nvidia.__file__).parent which will also fail with namespace packages

Consider applying the same fix pattern (__path__) to _nvidia_cudart_include_dir() in that file. Should this PR also fix the similar nvidia.__file__ usage at lines 249-252 of transformer_engine/common/__init__.py, or will that be addressed in a separate PR?
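
For discussion, a `__path__`-based lookup for the runtime headers might look like the sketch below. Everything here is an assumption: the body and return type of the real `_nvidia_cudart_include_dir()` are not shown in this PR, the function name `cudart_include_dir` is hypothetical, and the `cuda_runtime/include` layout is assumed rather than verified against the wheels.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from typing import Optional


def cudart_include_dir(package_name: str = "nvidia",
                       subpackage: str = "cuda_runtime") -> Optional[str]:
    """Hypothetical lookup; returns None when no headers are found."""
    try:
        package = importlib.import_module(package_name)
    except ModuleNotFoundError:
        return None
    # Walk every entry of __path__ so no access to __file__ is needed.
    for entry in getattr(package, "__path__", None) or []:
        include_dir = Path(entry) / subpackage / "include"
        if include_dir.is_dir():
            return str(include_dir)
    return None


# Demonstration against a fake package tree (assumed layout, illustration only).
with tempfile.TemporaryDirectory() as site_dir:
    (Path(site_dir) / "demo_rt_pkg" / "cuda_runtime" / "include").mkdir(parents=True)
    sys.path.insert(0, site_dir)
    importlib.invalidate_caches()
    try:
        located = cudart_include_dir("demo_rt_pkg")
        located_tail = Path(located).parts[-2:] if located else None
    finally:
        sys.path.remove(site_dir)
        sys.modules.pop("demo_rt_pkg", None)

missing = cudart_include_dir("demo_pkg_that_does_not_exist")
print(located_tail, missing)
```

Iterating over all `__path__` entries instead of taking only the first is a design choice of this sketch; it differs from the `[0]` approach used in this PR.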

Development

Successfully merging this pull request may close these issues.

Build Improvements - Find NCCL.h automatically from pypi nvidia-nccl-cu12/cu13