
CCCL Python Libraries
=====================

Overview
--------

The CUDA Core Compute Libraries (CCCL) for Python are a collection of modules with the shared goal of providing high-quality, high-performance, and easy-to-use abstractions for CUDA Python developers.

- :doc:`cuda.compute <compute>` — Composable device-level primitives for building custom parallel algorithms without writing CUDA kernels directly.
- :doc:`cuda.coop <coop>` — Cooperative block- and warp-level algorithms for writing highly efficient CUDA kernels with Numba CUDA.
- :doc:`cuda.stf <stf>` — Sequential Task Flow for CUDA: define logical data and tasks with read/write annotations, and STF orchestrates execution and data movement.

These libraries expose the generic, highly optimized algorithms from the CCCL C++ libraries, which are tuned to perform well across GPU architectures.

Who is this for?
----------------

- Library authors building parallel algorithms that need portable performance across GPU architectures, without dropping to CUDA C++.
- Application developers using PyTorch, CuPy, or other GPU-accelerated frameworks who need custom algorithms beyond what those libraries provide.

.. toctree::
   :maxdepth: 2
   :caption: CCCL Python Libraries

   setup
   compute
   coop
   stf
   resources
   api_reference