Commit fee7b94

Merge pull request #1226 from jaimergp/cuda-docs
document CUDA builds
2 parents 67f0f37 + 262583f commit fee7b94

2 files changed: +123 -0 lines changed

src/maintainer/conda_forge_yml.rst

Lines changed: 2 additions & 0 deletions
@@ -61,6 +61,8 @@ modified. Tools like conda-smithy may modify this, as need. It has a single
  secure:
    BINSTAR_TOKEN: <some big hash>

.. _azure-config:

azure
-----
This dictates the behavior of the Azure Pipelines CI service. It is a

src/maintainer/knowledge_base.rst

Lines changed: 121 additions & 0 deletions
@@ -1136,3 +1136,124 @@ key with ``docker_image``. Also ``cdt_name`` ensures the CDTs match the CentOS
version. If this changes in the future, then this extra key may not be needed.

Finally, note that the ``aarch64`` and ``ppc64le`` platforms already use CentOS 7.

.. _cuda:

CUDA builds
===========

Although the provisioned CI machines do not feature a GPU, Conda-Forge does provide mechanisms
to build CUDA-enabled packages. These mechanisms involve several packages:

* ``cudatoolkit``: The runtime libraries for the CUDA toolkit. This is what end users will end
  up installing next to your package.

* ``nvcc``: Nvidia's EULA does not allow the redistribution of compilers and drivers. Instead, we
  provide a wrapper package that locates the CUDA installation in the system. The main role of this
  package is to set some environment variables (``CUDA_HOME``, ``CUDA_PATH``, ``CFLAGS`` and others)
  and to wrap the real ``nvcc`` executable so that it is called with some extra command-line arguments.

In practice, to enable CUDA in your package, add ``{{ compiler('cuda') }}`` to the ``build``
section of your requirements and rerender. The matching ``cudatoolkit`` will be added to the ``run``
requirements automatically.
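
As a rough illustration of the paragraph above, the ``requirements`` section of a recipe's
``meta.yaml`` might look like the sketch below. The C and C++ compiler entries are assumptions
added for context; only the ``{{ compiler('cuda') }}`` line comes from the instructions above.

```yaml
requirements:
  build:
    # Assumed: a typical CUDA project also needs host C/C++ compilers.
    - {{ compiler('c') }}
    - {{ compiler('cxx') }}
    # The CUDA compiler wrapper described above.
    - {{ compiler('cuda') }}
  # No explicit cudatoolkit entry is listed under "run": per the text above,
  # the matching version is added automatically.
```

After editing the recipe, rerender the feedstock so that the CI configuration picks up the change.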

On Linux, CMake users are required to use ``${CMAKE_ARGS}`` so CMake can find CUDA correctly. For example::

  mkdir build && cd build
  cmake ${CMAKE_ARGS} ${SRC_DIR}
  make


.. note::

  **How is CUDA provided at the system level?**

  * On Linux, Nvidia provides official Docker images, which we then
    `adapt <https://github.com/conda-forge/docker-images>`_ to Conda-Forge's needs.

  * On Windows, the compilers need to be installed for every CI run. This is done through the
    `conda-forge-ci-setup <https://github.com/conda-forge/conda-forge-ci-setup-feedstock/>`_ scripts.
    Note that the Nvidia executable won't install the drivers because no GPU is present on the machine.

  **How is cudatoolkit selected at install time?**

  Conda exposes the maximum CUDA version supported by the installed Nvidia drivers through a virtual package
  named ``__cuda``. By default, ``conda`` will install the highest version available
  for the packages involved. To override this behaviour, you can define a ``CONDA_OVERRIDE_CUDA`` environment
  variable. More details are available in the
  `Conda docs <https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-virtual.html#overriding-detected-packages>`_.

  Note that prior to v4.8.4, ``__cuda`` versions would not be part of the constraints, so you would always
  get the latest one, regardless of the supported CUDA version.

  If for some reason you want to install a specific version, you can use::

    conda install your-gpu-package cudatoolkit=10.1

Testing the packages
--------------------

Since the CI machines do not feature a GPU, you won't be able to test the built packages as part
of the conda recipe. That does not mean you can't test your package locally. To do so:

1. Enable the Azure artifacts for your feedstock (see :ref:`here <azure-config>`).
2. Include the test files and requirements in the recipe
   `like this <https://github.com/conda-forge/cupy-feedstock/blob/a1e9cdf47775f90d3153a26913068c6df942d54b/recipe/meta.yaml#L51-L61>`_.
3. Provide the test instructions. Take into account that the GPU tests will fail in the CI run,
   so you need to ignore them to get the package built and uploaded as an artifact.
   `Example <https://github.com/conda-forge/cupy-feedstock/blob/a1e9cdf47775f90d3153a26913068c6df942d54b/recipe/run_test.py>`_.
4. Once you have downloaded the artifacts, you will be able to run::

     conda build --test <pkg file>.tar.bz2
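
Step 2 above can be sketched as follows. This is a hypothetical ``test`` section for ``meta.yaml``,
not the one used by the linked feedstock: the ``pytest`` requirement and the ``tests`` directory are
placeholder assumptions. ``requires`` and ``source_files`` are standard conda-build keys for
test-only dependencies and for files copied from the source tree, respectively.

```yaml
test:
  requires:
    # Assumed test runner; list whatever your own test suite needs.
    - pytest
  source_files:
    # Assumed location of the tests in the source tree; these files are
    # copied from the source so the test phase can use them.
    - tests
```

The test commands themselves (step 3) still need to tolerate the missing GPU on the CI machines.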


Common problems and known issues
--------------------------------

``nvcuda.dll`` cannot be found on Windows
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The `scripts <https://github.com/conda-forge/conda-forge-ci-setup-feedstock/blob/master/recipe/install_cuda.bat>`_
used to install the CUDA Toolkit on Windows cannot provide ``nvcuda.dll``
as part of the installation because no GPU is physically present in the CI machines.
As a result, you might get linking errors in the postprocessing steps of ``conda build``::

  WARNING (arrow-cpp,Library/bin/arrow_cuda.dll): $RPATH/nvcuda.dll not found in packages,
  sysroot(s) nor the missing_dso_whitelist.

.. is this binary repackaging?

For now, you will have to add ``nvcuda.dll`` to the ``missing_dso_whitelist``::

  build:
    ...
    missing_dso_whitelist:
      - "*/nvcuda.dll"  # [win]


Adding support for a new CUDA version
-------------------------------------

Providing a new CUDA version involves five repositories:

* `cudatoolkit-feedstock <https://github.com/conda-forge/cudatoolkit-feedstock>`_
* `nvcc-feedstock <https://github.com/conda-forge/nvcc-feedstock>`_
* `conda-forge-pinning-feedstock <https://github.com/conda-forge/conda-forge-pinning-feedstock>`_
* `docker-images <https://github.com/conda-forge/docker-images>`_ (Linux only)
* `conda-forge-ci-setup-feedstock <https://github.com/conda-forge/conda-forge-ci-setup-feedstock>`_ (Windows only)

The steps involved are, roughly:

1. Add the ``cudatoolkit`` packages in ``cudatoolkit-feedstock``.
2. Submit the version migrator to ``conda-forge-pinning-feedstock``.
   This PR will stay open during the following steps.
3. For Linux, add the corresponding Docker images at ``docker-images``.
   Copy the migration file manually to ``.ci_support/migrations``.
   This copy should not specify a timestamp. Comment it out and rerender.
4. For Windows, add the installer URLs and hashes to the ``conda-forge-ci-setup``
   `script <https://github.com/conda-forge/conda-forge-ci-setup-feedstock/blob/master/recipe/install_cuda.bat>`_.
   The migration file must also be manually copied here. Rerender.
5. Create the new ``nvcc`` packages for the new version. Again, the migration
   file must be added manually. Rerender.
6. When everything else has been merged and testing has taken place,
   consider merging the PR opened in step 2 so it can apply to all the downstream feedstocks.
