
Commit 4a21427

triton/: GPU and GRES minor updates
1 parent 25f98c3 commit 4a21427

4 files changed (+43 -32 lines)


triton/ref/gpu.rst

Lines changed: 17 additions & 13 deletions
@@ -1,24 +1,28 @@
 .. csv-table::
    :delim: |
    :header-rows: 1
+   :class: scicomp-table-dense

-   GPU brand name | GPU name in Slurm (``--gpus=NAME:n``) | Amount of VRAM | CUDA compute capability | total amount | nodes | GPUs per node | Compute threads per GPU | Slurm partition (``--partition=``) |
-   NVIDIA H200(*) | ``h200`` | 141GB (``--gres=gpu-vram:141g``) | 9.0 (``--gres=min-cuda-cc=90``) | 112 | gpu[50-63] | 8 | 16896 | ``gpu-h200-141g-ellis``, ``gpu-h200-141g-short`` |
-   NVIDIA H200(**) | ``h200_2g.35gb`` | 35GB (``--gres=gpu-vram:35g``) | 9.0 (``--gres=min-cuda-cc=90``) | 24 | gpu[49] | 24 | 4224 | ``gpu-h200-35g-ia-ellis``, ``gpu-h200-35g-ia`` |
-   NVIDIA H100 | ``h100`` | 80GB (``--gres=gpu-vram:80g``) | 9.0 (``--gres=min-cuda-cc=90``) | 16 | gpu[45-48] | 4 | 16896 | ``gpu-h100-80g`` |
-   NVIDIA A100 | ``a100`` | 80GB (``--gres=gpu-vram:80g``) | 8.0 (``--gres=min-cuda-cc=80``) | 56 | gpu[11-17,38-44] | 4 | 7936 | ``gpu-a100-80g`` |
-   NVIDIA V100 | ``v100`` | 32GB (``--gres=gpu-vram:32g``) | 7.0 (``--gres=min-cuda-cc=70``) | 40 | gpu[28-37] | 4 | 5120 | ``gpu-v100-32g`` |
-   NVIDIA V100 | ``v100`` | 32GB (``--gres=gpu-vram:32g``) | 7.0 (``--gres=min-cuda-cc=70``) | 40 | gpu[1-10] | 4 | 5120 | ``gpu-v100-32g`` |
-   NVIDIA V100 | ``v100`` | 32GB (``--gres=gpu-vram:32g``) | 7.0 (``--gres=min-cuda-cc=70``) | 32 | dgx[3,5-7] | 8 | 5120 | ``gpu-v100-32g`` |
-   NVIDIA V100 | ``v100`` | 16GB (``--gres=gpu-vram:16g``) | 7.0 (``--gres=min-cuda-cc=70``) | 176 | dgx[1-2,8-27] | 8 | 5120 | ``gpu-v100-16g`` |
-   AMD MI210 | ``mi210`` with ``-p gpu-amd`` | 32GB | | 2 | gpuamd[1] | 2 | 7680 | ``gpu-amd`` |
-   AMD MI100 | ``mi100`` with ``-p gpu-amd`` | 64GB | | 1 | gpuamd[1] | 1 | 6656 | ``gpu-amd`` |
+   GPU brand name | GPU name in Slurm (``--gpus=NAME:n``) | VRAM GB (``--gres=gpu-vram:NNg``) | CUDA compute capability (``--gres=min-cuda-cc=NN``) | total amount | nodes | GPUs per node | Compute threads per GPU | Slurm partition (``--partition=``) |
+   NVIDIA H200(*) | ``h200`` | ``141`` | 9.0 (``90``) | 112 | gpu[50-63] | 8 | 16896 | ``gpu-h200-141g-ellis``, ``gpu-h200-141g-short`` |
+   NVIDIA H200(**) | ``h200_2g.35gb`` | ``35`` | 9.0 (``90``) | 24 | gpu[49] | 24 | 4224 | ``gpu-h200-35g-ia-ellis``, ``gpu-h200-35g-ia`` |
+   NVIDIA H100 | ``h100`` | ``80`` | 9.0 (``90``) | 16 | gpu[45-48] | 4 | 16896 | ``gpu-h100-80g`` |
+   NVIDIA A100 | ``a100`` | ``80`` | 8.0 (``80``) | 56 | gpu[11-17,38-44] | 4 | 7936 | ``gpu-a100-80g`` |
+   NVIDIA V100 | ``v100`` | ``32`` | 7.0 (``70``) | 40 | gpu[28-37] | 4 | 5120 | ``gpu-v100-32g`` |
+   NVIDIA V100 | ``v100`` | ``32`` | 7.0 (``70``) | 40 | gpu[1-10] | 4 | 5120 | ``gpu-v100-32g`` |
+   NVIDIA V100 | ``v100`` | ``32`` | 7.0 (``70``) | 32 | dgx[3,5-7] | 8 | 5120 | ``gpu-v100-32g`` |
+   NVIDIA V100 | ``v100`` | ``16`` | 7.0 (``70``) | 176 | dgx[1-2,8-27] | 8 | 5120 | ``gpu-v100-16g`` |
+   AMD MI210 | ``mi210`` with ``-p gpu-amd`` | ``32`` | | 2 | gpuamd[1] | 2 | 7680 | ``gpu-amd`` |
+   AMD MI100 | ``mi100`` with ``-p gpu-amd`` | ``64`` | | 1 | gpuamd[1] | 1 | 6656 | ``gpu-amd`` |

-To request multiple gres, e.g. both 32GB of memory and compute capability 8.0, use a comma separated list: ``--gres=gpu-vram:32g,min-cuda-cc=80``.
+Since 2025, the main way to request certain types of GPUs is with
+``--gres``, for example ``--gpus=1 --gres=min-vram:32``. Only one
+``--gres`` option is allowed, so to combine gres, use a comma
+separated list: ``--gres=gpu-vram:32g,min-cuda-cc=80``.

 (*) These GPUs have a priority queue for the Ellis project, since they were
 procured for this project. Any job submitted to the short queue might be
 preempted if a job requiring the resources comes in from the Ellis queue.

-(**) These GPUs are split from a single GPU with NVIDIA's
+(**) These GPUs are split from a single GPU with NVIDIA's
 `Multi-Instance GPU <https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html>`__-feature.
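
For instance, following the combined-GRES syntax documented above, a
request for one GPU with at least 32 GB of VRAM and CUDA compute
capability 8.0 or newer might look like the sketch below (the script
name ``job.sh`` and the time limit are illustrative, not part of this
commit):

    # one GPU, at least 32 GB VRAM, CUDA compute capability >= 8.0
    sbatch --gpus=1 --gres=gpu-vram:32g,min-cuda-cc=80 --time=01:00:00 job.sh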

triton/ref/hardware.rst

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 .. csv-table::
    :delim: |
    :header-rows: 1
+   :class: scicomp-table-dense

    Node name | Number of nodes | Node type | Year | Arch (``--constraint``) | CPU type | Memory Configuration | Infiniband | GPUs | Disks
    pe[1-48,65-81] | 65 | Dell PowerEdge C4130 | 2016 | hsw avx2 | 2x12 core `Xeon E5 2680 v3 <https://ark.intel.com/products/81908/Intel-Xeon-Processor-E5-2680-v3-30M-Cache-2_50-GHz>`__ 2.50GHz | 128GB DDR4-2133 | FDR | | 900GB HDD
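
As a quick illustration of the Arch column, a job can be pinned to a
feature from the table, for example Haswell (a sketch; run ``slurm
features`` on the cluster for the authoritative feature list):

    # request a node carrying the hsw feature listed above
    srun --constraint=hsw hostname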

triton/ref/slurm.rst

Lines changed: 20 additions & 18 deletions
@@ -17,24 +17,26 @@
    :header-rows: 1
    :delim: !

-   Command ! Option ! Description
-   ``sbatch``/``srun``/etc ! ``-t``, ``--time=``\ *HH:MM:SS* ! **time limit**
-   ! ``-t, --time=``\ *DD-HH* ! **time limit, days-hours**
-   ! ``-p, --partition=``\ *PARTITION*! **job partition. Usually leave off and things are auto-detected.**
-   ! ``--mem-per-cpu=``\ *N* ! **request n MB of memory per core**
-   ! ``--mem=``\ *N* ! **request n MB memory per node**
-   ! ``-c``, ``--cpus-per-task=``\ *N* ! **Allocate *n* CPU's for each task. For multithreaded jobs. (compare ``--ntasks``: ``-c`` means the number of cores for each process started.)**
-   ! ``-N``, ``--nodes=``\ *N-M* ! allocate minimum of n, maximum of m nodes.
-   ! ``-n``, ``--ntasks=``\ *N* ! allocate resources for and start *n* tasks (one task=one process started, it is up to you to make them communicate. However the main script runs only on first node, the sub-processes run with "srun" are run this many times.)
-   ! ``-J``, ``--job-name=``\ *NAME* ! short job name
-   ! ``-o`` *OUTPUTFILE* ! print output into file *output*
-   ! ``-e`` *ERRORFILE* ! print errors into file *error*
+   Command ! Option ! Description
+   ``sbatch``/``srun``/etc ! ``-t``, ``--time=HH:MM:SS`` ! **time limit**
+   ! ``-t``, ``--time=DD-HH`` ! **time limit, days-hours**
+   ! ``-p PARTITION``, ``--partition=PARTITION`` ! **job partition. Usually leave off and things are auto-detected.**
+   ! ``--mem-per-cpu=N`` ! **request N MB of memory per core**
+   ! ``--mem=N`` ! **request N MB memory per node**
+   ! ``-c``, ``--cpus-per-task=N`` ! **Allocate *n* CPU's for each task. For multithreaded jobs. (compare ``--ntasks``: ``-c`` means the number of cores for each process started.)**
+   ! ``-N``, ``--nodes=N-M`` ! allocate minimum of N, maximum of M nodes.
+   ! ``-n``, ``--ntasks=N`` ! allocate resources for and start *n* tasks (one task=one process started, it is up to you to make them communicate. However the main script runs only on first node, the sub-processes run with "srun" are run this many times.)
+   ! ``--gpus=1`` ! request a GPU, or ``--gpus=N`` for multiple
+   ! ``--gres=min-vram:NNg`` ! request GPUs with at least ``NN`` GB of VRAM. To combine with other ``--gres`` options, use ``--gres=min-vram:NNg,min-cuda-cc=NN``.
+   ! ``--gres=min-cuda-cc:NN`` ! request GPUs with CUDA compute capability of at least N.N. See above for combining with other GRES.
+   ! ``-J``, ``--job-name=NAME`` ! short job name
+   ! ``-o OUTPUTFILE`` ! print output into file *output*
+   ! ``-e ERRORFILE`` ! print errors into file *error*
    ! ``--exclusive`` ! allocate exclusive access to nodes. For large parallel jobs.
-   ! ``--constraint=``\ *FEATURE* ! request *feature* (see ``slurm features`` for the current list of configured features, or Arch under the :ref:`hardware list <hardware-list>`). Multiple with ``--constraint="hsw|skl"``.
+   ! ``--constraint=FEATURE`` ! request *feature* (see ``slurm features`` for the current list of configured features, or Arch under the :ref:`hardware list <hardware-list>`). Multiple with ``--constraint="hsw|skl"``.
    ! ``--constraint=localdisk`` ! request nodes that have local disks
    ! ``--tmp=nnnG`` ! Request ``nnn`` GB of :doc:`local disk storage space </triton/usage/localstorage>`
-   ! ``--array=``\ *0-5,7,10-15* ! Run job multiple times, use variable ``$SLURM_ARRAY_TASK_ID`` to adjust parameters.
-   ! ``--gpus=1`` ! request a GPU, or ``--gpus=N`` for multiple
-   ! ``--mail-type=``\ *TYPE* ! notify of events: ``BEGIN``, ``END``, ``FAIL``, ``ALL``, ``REQUEUE`` (not on triton) or ``ALL.`` MUST BE used with ``--mail-user=`` only
-   ! ``--mail-user=``\ *first.last@aalto.fi* ! Aalto email to send the notification about the job. External email addresses doesn't work.
-   ``srun`` ! ``-N`` *N_NODES* hostname ! Print allocated nodes (from within script)
+   ! ``--array=0-5,7,10-15`` ! Run job multiple times, use variable ``$SLURM_ARRAY_TASK_ID`` to adjust parameters.
+   ! ``--mail-type=TYPE`` ! notify of events: ``BEGIN``, ``END``, ``FAIL``, ``ALL``, ``REQUEUE`` (not on triton) or ``ALL.`` MUST BE used with ``--mail-user=`` only
+   ! ``--mail-user=first.last@aalto.fi`` ! Aalto email to send the notification about the job. External email addresses doesn't work.
+   ``srun`` ! ``-N N_NODES hostname`` ! Print allocated nodes (from within script)
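
Putting several of the options above together, a minimal batch script
might look like the following sketch (all resource values and the
program name ``my_program`` are illustrative):

    #!/bin/bash
    #SBATCH --time=04:00:00        # time limit, HH:MM:SS
    #SBATCH --mem-per-cpu=2000     # 2000 MB of memory per core
    #SBATCH --cpus-per-task=4      # four cores for one multithreaded task
    #SBATCH --job-name=myjob       # short job name
    #SBATCH --output=myjob.out     # output is written here (long form of -o)

    srun my_program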

triton/tut/gpu.rst

Lines changed: 5 additions & 1 deletion
@@ -13,7 +13,8 @@ GPU computing
 * Select a GPU with certain CUDA compute capability with e.g.
   ``--gpus=1 --gres=min-cuda-cc:80``.
 * See :ref:`the quick reference <available-gpus>` for available GPU names,
-  memory capacities and compute capabilities.
+  memory capacities, and compute capabilities, and how to combine
+  ``--gres`` options.
 * Monitor GPU performance with ``seff JOBID``.
 * You can test out small jobs of 30 minutes or less in the
   ``gpu-debug``-partition (``--partition=gpu-debug``).
@@ -150,6 +151,9 @@ with ``--gpus=N`` as well.
 For example, specifying ``--gpus=1`` and ``--gres=min-cuda-cc:80`` would give
 you a single GPU with minimum compute capabilty support of 8.0.

+Only one ``--gres`` option can be given, so combine them with a comma
+like ``--gres=min-vram:40g,min-cuda-cc:80``.
+
 See the :ref:`available GPUs reference <available-gpus>` for more information on
 available GPUs.
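
For instance, the combined form added here could be used as in the
sketch below (``nvidia-smi`` stands in for a real workload):

    # one GPU with at least 40 GB VRAM and compute capability >= 8.0
    srun --gpus=1 --gres=min-vram:40g,min-cuda-cc:80 nvidia-smi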
