
Commit 1ade867

Merge branch 'master' into gpu-cuda-compute
2 parents b7ea7c2 + ec35a9a

File tree

7 files changed (+18, -25 lines)


help/garage.rst

Lines changed: 0 additions & 8 deletions
@@ -6,14 +6,6 @@ Scicomp garage
 
 https://aalto.zoom.us/j/61322268370, every workday at 13:00
 
-.. admonition:: Planned disruptions
-   :class: important
-
-   * 7 -18 July there are very few people around. It's better to ask
-     in chat. Otherwise, there are usually a few people around
-     but we don't make promises for any given day. Try to join and
-     see (and try to help each other in chat, too).
-
 
 If you need more help than the issue trackers, this is the place to
 be. It's not just Triton, but all aspects of scientific computing.

news/scc26-team-call.md

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ We are seeking team players with the following qualifications:
 
 Candidates must demonstrate motivation, eagerness to learn, and a commitment to contributing to the team. CSC, alongside Aalto and HY, will offer comprehensive training and coaching.
 
-We value diversity and inclusiveness and strongly encourage applications from women and other underrepresented groups in the field. Apply with a cover letter (explaining why you should be selected) and a CV by Sunday 17.9.2025.
+We value diversity and inclusiveness and strongly encourage applications from women and other underrepresented groups in the field. Apply with a cover letter (explaining why you should be selected) and a CV by Wed 17.9.2025.
 
 *Please send your cover letter and CV together in a single email with the subject line **'SCC26 Team Application'** to either Aaron (HY) or Ivan (Aalto), see contacts below. In the cover letter please tell us the following:*
 

triton/accounts.rst

Lines changed: 7 additions & 0 deletions
@@ -148,7 +148,14 @@ data are unrecoverable after deleting, which will happen eventually.
 If data is stored in a group directory (/scratch/$dept/$groupname), it
 won't be deleted and will stay managed by the group owner.
 
+Reactivating an expired/deactivated account
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
+If your account has expired in between contracts or for some other
+reason, you can request it to be reactivated by using the regular
+`account request form
+<https://selfservice.esupport.aalto.fi/ssc/app#/order/2025/>`__
+and mentioning that you've had an account before.
 
 Terms of use/privacy policy
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

triton/apps/comsol.rst

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ COMSOL Multiphysics
 .. _comsol:
 
 
-.. hint:: We are continuing COMSOL focus days in our :ref:`daily zoom garage<scicomp-garage>` in Summer 2025: someone from COMSOL (the company) plans to join our zoom garage at 13:00 on the following Wednesday: 2025-08-13. The continuation of the focus days depends on the popularity of the sessions. There will be an open (free) `Comsol course at Aalto on 2025-09-09 <https://www.comsol.com/c/h04p>`_. We also plan to have a triton specific session on 2025-09-17.
+.. hint:: We are continuing COMSOL focus days in our :ref:`daily zoom garage<scicomp-garage>` in Summer 2025: someone from COMSOL (the company) plans to join our zoom garage at 13:00 on the following Wednesday: 2025-08-13. The continuation of the focus days depends on the popularity of the sessions. We will organize a `triton specific tutorial on 2025-09-17 <https://www.aalto.fi/en/events/tutorial-running-comsol-on-the-triton-cluster>`_.
 
 .. hint:: Join the other COMSOL users in our Zulip :ref:`chat`: Stream "#triton", topic "Comsol user group".
 
triton/apps/llms.rst

Lines changed: 2 additions & 7 deletions
@@ -50,17 +50,12 @@ In the following sbatch script, we request computational resources, load the nec
 #SBATCH --output huggingface.%J.out
 #SBATCH --error huggingface.%J.err
 
-#By loading the model-huggingface module, we set HF_HOME to /scratch/shareddata/dldata/huggingface-hub-cache which is a shared scratch space.
-#By default, HF_HOME is set to $HOME/.cache/huggingface, which is under your own home directory where you have limited quota.
+#By loading the model-huggingface module, models will be loaded from /scratch/shareddata/dldata/huggingface-hub-cache which is a shared scratch space.
 module load model-huggingface
 
 # Load a ready to use conda environment to use HuggingFace Transformers
 module load scicomp-llm-env
 
-# Force transformer to load model(s) from local hub instead of download and load model(s) from remote hub.
-export TRANSFORMERS_OFFLINE=1
-export HF_HUB_OFFLINE=1
-
 python your_script.py
 
 The ``your_script.py`` Python script uses a HuggingFace model ``mistralai/Mistral-7B-Instruct-v0.1`` for conversations and instructions.
@@ -74,7 +69,7 @@ The ``your_script.py`` Python script uses a HuggingFace model ``mistralai/Mistra
 pipe = pipeline(
     "text-generation", # Task type
     model="mistralai/Mistral-7B-Instruct-v0.1", # Model name
-    device="auto", # Let the pipeline automatically select best available device
+    device_map="auto", # Let the pipeline automatically select best available device
     max_new_tokens=1000
 )

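The cache behaviour described in the sbatch comments above can be checked interactively; a minimal sketch, assuming the model-huggingface module still exports HF_HOME as the removed comment stated (the echo/ls lines are illustrative, not from the docs):

    module load model-huggingface
    # The module points the HuggingFace cache at shared scratch storage:
    echo "$HF_HOME"   # expected: /scratch/shareddata/dldata/huggingface-hub-cache
    ls "$HF_HOME"     # models already cached here are shared across users
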
triton/ref/gpu.rst

Lines changed: 3 additions & 2 deletions
@@ -2,7 +2,7 @@
 :delim: |
 :header-rows: 1
 
-GPU brand name | GPU name in Slurm (``--gpus=NAME:n``) | Amount of VRAM | CUDA compute capability | total amount | nodes | GPUs per node | Compute threads per GPU | Slurm partition (``--partition=``) |
+GPU brand name | GPU name in Slurm (``--gpus=NAME:n``) | Amount of VRAM | CUDA compute capability | total amount | nodes | GPUs per node | Compute threads per GPU | Slurm partition (``--partition=``) |
 NVIDIA H200(*) | ``h200`` | 141GB (``--gres=gpu-vram:141g``) | 9.0 (``--gres=min-cuda-cc=90``) | 112 | gpu[50-63] | 8 | 16896 | ``gpu-h200-141g-ellis``, ``gpu-h200-141g-short`` |
 NVIDIA H200(**) | ``h200_2g.35gb`` | 35GB (``--gres=gpu-vram:35g``) | 9.0 (``--gres=min-cuda-cc=90``) | 24 | gpu[49] | 24 | 4224 | ``gpu-h200-35g-ia-ellis``, ``gpu-h200-35g-ia`` |
 NVIDIA H100 | ``h100`` | 80GB (``--gres=gpu-vram:80g``) | 9.0 (``--gres=min-cuda-cc=90``) | 16 | gpu[45-48] | 4 | 16896 | ``gpu-h100-80g`` |
@@ -11,7 +11,8 @@
 NVIDIA V100 | ``v100`` | 32GB (``--gres=gpu-vram:32g``) | 7.0 (``--gres=min-cuda-cc=70``) | 40 | gpu[1-10] | 4 | 5120 | ``gpu-v100-32g`` |
 NVIDIA V100 | ``v100`` | 32GB (``--gres=gpu-vram:32g``) | 7.0 (``--gres=min-cuda-cc=70``) | 32 | dgx[3,5-7] | 8 | 5120 | ``gpu-v100-32g`` |
 NVIDIA V100 | ``v100`` | 16GB (``--gres=gpu-vram:16g``) | 7.0 (``--gres=min-cuda-cc=70``) | 176 | dgx[1-2,8-27] | 8 | 5120 | ``gpu-v100-16g`` |
-AMD MI100 (testing) | Use ``-p gpu-amd`` only, no ``--gpus`` | 32GB | | 3 | gpuamd[1] | 3 | 7680 | ``gpu-amd`` |
+AMD MI210 | ``mi210`` with ``-p gpu-amd`` | 64GB | | 2 | gpuamd[1] | 2 | 6656 | ``gpu-amd`` |
+AMD MI100 | ``mi100`` with ``-p gpu-amd`` | 32GB | | 1 | gpuamd[1] | 1 | 7680 | ``gpu-amd`` |
 
 (*) These GPUs have a priority queue for the Ellis project, since they were
 procured for this project. Any job submitted to the short queue might be

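The columns of this table map directly onto Slurm flags. A minimal job-script sketch using the flag spellings from the table header (the time limit and the nvidia-smi body are illustrative only, not from the docs):

    #!/bin/bash
    #SBATCH --partition=gpu-v100-32g   # "Slurm partition" column
    #SBATCH --gpus=v100:1              # "GPU name in Slurm" column: one V100
    #SBATCH --time=01:00:00            # illustrative limit
    # Alternatively, constrain by property instead of naming a model, e.g.
    # --gres=gpu-vram:32g (VRAM) or --gres=min-cuda-cc=70 (CUDA compute capability).
    nvidia-smi
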
triton/ref/hardware.rst

Lines changed: 4 additions & 6 deletions
@@ -12,12 +12,10 @@
 fn3 | 1 | Dell PowerEdge R940 | 2020 | avx2 avx512 | 4x20 core `Xeon Gold 6148 <https://ark.intel.com/products/120489>`__ 2.40GHz | 2TB DDR4-2666 | EDR | | No disk
 gpu[1-10] | 10 | Dell PowerEdge C4140 | 2020 | skl avx2 avx512 volta | 2x8 core Intel Xeon Gold 6134 @ 3.2GHz | 384GB DDR4-2667 | EDR | 4x `V100 <https://www.nvidia.com/en-us/data-center/tesla-v100>`__ 32GB | 1.5 TB SSD
 gpu[11-17,38-44] | 14 | Dell PowerEdge XE8545 | 2021, 2023 | milan avx2 ampere a100 | 2x24 core AMD EPYC 7413 @ 2.65GHz | 503GB DDR4-3200 | EDR | 4x `A100 <https://www.nvidia.com/en-us/data-center/a100/>`__ 80GB | 440 GB SSD
-gpu[20-22] | 3 | Dell PowerEdge C4130 | 2016 | hsw avx2 kepler | 2x6 core `Xeon E5 2620 v3 <https://ark.intel.com/products/83352/Intel-Xeon-Processor-E5-2620-v3-15M-Cache-2_40-GHz>`__ 2.50GHz | 128GB DDR4-2133 | EDR | 4x2 GPU `K80 <https://www.nvidia.com/en-gb/data-center/tesla-k80/>`__ | 440 GB SSD
-gpu[23-27] | 5 | Dell PowerEdge C4130 | 2017 | hsw avx2 pascal | 2x12 core Xeon E5-2680 v3 @ 2.5GHz | 256GB DDR4-2400 | EDR | 4x `P100 <https://www.nvidia.com/object/tesla-p100.html>`__ | 720 GB SSD
 gpu[28-37] | 10 | Dell PowerEdge C4140 | 2019 | skl avx2 avx512 volta | 2x8 core Intel Xeon Gold 6134 @ 3.2GHz | 384GB DDR4-2667 | EDR | 4x `V100 <https://www.nvidia.com/en-us/data-center/v100/>`__ 32GB | 1.5 TB SSD
-dgx[1-2,8-27] | 22 | Nvidia DGX-1 | 2018, 2025 | bdw avx2 volta | 2x20 core `Xeon E5-2698 v4 @ 2.2GHz <https://ark.intel.com/products/91753/Intel-Xeon-Processor-E5-2698-v4-50M-Cache-2_20-GHz>`__ | 512GB DDR4-2133 | EDR | 8x `V100 <https://www.nvidia.com/en-us/data-center/v100/>`__ 16GB | 7 TB SSD
+dgx[1-2,8-27] | 22 | Nvidia DGX-1 | 2018, 2025 | bdw avx2 volta | 2x20 core `Xeon E5-2698 v4 @ 2.2GHz <https://ark.intel.com/products/91753/Intel-Xeon-Processor-E5-2698-v4-50M-Cache-2_20-GHz>`__ | 512GB DDR4-2133 | EDR | 8x `V100 <https://www.nvidia.com/en-us/data-center/v100/>`__ 16GB | 7 TB SSD
 dgx[3,5-7] | 4 | Nvidia DGX-1 | 2018 | bdw avx2 volta | 2x20 core `Xeon E5-2698 v4 @ 2.2GHz <https://ark.intel.com/products/91753/Intel-Xeon-Processor-E5-2698-v4-50M-Cache-2_20-GHz>`__ | 512GB DDR4-2133 | EDR | 8x `V100 <https://www.nvidia.com/en-us/data-center/v100/>`__ 32GB | 7 TB SSD
-gpuamd1 | 1 | Dell PowerEdge R7525 | 2021 | rome avx2 mi100 | 2x8 core AMD EPYC 7262 @3.2GHz | 250GB DDR4-3200 | EDR | 3x `MI100 <https://www.amd.com/en/products/server-accelerators/instinct-mi100>`__ | 32GB SSD
+gpuamd1 | 1 | Dell PowerEdge R7525 | 2021 | rome avx2 mi100 | 2x8 core AMD EPYC 7262 @3.2GHz | 250GB DDR4-3200 | EDR | 2x `MI210 <https://www.amd.com/en/products/accelerators/instinct/mi200/mi210.html>`__, 1x `MI100 <https://www.amd.com/en/products/accelerators/instinct/mi100.html>`__ | 32GB SSD
 gpu[45-48] | 4 | Dell PowerEdge XE8640 | 2024 | saphr avx2 h100 hopper | 2x48 core `Xeon Platinum 8468 <https://www.intel.com/content/www/us/en/products/sku/231735/intel-xeon-platinum-8468-processor-105m-cache-2-10-ghz/specifications.html>`__ 2.1GHz | 1024GB DDR5-4800 | HDR | 4x `H100 SXM <https://www.nvidia.com/en-us/data-center/h100/>`__ 80GB | 21 TB SSD
-gpu[49] | 1 | Dell PowerEdge XE9680 | 2024 | emerald avx2 h200 hopper | 2x32 core `Xeon® Platinum 8562Y+ <https://www.intel.com/content/www/us/en/products/sku/237558/intel-xeon-platinum-8562y-processor-60m-cache-2-80-ghz/specifications.html>`__ 2.8GHz | 2048GB DDR5-5600 | HDR | 8x `H200 SXM <https://www.nvidia.com/en-us/data-center/h200/>`__ each split to 7x18GB | 20 TB SSD
-gpu[50-63] | 14 | Dell PowerEdge XE9680 | 2024 | emerald avx2 h200 hopper | 2x32 core `Xeon® Platinum 8562Y+ <https://www.intel.com/content/www/us/en/products/sku/237558/intel-xeon-platinum-8562y-processor-60m-cache-2-80-ghz/specifications.html>`__ 2.8GHz | 2048GB DDR5-5600 | HDR | 8x `H200 SXM <https://www.nvidia.com/en-us/data-center/h200/>`__ 141GB | 20 TB SSD
+gpu[49] | 1 | Dell PowerEdge XE9680 | 2025 | emerald avx2 h200 hopper | 2x32 core `Xeon® Platinum 8562Y+ <https://www.intel.com/content/www/us/en/products/sku/237558/intel-xeon-platinum-8562y-processor-60m-cache-2-80-ghz/specifications.html>`__ 2.8GHz | 2048GB DDR5-5600 | HDR | 8x `H200 SXM <https://www.nvidia.com/en-us/data-center/h200/>`__ each split to 3x35GB | 20 TB SSD
+gpu[50-63] | 14 | Dell PowerEdge XE9680 | 2025 | emerald avx2 h200 hopper | 2x32 core `Xeon® Platinum 8562Y+ <https://www.intel.com/content/www/us/en/products/sku/237558/intel-xeon-platinum-8562y-processor-60m-cache-2-80-ghz/specifications.html>`__ 2.8GHz | 2048GB DDR5-5600 | HDR | 8x `H200 SXM <https://www.nvidia.com/en-us/data-center/h200/>`__ 141GB | 20 TB SSD
