@@ -151,20 +151,10 @@ Use the ``vgpu`` metadata option to enable creation of mig devices on rebuild.

## GRES configuration

- You should stop terraform templating out partitions.yml and specify `openhpc_nodegroups` manually. To do this
- set the `autogenerated_partitions_enabled` terraform variable to `false`. For example (`environments/production/tofu/main.tf`):
-
- ```
- module "cluster" {
-   source = "../../site/tofu/"
-   ...
-   # We manually populate this to add GRES. See environments/site/inventory/group_vars/all/partitions-manual.yml.
-   autogenerated_partitions_enabled = false
- }
- ```
-
- GPU types can be determined by deploying slurm without any gres configuration and then running
- `sudo slurmd -G` on a compute node where GPU resources exist. An example is shown below:
+ GPU resources need to be added to the OpenHPC nodegroup definitions (`openhpc_nodegroups`). To
+ do this you need to determine the names of the GPU types as detected by slurm. First
+ deploy slurm with the default nodegroup definitions to get a working cluster. You will then be
+ able to run `sudo slurmd -G` on a compute node where GPU resources exist. An example is shown below:

```
[rocky@io-io-gpu-02 ~]$ sudo slurmd -G
@@ -191,7 +181,7 @@ ENV_NVML,ENV_RSMI,ENV_ONEAPI,ENV_OPENCL,ENV_DEFAULT
```

GRES resources can then be configured manually. An example is shown below
- (`environments/<environment>/inventory/group_vars/all/partitions-manual.yml`):
+ (`environments/<environment>/inventory/group_vars/all/openhpc.yml`):

```
openhpc_partitions:
@@ -207,3 +197,5 @@ openhpc_nodegroups:
- conf: "gpu:nvidia_h100_80gb_hbm3_4g.40gb:2"
- conf: "gpu:nvidia_h100_80gb_hbm3_1g.10gb:6"
```
+
+ Make sure the types (the identifier after `gpu:`) match those collected with `slurmd -G`.
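+
+ For illustration, each `conf` string follows Slurm's `gpu:<type>:<count>` GRES format, with `<type>` taken
+ verbatim from the `slurmd -G` output. A minimal sketch of a nodegroup entry is shown below; the nodegroup name
+ `gpu` and the `name`/`gres` key layout are illustrative assumptions, so check the stackhpc.openhpc role
+ documentation for the exact schema rather than copying this verbatim:
+
+ ```
+ openhpc_nodegroups:
+   - name: gpu   # assumed nodegroup name, not taken from this repository
+     gres:
+       # <type> must match a type string reported by slurmd -G on the node
+       - conf: "gpu:nvidia_h100_80gb_hbm3_4g.40gb:2"
+       - conf: "gpu:nvidia_h100_80gb_hbm3_1g.10gb:6"
+ ```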