CHANGELOG.md: 14 additions & 7 deletions
@@ -7,21 +7,24 @@ This file is used to list changes made in each version of the AWS ParallelCluste
 ------
 
 **ENHANCEMENTS**
-- Add support for p6e-gb200 instances via capacity blocks.
-- Add `build-image` support for kernel 6.12 of Amazon Linux 2023. The official ParallelCluster Amazon Linux 2023 AMIs use kernel 6.12.
+- Add support for P6e-GB200 instances. ParallelCluster sets up Slurm topology plugin to handle P6e-GB200 UltraServers. See limitations section for important additional setup requirements.
+- Add `build-image` support for Amazon Linux 2023 AMIs based on kernel 6.12 (in addition to 6.1).
+
+**LIMITATIONS**
+- P6e-GB200 instances are only tested on Amazon Linux 2023, Ubuntu 22.04 and Ubuntu 24.04.
+- Using IMEX on P6e-GB200 requires additional setup. Please refer to <PLACE_HOLDER for the tutorial link>.
 
 **CHANGES**
 - Install nvidia-imex for all OSs except AL2.
-- Ubuntu 20.04 is no longer supported.
 - Remove `berkshelf`. All cookbooks are local and do not need `berkshelf` dependency management.
 - Remove `UnkillableStepTimeout` from slurm.conf and let slurm set this value.
 - Upgrade Slurm to version 24.11.6 (from 24.05.8).
-- Upgrade EFA installer to 1.43.2 (from 1.41.0).
-- Efa-driver: efa-2.17.2-1
+- Upgrade EFA installer to 1.42.0 (from 1.41.0).
+- Efa-driver: efa-2.15.3-1
 - Efa-config: efa-config-1.18-1
 - Efa-profile: efa-profile-1.7-1
-- Libfabric-aws: libfabric-aws-2.1.0-5
-- Rdma-core: rdma-core-58.0-1
+- Libfabric-aws: libfabric-aws-2.1.0-3
+- Rdma-core: rdma-core-57.0-1
 - Open MPI: openmpi40-aws-4.1.7-2 and openmpi50-aws-5.0.6-11
 - Upgrade Cinc Client to version 18.4.12 (from 18.2.7).
 - Upgrade NVIDIA driver to version 570.172.08 (from 570.86.15) for all OSs except AL2.
@@ -31,11 +34,15 @@ This file is used to list changes made in each version of the AWS ParallelCluste
 - Upgrade Python to 3.9.23 (from 3.9.20) for AL2.
 - Upgrade Intel MPI Library to 2021.16.0 (from 2021.13.1).
 - Upgrade DCV to version 2024.0-19030.
+- Upgrade the official ParallelCluster Amazon Linux 2023 AMIs to kernel 6.12 (from 6.1).
 
 **BUG FIXES**
 - Fix a race condition in CloudWatch Agent startup that could cause nodes bootstrap failures.
 - Fix cluster id mismatch issue by deleting the file `/var/spool/slurm.state/clustername` before configuring Slurm accounting.
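
For readers unfamiliar with the last bug fix, the idea of clearing the stale `clustername` file before Slurm accounting is configured can be illustrated roughly as follows. This is only a sketch: the helper name `remove_stale_clustername` and its `state_dir` parameter are hypothetical, and it is not the code ParallelCluster's cookbooks actually run. Only the file path comes from the changelog entry.

```python
from pathlib import Path


def remove_stale_clustername(state_dir: str = "/var/spool/slurm.state") -> bool:
    """Delete <state_dir>/clustername if it exists.

    Hypothetical helper for illustration only. Returns True when a file
    was removed and False when there was nothing to remove.
    """
    marker = Path(state_dir) / "clustername"
    try:
        marker.unlink()
        return True
    except FileNotFoundError:
        # No stale file present, so nothing needs to be cleaned up.
        return False
```

Such a step would presumably run before the accounting configuration, so that a cluster id recorded by an earlier setup cannot conflict with the new one.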