- Add support for memory-based job scheduling in Slurm.
- Configure compute nodes real memory in the Slurm cluster configuration.
- Add new configuration parameter `Scheduling/SlurmSettings/EnableMemoryBasedScheduling` to enable memory-based scheduling in Slurm.
- Add new configuration parameter `Scheduling/SlurmQueues/ComputeResources/SchedulableMemory` to override the default value of the memory seen by the scheduler on compute nodes.
- Improve flexibility on cluster configuration updates to avoid the stop and start of the entire cluster whenever possible.
- Add new configuration parameter `Scheduling/SlurmSettings/QueueUpdateStrategy` to set the preferred strategy to adopt for compute nodes needing a configuration update and replacement.
- Improve failover mechanism over available compute resources when hitting insufficient capacity issues with EC2 instances. Disable compute nodes for a configurable amount of time (default 10 min) when a node launch fails due to insufficient capacity.
- Add support to mount existing FSx for ONTAP and FSx for OpenZFS file systems.
- Add support to mount multiple instances of existing EFS, FSx for Lustre, FSx for ONTAP and FSx for OpenZFS file systems.
- Add support for FSx for Lustre Persistent_2 deployment type when creating a new file system.
- Prompt user to enable EFA for supported instance types when using `pcluster configure` wizard.
- Add support for rebooting compute nodes via Slurm.
- Improved handling of Slurm power states to also account for manual powering down of nodes.
- Add NVIDIA GDRCopy 2.3 into the product AMIs to enable low-latency GPU memory copy.
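The new Slurm scheduling and shared storage parameters above fit into the cluster configuration roughly as follows. This is a hedged sketch: the queue and resource names, instance type, counts, capacities, and the FSx volume ID are illustrative placeholders, not values taken from this release.

```yaml
Scheduling:
  Scheduler: slurm
  SlurmSettings:
    EnableMemoryBasedScheduling: true  # new: memory-based job scheduling
    QueueUpdateStrategy: DRAIN         # new: strategy for nodes needing update and replacement
  SlurmQueues:
    - Name: queue1                     # placeholder name
      ComputeResources:
        - Name: cr1                    # placeholder name
          InstanceType: c5.2xlarge     # placeholder instance type
          MinCount: 0
          MaxCount: 10
          SchedulableMemory: 15000     # new: memory (MiB) seen by the scheduler, overriding the default
SharedStorage:
  - MountDir: /ontap
    Name: existing-ontap
    StorageType: FsxOntap              # new: mount an existing FSx for ONTAP volume
    FsxOntapSettings:
      VolumeId: fsvol-0123456789abcdef0  # placeholder volume ID
  - MountDir: /lustre
    Name: new-lustre
    StorageType: FsxLustre
    FsxLustreSettings:
      DeploymentType: PERSISTENT_2     # new: Persistent_2 deployment type for new file systems
      StorageCapacity: 1200            # placeholder capacity (GiB)
      PerUnitStorageThroughput: 125    # placeholder throughput (MB/s/TiB)
```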
**CHANGES**
- Upgrade Slurm to version 21.08.8-2.
- Upgrade EFA installer to version 1.17.2.
  - EFA driver: ``efa-1.16.0-1``
  - EFA configuration: ``efa-config-1.10-1``
  - EFA profile: ``efa-profile-1.5-1``
  - Libfabric: ``libfabric-aws-1.16.0~amzn2.0-1``
  - RDMA core: ``rdma-core-41.0-2``
  - Open MPI: ``openmpi40-aws-4.1.4-2``
- Upgrade NICE DCV to version 2022.0-12760.
- Upgrade NVIDIA driver to version 470.129.06.
- Upgrade NVIDIA Fabric Manager to version 470.129.06.
- Change default EBS volume types from gp2 to gp3 for both the root and additional volumes.
- Changes to FSx for Lustre file systems created by ParallelCluster:
  - Change the default deployment type to `Scratch_2`.
  - Change the Lustre server version to `2.12`.
- Do not require `PlacementGroup/Enabled` to be set to `true` when passing an existing `PlacementGroup/Id`.
- Add `parallelcluster:cluster-name` tag to all the resources created by ParallelCluster.
- Do not allow setting `PlacementGroup/Id` when `PlacementGroup/Enabled` is explicitly set to `false`.
- Add `lambda:ListTags` and `lambda:UntagResource` to `ParallelClusterUserRole` used by ParallelCluster API stack for cluster update.
- Restrict IPv6 access to IMDS to root and cluster admin users only, when the configuration parameter `HeadNode/Imds/Secured` is `true`, as it is by default.
- With a custom AMI, use the AMI root volume size instead of the ParallelCluster default of 35 GiB. The value can be changed in the cluster configuration file.
- Automatic disabling of the compute fleet when the configuration parameter `Scheduling/SlurmQueues/ComputeResources/SpotPrice` is lower than the minimum required Spot request fulfillment price.
- Show `requested_value` and `current_value` values in the change set when adding or removing a section during an update.
- Disable `aws-ubuntu-eni-helper` service in DLAMI to avoid conflicts with `configure_nw_interface.sh` when configuring instances with multiple network cards.
- Remove support for Python 3.6.
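The IMDS restriction above is controlled from the head node section of the cluster configuration. A minimal sketch, assuming placeholder values for the instance type, subnet ID, and SSH key name:

```yaml
HeadNode:
  InstanceType: c5.xlarge              # placeholder instance type
  Networking:
    SubnetId: subnet-0123456789abcdef0 # placeholder subnet ID
  Ssh:
    KeyName: my-key                    # placeholder key name
  Imds:
    Secured: true  # default: restrict IMDS access to root and cluster admin users
```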
**BUG FIXES**
- Fix the default behavior to skip the ParallelCluster validation and test steps when building a custom AMI.
- Fix file handle leak in `computemgtd`.
- Fix race condition that was sporadically causing launched instances to be immediately terminated because they were not yet available in the EC2 DescribeInstances response.
- Fix support for `DisableSimultaneousMultithreading` parameter on instance types with Arm processors.
- Fix ParallelCluster API stack update failure when upgrading from a previous version. Add the resource pattern used for the `ListImagePipelineImages` action in the `EcrImageDeletionLambdaRole`.
- Fix ParallelCluster API by adding missing permissions needed to import/export from S3 when creating an FSx for Lustre storage.