Releases: aws/aws-parallelcluster
AWS ParallelCluster v2.10.0
We're excited to announce the release of AWS ParallelCluster 2.10.0.
Upgrade
How to upgrade?
sudo pip install --upgrade aws-parallelcluster
ENHANCEMENTS
- Add support for CentOS 8 in all Commercial regions.
- Add support for P4d instance type as compute node.
- Add the possibility to enable NVIDIA GPUDirect RDMA support on EFA by using the new enable_efa_gdr configuration parameter.
- Enable support for NICE DCV in GovCloud regions.
- Enable support for AWS Batch scheduler in GovCloud regions.
- FSx Lustre:
  - Add possibility to configure Auto Import policy through the new auto_import_policy parameter.
  - Add support for HDD storage type and the new storage_type and drive_cache_type configuration parameters.
- Create a CloudWatch Dashboard for the cluster, named <clustername>-<region>, including head node EC2 metrics and cluster logs. It can be disabled by configuring the enable parameter in the dashboard section.
- Add -r/--region arg to pcluster configure command. If this arg is provided, configuration will skip region selection.
- Add -r/--region arg to ssh and dcv connect commands.
- Add cluster_resource_bucket parameter under cluster section to allow the user to specify an existing S3 bucket.
- createami:
  - Add validation step to fail when using a base AMI created by a different version of ParallelCluster.
  - Add validation step for AMI creation process to fail if the selected OS and the base AMI OS are not consistent.
  - Add --post-install parameter to use a post installation script when building an AMI.
  - Add the possibility to use a ParallelCluster base AMI.
- Add possibility to change tags when performing a pcluster update.
- Add new all_or_nothing_batch configuration parameter for slurm_resume script. When True, slurm_resume will succeed only if all the instances required by all the pending jobs in Slurm are available.
- Enable queue resizing on update without requiring to stop the compute fleet. Stopping the compute fleet is only necessary when existing instances risk being terminated.
- Add validator for EBS volume size, type and IOPS.
- Add validators for shared_dir parameter when used in both cluster and ebs sections.
- Add validator for cfn_scheduler_slots key in the extra_json parameter.
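The all_or_nothing_batch behaviour described above can be pictured with a minimal sketch. This is not the slurm_resume implementation: the function name, its arguments, and the best-effort fallback used when the flag is off are all assumptions made for illustration.

```python
def select_nodes_to_launch(requested_nodes, available_capacity, all_or_nothing_batch=False):
    """Return the nodes to launch for a batch of pending jobs.

    Hypothetical helper for illustration only; not taken from slurm_resume.
    requested_nodes: list of node names required by the pending jobs.
    available_capacity: number of instances that can currently be launched.
    """
    if available_capacity >= len(requested_nodes):
        # Every requested node can be satisfied.
        return list(requested_nodes)
    if all_or_nothing_batch:
        # Not all nodes can be satisfied: launch nothing so the batch fails as a whole.
        return []
    # Assumed best-effort mode: launch as many nodes as capacity allows.
    return list(requested_nodes[:available_capacity])
```

With the flag set, a request for three nodes against a capacity of two yields an empty list, while the assumed best-effort mode would return the first two nodes.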
CHANGES
- CentOS 6 is no longer supported.
- Upgrade EFA installer to version 1.10.1:
  - EFA configuration: efa-config-1.5 (from efa-config-1.4)
  - EFA profile: efa-profile-1.1 (from efa-profile-1.0.0)
  - EFA kernel module: efa-1.10.2 (from efa-1.6.0)
  - RDMA core: rdma-core-31.amzn0 (from rdma-core-28.amzn0)
  - Libfabric: libfabric-1.11.1amzn1.1 (from libfabric-1.10.1amzn1.1)
  - Open MPI: openmpi40-aws-4.0.5 (from openmpi40-aws-4.0.3)
  - Unifies installer runtime options across x86 and aarch64
  - Introduces -g/--enable-gdr switch to install packages with GPUDirect RDMA support
  - Updates to OMPI collectives decision file packaging, migrated from efa-config to efa-profile
  - Introduces CentOS 8 support
- Upgrade NVIDIA driver to version 450.80.02.
- Install NVIDIA Fabric manager to enable NVIDIA NVSwitch on supported platforms.
- Remove default region us-east-1. After the change, pcluster will adhere to the following lookup order for region:
  1. -r/--region arg.
  2. AWS_DEFAULT_REGION environment variable.
  3. aws_region_name in ParallelCluster configuration file.
  4. region in AWS CLI configuration file.
- Slurm: change SlurmctldPort to 6820-6829 to not overlap with default slurmdbd port (6819).
- Slurm: add compute_resource name and efa as node features.
- Remove validation on ec2_iam_role parameter.
- Improve retrieval of instance type info by using DescribeInstanceTypes API.
- Remove custom_awsbatch_template_url configuration parameter.
- Upgrade pip to latest version in virtual environments.
- Upgrade image used by CodeBuild environment when building container images for Batch clusters, from aws/codebuild/amazonlinux2-x86_64-standard:1.0 to aws/codebuild/amazonlinux2-x86_64-standard:3.0.
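The region lookup order introduced in this release can be sketched as a simple first-match search. The function and argument names below are hypothetical and only mirror the four sources named in the changelog; the real pcluster code may be structured differently.

```python
import os


def resolve_region(cli_region=None, pcluster_config_region=None,
                   awscli_config_region=None, environ=None):
    """Return the first region found, following the documented lookup order.

    Hypothetical helper for illustration; not part of the pcluster CLI.
    """
    environ = os.environ if environ is None else environ
    candidates = [
        cli_region,                         # 1. -r/--region arg
        environ.get("AWS_DEFAULT_REGION"),  # 2. AWS_DEFAULT_REGION environment variable
        pcluster_config_region,             # 3. aws_region_name in the ParallelCluster config
        awscli_config_region,               # 4. region in the AWS CLI config
    ]
    for region in candidates:
        if region:
            return region
    # No default region any more, so an unset region is treated as an error here.
    raise ValueError("No region configured")
```

Since us-east-1 is no longer a fallback, the sketch raises an error when none of the four sources provides a value; how the CLI actually reports that case is not stated in these notes.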
BUG FIXES
- Retrieve the right number of compute instance slots when instance type is updated.
- Include user tags in compute nodes and EBS volumes.
- Fix pcluster status output when head node is stopped.
- pcluster update:
  - Fix issue when tags are specified but not changed.
  - Fix issue when the cluster section label changed.
  - Fix issue when shared_dir and ebs_settings are both configured in the cluster section.
  - Fix cluster and cfncluster compatibility in extra_json parameter.
- Fix pcluster configure to avoid using default/initial values for internal parameter initialization.
- Fix pre/post install script arguments management when using double quotes.
- Fix a bug that was causing clustermgtd and computemgtd sleep interval to be incorrectly computed when system timezone is not set to UTC.
- Fix queue name validator to properly check for capital letters.
- Fix enable_efa parameter validation for queue section.
- Fix CloudWatch Log Group creation for AWS Lambda functions handling CloudFormation Custom Resources.
AWS ParallelCluster v2.9.1
We're excited to announce the release of AWS ParallelCluster 2.9.1.
Upgrade
How to upgrade?
sudo pip install --upgrade aws-parallelcluster
Bugfixes
- Fix cluster creation with the head node in a private subnet when it doesn't get a public IP.
Support
Need help / have a feature request?
AWS Support: https://console.aws.amazon.com/support/home
ParallelCluster Issues tracker on GitHub: https://github.com/aws/aws-parallelcluster
The HPC Forum on the AWS Forums page: https://forums.aws.amazon.com/forum.jspa?forumID=192
AWS ParallelCluster v2.9.0
We're excited to announce the release of AWS ParallelCluster 2.9.0.
Upgrade
How to upgrade?
sudo pip install --upgrade aws-parallelcluster
ENHANCEMENTS
- Add support for multiple queues and multiple instance types feature with the Slurm scheduler.
- Extend NICE DCV support to ARM instances.
- Extend support to disable hyperthreading on instances (like *.metal) that don't support CpuOptions in LaunchTemplate.
- Enable support for NFS 4 for the filesystems shared from the head node.
- Add CLI utility to convert configuration files with Slurm scheduler to new format to support multiple queues configuration.
- Add script wrapper to support Torque-like commands with the Slurm scheduler.
- Remove dependency on cfn-init in compute nodes bootstrap in order to avoid throttling and delays caused by CloudFormation when a large number of compute nodes join the cluster.
CHANGES
- Introduce new configuration sections and parameters to support multiple queues and multiple instance types.
- Optimize scaling logic with Slurm scheduler, no longer based on Auto Scaling groups.
- A Route53 private hosted zone is now created together with the cluster and used in DNS resolution inside cluster nodes when using Slurm scheduler.
- Upgrade EFA installer to version 1.9.5:
  - EFA configuration: efa-config-1.4 (from efa-config-1.3)
  - EFA profile: efa-profile-1.0.0
  - EFA kernel module: efa-1.6.0 (no change)
  - RDMA core: rdma-core-28.amzn0 (no change)
  - Libfabric: libfabric-1.10.1amazon1.1 (no change)
  - Open MPI: openmpi40-aws-4.0.3 (no change)
- Upgrade Slurm to version 20.02.4.
- Apply the following changes to Slurm configuration:
  - Assign a range of 10 ports to Slurmctld in order to better perform with large cluster settings
  - Configure cloud scheduling logic
  - Set ReconfigFlags=KeepPartState
  - Set MessageTimeout=60
  - Set TaskPlugin=task/affinity,task/cgroup together with TaskAffinity=no and ConstrainCores=yes in cgroup.conf
- Upgrade NICE DCV to version 2020.1-9012.
- Use private IP instead of master node hostname when mounting shared NFS drives.
- Add new log streams to CloudWatch: chef-client, clustermgtd, computemgtd, slurm_resume, slurm_suspend.
- Add support for queue names in pre/post install scripts.
- Use PAY_PER_REQUEST billing mode for DynamoDB table in GovCloud regions.
BUG FIXES
- Solve dpkg lock issue with Ubuntu that prevented custom AMI creation in some cases.
- Add/improve sanity checks for some configuration parameters.
- Prevent ignored changes from being reported in pcluster update output.
- Fix incompatibility issues with python 2.7 for pcluster update.
- Fix SNS Topic Subscriptions not being deleted with cluster's CloudFormation stack.
AWS ParallelCluster v2.8.1
We're excited to announce the release of AWS ParallelCluster 2.8.1.
Upgrade
How to upgrade?
sudo pip install --upgrade aws-parallelcluster
Changes
- Disable screen lock for DCV desktop sessions to prevent users from being locked out.
Bugfixes
- Fix pcluster configure command to avoid writing unexpected configuration parameters.
AWS ParallelCluster v2.8.0
We're excited to announce the release of AWS ParallelCluster 2.8.0.
Upgrade
How to upgrade?
sudo pip install --upgrade aws-parallelcluster
ENHANCEMENTS
- Enable support for ARM instances on Ubuntu 18.04 and Amazon Linux 2.
- Add support for the automatic backup features of FSx file systems.
- Renewed user experience and robustness of cluster update functionality.
- Support DCV and EFS in China regions.
- Use DescribeInstanceTypes API to validate whether an instance type is EFA-enabled so that new EFA instances can be used without requiring an update to the ParallelCluster configuration files.
- Enable Slurm to directly launch tasks and initialize communications through PMIx v3.1.5 on all supported operating systems except for CentOS 6.
- Print a warning when using NICE DCV on micro or nano instances.
CHANGES
- Remove the client requirement to have Berkshelf to build a custom AMI.
- Upgrade EFA installer to version 1.9.4:
  - Kernel module: efa-1.6.0 (from efa-1.5.1)
  - RDMA core: rdma-core-28.amzn0 (from rdma-core-25.0)
  - Libfabric: libfabric-1.10.1amazon1.1 (updated from libfabric-aws-1.9.0amzn1.1)
  - Open MPI: openmpi40-aws-4.0.3 (no change)
- Avoid unnecessary validation of IAM policies.
- Removed unused dependency on supervisor from the Batch Dockerfile.
- Move all LogGroup definitions in the CloudFormation templates into the CloudWatch substack.
- Disable libvirtd service on CentOS 7. Virtual bridge interfaces are incorrectly detected by Open MPI and cause MPI applications to hang, see https://www.open-mpi.org/faq/?category=tcp#tcp-selection for details.
- Use CINC instead of Chef for provisioning instances. See https://cinc.sh/about/ for details.
- Retry when mounting an NFS mount fails.
- Install the pyenv virtual environments used by ParallelCluster cookbook and node daemon code under /opt/parallelcluster instead of under /usr/local.
- Use the new official CentOS 7 AMI as the base image for ParallelCluster AMI.
- Upgrade NVIDIA driver to Tesla version 440.95.01 on CentOS 6 and version 450.51.05 on all other distros.
- Upgrade CUDA library to version 11.0 on all distros besides CentOS 6.
- Install third-party cookbook dependencies via local source, rather than using the Chef supermarket.
- Use https wherever possible in download URLs.
- Install glibc-static, which is required to support certain options for the Intel MPI compiler.
- Require an initial cluster size greater than zero when the option to maintain the initial cluster size is used.
BUG FIXES
- Fix validator for CIDR-formatted IP range parameters.
- Fix issue that was preventing concurrent use of custom node and pcluster CLI packages.
- Use the correct domain name when contacting AWS services from the China partition.
AWS ParallelCluster v2.7.0
We're excited to announce the release of AWS ParallelCluster 2.7.0.
Upgrade
How to upgrade?
sudo pip install --upgrade aws-parallelcluster
ENHANCEMENTS
- sqswatcher: The daemon is now compatible with VPC Endpoints so that SQS messages can be passed without traversing the public internet.
CHANGES
- Upgrade NICE DCV to version 2020.0-8428.
- Upgrade Intel MPI to version U7.
- Upgrade NVIDIA driver to version 440.64.00.
- Upgrade EFA installer to version 1.8.4:
  - Kernel module: efa-1.5.1 (no change)
  - RDMA core: rdma-core-25.0 (no change)
  - Libfabric: libfabric-aws-1.9.0amzn1.1 (no change)
  - Open MPI: openmpi40-aws-4.0.3 (updated from openmpi40-aws-4.0.2)
- Upgrade CentOS 7 AMI to version 7.8
- Configuration: base_os and scheduler parameters are now mandatory and no longer have a default value.
BUG FIXES
- Fix recipes installation at runtime by adding the bootstrapped file at the end of the last chef run.
- Fix installation of FSx Lustre client on Centos 7
- FSx Lustre: Exit with error when failing to retrieve FSx mountpoint
- Fix sanity_check behavior when max_queue_size > 1000
AWS ParallelCluster v2.6.1
We're excited to announce the release of AWS ParallelCluster 2.6.1.
Upgrade
How to upgrade?
sudo pip install --upgrade aws-parallelcluster
ENHANCEMENTS
- Improved management of S3 bucket that gets created when awsbatch scheduler is selected.
- Add validation for supported OSes when using FSx Lustre.
- Change ProctrackType from proctrack/pgid to proctrack/cgroup in Slurm in order to better handle termination of stray processes when running MPI applications. This also includes the creation of a cgroup Slurm configuration in order to enable the cgroup plugin.
- Skip execution, at node bootstrap time, of all those install recipes that are already applied at AMI creation time.
- Start CloudWatch agent earlier in the node bootstrapping phase so that cookbook execution failures are correctly uploaded and are available for troubleshooting.
- Improved the management of SQS messages and retries to speed-up recovery times when failures occur.
CHANGES
- FSx Lustre: remove x-systemd.requires=lnet.service from mount options in order to rely on default lnet setup provided by Lustre.
- Enforce Packer version to be >= 1.4.0 when building an AMI. This is also required for customers using pcluster createami command.
- Do not launch a replacement for an unhealthy or unresponsive node until this is terminated. This makes cluster slower at provisioning new nodes when failures occur but prevents any temporary over-scaling with respect to the expected capacity.
- Increase parallelism when starting slurmd on compute nodes that join the cluster from 10 to 30.
- Reduce the verbosity of messages logged by the node daemons.
- Do not dump logs to /home/logs when nodewatcher encounters a failure and terminates the node. CloudWatch can be used to debug such failures.
- Reduce the number of retries for failed REMOVE events in sqswatcher.
- Omit cfn-init-cmd and cfn-wire from the files stored in CloudWatch logs.
BUG FIXES
- Configure proxy during cloud-init boothook in order for the proxy to be configured for all bootstrap actions.
- Fix installation of Intel Parallel Studio XE Runtime that requires yum4 since version 2019.5.
- Fix compilation of Torque scheduler on Ubuntu 18.04.
- Fixed a bug in the ordering and retrying of SQS messages that was causing, under certain circumstances of heavy load, the scheduler configuration to be left in an inconsistent state.
- Delete from queue the REMOVE events that are discarded due to hostname collision with another event fetched as part of the same sqswatcher iteration.
AWS ParallelCluster v2.6.0
We're excited to announce the release of AWS ParallelCluster 2.6.0.
Upgrade
How to upgrade?
sudo pip install --upgrade aws-parallelcluster
Enhancements
- Add support for Amazon Linux 2
- Add support for NICE DCV on Ubuntu 18.04
- Add support for FSx Lustre on Ubuntu 18.04 and Ubuntu 16.04
- New CloudWatch logging capability to collect cluster and job scheduler logs to CloudWatch for cluster monitoring and inspection
  - Add --keep-logs flag to pcluster delete command to preserve logs at cluster deletion
- Install and setup Amazon Time Sync on all OSs
- Enable accounting plugin in Slurm for all OSes. Note: accounting is not enabled nor configured by default
- Add retry on throttling from CloudFormation API, happening when several compute nodes are being bootstrapped concurrently
- Display detailed substack failures when pcluster create fails due to a substack error
- Create additional EFS mount target in the AZ of compute subnet, if needed
- Add validator for FSx Lustre Weekly Maintenance Start Time parameter
- Add validator to the KMS key provided for EBS, FSx, and EFS
- Add validator for S3 external resource
- Support two new FSx Lustre features, Scratch 2 and Persistent filesystems
  - Add two new parameters deployment_type and per_unit_storage_throughput to the fsx section
  - Add new storage sizes for storage_capacity: 1,200 GiB, 2,400 GiB and multiples of 2,400 GiB are supported with SCRATCH_2
  - In transit encryption is available via fsx_kms_key_id parameter when deployment_type = PERSISTENT_1
  - New parameter per_unit_storage_throughput is available when deployment_type = PERSISTENT_1
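The storage_capacity rule stated above for SCRATCH_2 (1,200 GiB, 2,400 GiB, or a multiple of 2,400 GiB) can be checked with a few lines of code. This is a minimal sketch of that rule only; the function name is hypothetical and it does not reproduce the validators shipped with ParallelCluster.

```python
def is_valid_scratch2_capacity(storage_capacity_gib):
    """Check a storage_capacity value (in GiB) against the SCRATCH_2 rule.

    Hypothetical helper for illustration: valid sizes are 1,200 GiB,
    2,400 GiB, or any larger multiple of 2,400 GiB.
    """
    if storage_capacity_gib == 1200:
        return True
    return storage_capacity_gib >= 2400 and storage_capacity_gib % 2400 == 0
```

For example, 4800 passes the check while 3600 does not, because 3600 is a multiple of 1,200 but not of 2,400.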
Changes
- Upgrade Slurm to version 19.05.5
- Upgrade Intel MPI to version U6
- Upgrade EFA installer to version 1.8.3:
  - Kernel module: efa-1.5.1 (updated from efa-1.4.1)
  - RDMA core: rdma-core-25.0 (distributed only) (no change)
  - Libfabric: libfabric-aws-1.9.0amzn1.1 (updated from libfabric-aws-1.8.1amzn1.3)
  - Open MPI: openmpi40-aws-4.0.2 (no change)
- Install Python 2.7.17 on CentOS 6 and set it as default through pyenv
- Install Ganglia from repository on Amazon Linux, Amazon Linux 2, CentOS 6 and CentOS 7
- Disable StrictHostKeyChecking for SSH client when target host is inside cluster VPC for all OSs except CentOS 6
- Pin Intel Python 2 and Intel Python 3 to version 2019.4
- Automatically disable ptrace protection on Ubuntu 18.04 and Ubuntu 16.04 compute nodes when EFA is enabled. This is required in order to use local memory for interprocess communications in Libfabric provider as mentioned here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa-start.html#efa-start-ptrace
- Packer version >= 1.4.0 is required for AMI creation
- Use version 5.2 of PyYAML for python 3 versions of 3.4 or earlier.
Bug Fixes
- Fix issue with slurmd daemon not being restarted correctly when a compute node is rebooted
- Fix errors causing Torque not able to locate jobs, setting server_name to fqdn on master node
- Fix Torque issue that was limiting the max number of running jobs to the max size of the cluster
- Fix OS validation depending on the configured scheduler
AWS ParallelCluster v2.5.1
We're excited to announce the release of AWS ParallelCluster 2.5.1.
Upgrade
How to upgrade?
sudo pip install --upgrade aws-parallelcluster
Enhancements
- Add --show-url flag to pcluster dcv connect command in order to generate a one-time URL that can be used to start a DCV session. This unblocks the usage of DCV when the browser cannot be launched automatically.
Changes
- Upgrade CUDA library to version 10.2.
- Using a Placement Group is not required anymore but highly recommended when enabling EFA.
- Increase default root volume size in Centos 6 AMI to 25GB.
- Increase the retention of CloudWatch logs produced when generating AWS Batch Docker images from 1 to 14 days.
- Increase the total time allowed to build Docker images from 20 minutes to 30 minutes. This is done to better deal with slow networking in China regions.
- Upgrade EFA installer to version 1.7.1:
  - Kernel module: efa-1.4.1
  - RDMA core: rdma-core-25.0
  - Libfabric: libfabric-aws-1.8.1amzn1.3
  - Open MPI: openmpi40-aws-4.0.2
Bug Fixes
- Fix installation of NVIDIA drivers on Ubuntu 18.
- Fix installation of CUDA toolkit on Centos 6.
- Fix invalid default value for spot_price.
- Fix issue that was preventing the cluster from being created in VPCs configured with multiple CIDR blocks.
- Correctly handle failures when retrieving ASG in pcluster instances command.
- Fix the default mount dir when a single EBS volume is specified through a dedicated ebs configuration section.
- Correctly handle failures when there is an invalid parameter in the aws config section.
- Fix a bug in pcluster delete that was causing the cli to exit with error when the cluster is successfully deleted.
- Exit with status code 1 if pcluster create fails to create a stack.
- Better handle the case of multiple or no network interfaces on FSX filesystems.
- Fix pcluster configure to retain default values from old config file.
- Fix bug in sqswatcher that was causing the daemon to fail when more than 100 DynamoDB tables are present in the cluster region.
- Fix installation of Munge on Amazon Linux, Centos 6, Centos 7 and Ubuntu 16.
AWS ParallelCluster v2.5.0
We're excited to announce the release of AWS ParallelCluster 2.5.0.
Upgrade
How to upgrade?
sudo pip install --upgrade aws-parallelcluster
Enhancements
- Add support for new OS: Ubuntu 18.04
- Add support for AWS Batch scheduler in China partition and in eu-north-1.
- Revamped pcluster configure command which now supports automated networking configuration.
- Add support for NICE DCV on Centos 7 to setup a graphical remote desktop session on the Master node.
- Add support for new EFA supported instances: c5n.metal, m5dn.24xlarge, m5n.24xlarge, r5dn.24xlarge, r5n.24xlarge
- Add support for scheduling with GPU options in Slurm. Currently supports the following GPU-related options: -G/--gpus, --gpus-per-task, --gpus-per-node, --gres=gpu, --cpus-per-gpu. GPU requirements are integrated into the scaling logic: the cluster will scale automatically to satisfy GPU/CPU requirements for pending jobs. When submitting GPU jobs, CPU/node/task information is not required but preferred in order to avoid ambiguity. If only GPU requirements are specified, cluster will scale up to the minimum number of nodes required to satisfy all GPU requirements.
- Add new cluster configuration option to automatically disable Hyperthreading (disable_hyperthreading = true)
- Install Intel Parallel Studio 2019.5 Runtime in Centos 7 when enable_intel_hpc_platform = true and share /opt/intel over NFS
- Additional EC2 IAM Policies can now be added to the role ParallelCluster automatically creates for cluster nodes by simply specifying additional_iam_policies in the cluster config.
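The GPU-only scaling rule described above (scale up to the minimum number of nodes that satisfies the GPU request) amounts to a ceiling division. The sketch below assumes a single job and a homogeneous compute fleet where every node exposes the same number of GPUs; the function name is hypothetical and the actual scaling daemons may account for more factors.

```python
import math


def min_nodes_for_gpu_request(gpus_requested, gpus_per_node):
    """Return the minimum number of nodes needed to provide the requested GPUs.

    Hypothetical helper for illustration; assumes all compute nodes
    have the same number of GPUs.
    """
    if gpus_requested < 0 or gpus_per_node <= 0:
        raise ValueError("gpus_requested must be >= 0 and gpus_per_node must be > 0")
    # Round up: a partially used node still has to be launched.
    return math.ceil(gpus_requested / gpus_per_node)
```

Under these assumptions, a job submitted with only a request for 9 GPUs on nodes with 4 GPUs each would need 3 nodes.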
Changes
- Ubuntu 14.04 is no longer supported
- Upgrade Intel MPI to version U5.
- Upgrade EFA Installer to version 1.7.0, this also upgrades Open MPI to 4.0.2.
- Upgrade NVIDIA driver to Tesla version 418.87.
- Upgrade CUDA library to version 10.1.
- Upgrade Slurm to version 19.05.3-2.
- Install EFA in China AMIs.
- Increase default EBS volume size from 17GB to 25GB
- FSx Lustre now supports new storage_capacity options 1,200 and 2,400 GiB
- Enable flock user_xattr noatime Lustre mount options by default everywhere and x-systemd.automount x-systemd.requires=lnet.service for systemd based systems.
- Increase the number of hosts that can be processed by scaling daemons in a single batch from 50 to 200. This improves the scaling time especially with increased ASG launch rates.
- Change default sshd config in order to disable X11 forwarding and update the list of supported ciphers.
- Increase faulty node termination timeout from 1 minute to 5 in order to give some additional time to the scheduler to recover when under heavy load.
- Extended pcluster createami command to specify the VPC and network settings when building the AMI.
- Support inline comments in config file
- Support Python 3.8 in pcluster CLI.
- Deprecate Python 2.6 support
- Add ClusterName tag to EC2 instances.
- Search for new available version only at pcluster create action.
- Enable sanity_check by default.
Bug Fixes
- Fix sanity check for custom ec2 role. Fixes #1241.
- Fix bug when using same subnet for both master and compute.
- Fix bug when ganglia is enabled ganglia urls are shown. Fixes #1322.
- Fix bug with awsbatch scheduler that prevented Multi-node jobs from running.
- Fix jobwatcher behaviour that was marking nodes locked by the nodewatcher as busy even if they had been removed already from the ASG Desired count. This was causing, in rare circumstances, a cluster overscaling.
- Fix bug that was causing failures in sqswatcher when ADD and REMOVE event for the same host are fetched together.
- Fix bug that was preventing nodes from mounting partitioned EBS volumes.
- Implement paginated calls in pcluster list.
- Fix bug when creating awsbatch cluster with name longer than 31 chars
- Fix a bug that led to ssh not working after ssh'ing into a compute node by ip address.