Commit 8f5359f

Merge Release 2.4.1
2 parents f4d9378 + 22f26e0 commit 8f5359f

66 files changed, +2110 -1454 lines changed

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 2 additions & 1 deletion
@@ -1,6 +1,7 @@
+**Please See** [Git Pull Request Instructions](https://github.com/aws/aws-parallelcluster/wiki/Git-Pull-Request-Instructions)
+
 *Issue #, if available:*
 
 *Description of changes:*
 
-
 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

CHANGELOG.rst

Lines changed: 48 additions & 0 deletions
@@ -2,6 +2,54 @@
 CHANGELOG
 =========
 
+2.4.1
+=====
+
+**ENHANCEMENTS**
+
+* Add support for ap-east-1 region (Hong Kong)
+* Add possibility to specify instance type to use when building custom AMIs with ``pcluster createami``
+* Speed up cluster creation by having compute nodes starting together with master node
+* Enable ASG CloudWatch metrics for the ASG managing compute nodes
+* Install Intel MPI 2019u4 on Amazon Linux, Centos 7 and Ubuntu 1604
+* Upgrade Elastic Fabric Adapter (EFA) to version 1.4.1 that supports Intel MPI
+* Run all node daemons and cookbook recipes in isolated Python virtualenvs. This allows our code to always run with the
+  required Python dependencies and solves all conflicts and runtime failures that were being caused by user packages
+  installed in the system Python
+
+* Torque:
+
+  * Process nodes added to or removed from the cluster in batches in order to speed up cluster scaling
+  * Scale up only if required CPU/nodes can be satisfied
+  * Scale down if pending jobs have unsatisfiable CPU/nodes requirements
+  * Add support for jobs in hold/suspended state (this includes job dependencies)
+  * Automatically terminate and replace faulty or unresponsive compute nodes
+  * Add retries in case of failures when adding or removing nodes
+  * Add support for ncpus reservation and multi nodes resource allocation (e.g. -l nodes=2:ppn=3+3:ppn=6)
+  * Optimized Torque global configuration to faster react to the dynamic cluster scaling
+
+**CHANGES**
+
+* Update EFA installer to a new version, note this changes the location of ``mpicc`` and ``mpirun``.
+  To avoid breaking existing code, we recommend you use the modulefile ``module load openmpi`` and ``which mpicc``
+  for anything that requires the full path
+* Eliminate Launch Configuration and use Launch Templates in all the regions
+* Torque: upgrade to version 6.1.2
+* Run all ParallelCluster daemons with Python 3.6 in a virtualenv. Daemons code now supports Python >= 3.5
+
+**BUG FIXES**
+
+* Fix issue with sanity check at creation time that was preventing clusters from being created in private subnets
+* Fix pcluster configure when relative config path is used
+* Make FSx Substack depend on ComputeSecurityGroupIngress to keep FSx from trying to create prior to the SG
+  allowing traffic within itself
+* Restore correct value for ``filehandle_limit`` that was getting reset when setting ``memory_limit`` for EFA
+* Torque: fix compute nodes locking mechanism to prevent job scheduling on nodes being terminated
+* Restore logic that was automatically adding compute nodes identity to SSH ``known_hosts`` file
+* Slurm: fix issue that was causing the ParallelCluster daemons to fail when the cluster is stopped and an empty compute nodes file
+  is imported in Slurm config
+
+
 2.4.0
 =====
 
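
Note on the Torque entries above: the changelog mentions multi-node requests such as `-l nodes=2:ppn=3+3:ppn=6` and scaling up only when the requested CPU/nodes can be satisfied. The following is a minimal sketch of how such a request could be interpreted, not the code shipped in this commit. The function names `parse_nodes_spec` and `can_satisfy` and the parameters `max_nodes` and `cpus_per_node` are hypothetical, and the sketch assumes each `+`-separated chunk starts with a numeric node count rather than a hostname.

```python
def parse_nodes_spec(spec):
    """Parse the value of a Torque ``-l nodes=`` request into (count, ppn) pairs."""
    chunks = []
    for chunk in spec.split("+"):
        fields = chunk.split(":")
        count = int(fields[0])  # assumption: numeric node count, not a hostname
        ppn = 1  # assumption: one processor per node when ppn is omitted
        for field in fields[1:]:
            if field.startswith("ppn="):
                ppn = int(field[len("ppn="):])
        chunks.append((count, ppn))
    return chunks


def can_satisfy(chunks, max_nodes, cpus_per_node):
    """Return True if the request fits a cluster of identical compute nodes."""
    total_nodes = sum(count for count, _ in chunks)
    widest_ppn = max(ppn for _, ppn in chunks)
    return total_nodes <= max_nodes and widest_ppn <= cpus_per_node


# The example from the changelog: 2 nodes with 3 CPUs each plus 3 nodes with 6 CPUs each.
request = parse_nodes_spec("2:ppn=3+3:ppn=6")
```

Under these assumptions the request needs five nodes, and at least one chunk needs six CPUs per node, so a queue capped at four nodes or built from four-CPU instances could never run it.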

README.rst

Lines changed: 8 additions & 6 deletions
@@ -83,22 +83,24 @@ You can view the running compute hosts:
 
 For more information on any of these steps see the `Getting Started Guide`_.
 
-.. _`Getting Started Guide`: https://aws-parallelcluster.readthedocs.io/en/latest/getting_started.html
+.. _`Getting Started Guide`: https://docs.aws.amazon.com/parallelcluster/latest/ug/getting_started.html
 
 Documentation
 -------------
 
-Documentation is part of the project and is published to -
-https://aws-parallelcluster.readthedocs.io/. Of most interest to new users is
-the Getting Started Guide -
-https://aws-parallelcluster.readthedocs.io/en/latest/getting_started.html.
+We've been working hard to greatly improve the `Documentation <https://docs.aws.amazon.com/parallelcluster/latest/ug/>`_, it's now published in 10 languages, one of the many benefits of being hosted on AWS Docs. Of most interest to new users is
+the `Getting Started Guide <https://docs.aws.amazon.com/parallelcluster/latest/ug/getting_started.html>`_.
+
+If you have changes you would like to see in the docs, please either submit feedback using the feedback link at the bottom
+of each page or create an issue or pull request for the project at:
+https://github.com/awsdocs/aws-parallelcluster-user-guide.
 
 Issues
 ------
 
 Please open a GitHub issue for any feedback or issues:
 https://github.com/aws/aws-parallelcluster. There is also an active AWS
-HPC forum which may be helpful:https://forums.aws.amazon.com/forum.jspa?forumID=192.
+HPC forum which may be helpful: https://forums.aws.amazon.com/forum.jspa?forumID=192.
 
 Changes
 -------

amis.txt

Lines changed: 97 additions & 97 deletions
@@ -1,102 +1,102 @@
 # alinux
-ap-northeast-1: ami-0dcc18768374b4441
-ap-northeast-2: ami-022e7c66ccb807c9f
-ap-northeast-3: ami-04402be7b85999df8
-ap-south-1: ami-0a14b1f0e7427a4bb
-ap-southeast-1: ami-02079735c20c1ac4e
-ap-southeast-2: ami-0c65952cdec26ae39
-ca-central-1: ami-01f28f8381746746f
-cn-north-1: ami-0da67c26ce2e8d111
-cn-northwest-1: ami-03dc8f759de9de690
-eu-central-1: ami-0ff6d2a86b9199e82
-eu-north-1: ami-0cb08caa10d113ed7
-eu-west-1: ami-0b5c32b12b9c340d0
-eu-west-2: ami-0c218c2aaa7185f03
-eu-west-3: ami-011e0eee21d52f23e
-sa-east-1: ami-0d154ae55458941fd
-us-east-1: ami-0d130bdfab2037f8a
-us-east-2: ami-00d2a10466c577ac7
-us-gov-east-1: ami-0f5003922daf22962
-us-gov-west-1: ami-ba83fbdb
-us-west-1: ami-0b6f7961ee845966e
-us-west-2: ami-0d611d90619419e93
+ap-east-1: ami-0548157406b20efd7
+ap-northeast-1: ami-0266f3876f58f4c10
+ap-northeast-2: ami-0b83279e099fee532
+ap-south-1: ami-08d877dddd63d4f11
+ap-southeast-1: ami-0797836c2582f62b3
+ap-southeast-2: ami-097287dbf20f32cd2
+ca-central-1: ami-00695df58bfe70532
+cn-north-1: ami-0bf01904468ac34e0
+cn-northwest-1: ami-07ce7fae883830295
+eu-central-1: ami-0ae496b08133b7003
+eu-north-1: ami-006772a1e3158c024
+eu-west-1: ami-03112372c8bd7886e
+eu-west-2: ami-0f2ed960e7413b152
+eu-west-3: ami-0feb90a4cf119551f
+sa-east-1: ami-02ce797ed2bac903a
+us-east-1: ami-0fd18b144da8357b7
+us-east-2: ami-0257f6012767b54c9
+us-gov-east-1: ami-002752c6ec611554d
+us-gov-west-1: ami-527e3e33
+us-west-1: ami-03a203cbdfe6ef914
+us-west-2: ami-0340aea5e9e9e5202
 # centos6
-ap-northeast-1: ami-086781b933db101a5
-ap-northeast-2: ami-07d646c87d889d816
-ap-northeast-3: ami-082ece6e5fe8f6fd1
-ap-south-1: ami-02389426198baf430
-ap-southeast-1: ami-02105387481bd0ad0
-ap-southeast-2: ami-0050fad9761b3957c
-ca-central-1: ami-0e70755a47200df23
-eu-central-1: ami-03979ebb9cfee2ccc
-eu-north-1: ami-085a9ecbf9f64f65b
-eu-west-1: ami-070ba56e38a744df5
-eu-west-2: ami-08553013e6e986028
-eu-west-3: ami-0afff5bc147c847e0
-sa-east-1: ami-0635a9bdc378fe67f
-us-east-1: ami-091f37e900368fe1a
-us-east-2: ami-055404b3df678da86
-us-west-1: ami-0e438402399c457d7
-us-west-2: ami-0651b7e7cfde4b3a0
+ap-east-1: ami-08de06c8c25c4e483
+ap-northeast-1: ami-0fb1e620a6e6c7c63
+ap-northeast-2: ami-083dda363f440b5f3
+ap-south-1: ami-0a19d85caae09e69b
+ap-southeast-1: ami-04b147081e72b8141
+ap-southeast-2: ami-0cc13227daec10928
+ca-central-1: ami-0e584b1dc9d90cfe3
+eu-central-1: ami-047fc8e8af243d384
+eu-north-1: ami-07413aa597232ff9e
+eu-west-1: ami-0ef1bf4281c1c4604
+eu-west-2: ami-0097ab9ba306ca46b
+eu-west-3: ami-0b941e4e71f296ca7
+sa-east-1: ami-0505ed5c5ad56d04b
+us-east-1: ami-016392fa0b61bde58
+us-east-2: ami-000a7976d7539e448
+us-west-1: ami-0a25f5e16aafd09d1
+us-west-2: ami-0951110bddb6944b0
 # centos7
-ap-northeast-1: ami-09bae677f8f58842d
-ap-northeast-2: ami-0eeb6c96d0e6c2d90
-ap-northeast-3: ami-084c0dbc04f722758
-ap-south-1: ami-031f8f67a53de53fe
-ap-southeast-1: ami-041ca5c2f5b748966
-ap-southeast-2: ami-06c7f5584ecfcac3a
-ca-central-1: ami-0afc2ea67b3963398
-eu-central-1: ami-0205eaef48a9fc97a
-eu-north-1: ami-0420576e18a5fcb7c
-eu-west-1: ami-0f67868de5be7b0b3
-eu-west-2: ami-057fa1a5314e3c414
-eu-west-3: ami-05b2808c2dc4fb82c
-sa-east-1: ami-0da1262e3c5d9af72
-us-east-1: ami-031eb9c5390c0f8f6
-us-east-2: ami-0050bd80a1cecfe37
-us-west-1: ami-09bd008b253048b80
-us-west-2: ami-003da28849bc413f5
+ap-east-1: ami-0b8dbcf754d6b1a15
+ap-northeast-1: ami-075158d05e7ffc090
+ap-northeast-2: ami-03d158fde32bf5c43
+ap-south-1: ami-0256ac397ea3738d9
+ap-southeast-1: ami-0768c2e0ebf2b048b
+ap-southeast-2: ami-0d4bc69b138616534
+ca-central-1: ami-03f172775e62c78ef
+eu-central-1: ami-009cd1bd82fcb612e
+eu-north-1: ami-018a7d07256217bd0
+eu-west-1: ami-0a47a5dc8fee55323
+eu-west-2: ami-0697036b2287f0afc
+eu-west-3: ami-017a92730499673fd
+sa-east-1: ami-00e378be239ed59f4
+us-east-1: ami-0a4d7e08ea5178c02
+us-east-2: ami-0a3b8f19ab7333a80
+us-west-1: ami-0b977098af0dd77e3
+us-west-2: ami-0d6c93513ba5d3734
 # ubuntu1404
-ap-northeast-1: ami-0939e3e1030d4f7d2
-ap-northeast-2: ami-0481c6b023e2328b4
-ap-northeast-3: ami-0a535e1d0bb7bc502
-ap-south-1: ami-000e99acc047832ae
-ap-southeast-1: ami-09ca9a6a8fee71ba5
-ap-southeast-2: ami-09646cc49a932a37e
-ca-central-1: ami-06ac5db73837bc364
-cn-north-1: ami-07e16a5709c99f963
-cn-northwest-1: ami-05348579489ba3673
-eu-central-1: ami-0032889c720d364dc
-eu-north-1: ami-0976908358f0bfa01
-eu-west-1: ami-0f5c65a609ad3afb4
-eu-west-2: ami-08c2d96c2805037e7
-eu-west-3: ami-0f6cd6ac9be8f2b32
-sa-east-1: ami-0d0da341da4802af9
-us-east-1: ami-017bfe181606779d8
-us-east-2: ami-043eb896e1bb2b948
-us-gov-east-1: ami-060ced48ab370aadf
-us-gov-west-1: ami-32f98153
-us-west-1: ami-0d48f8a9d5735efde
-us-west-2: ami-0169da6ccb6347f50
+ap-east-1: ami-0059b45a57f781c19
+ap-northeast-1: ami-07531e6831cd3b73e
+ap-northeast-2: ami-0ee26fe2d03734d6c
+ap-south-1: ami-047d504b1c9897a71
+ap-southeast-1: ami-0497ef9ffb2284737
+ap-southeast-2: ami-07c920b78a691f3de
+ca-central-1: ami-0a25d23a0f47e04df
+cn-north-1: ami-0d20db2e290c07266
+cn-northwest-1: ami-091c2a6f3a16fe374
+eu-central-1: ami-0cd4c9af30dce9c71
+eu-north-1: ami-02a5769e4d5f5c4c9
+eu-west-1: ami-00aeb9a13213998a0
+eu-west-2: ami-0be08f7b993cf51e5
+eu-west-3: ami-0dfb94ca6be715d8c
+sa-east-1: ami-008eaaf7ed3b81e00
+us-east-1: ami-006da8413e239334a
+us-east-2: ami-0a1882523d4df3e24
+us-gov-east-1: ami-04d31bc443d9b3c36
+us-gov-west-1: ami-8b7c3cea
+us-west-1: ami-02a1eceaa0c83ee27
+us-west-2: ami-0b1e995e9452b4050
 # ubuntu1604
-ap-northeast-1: ami-06b328a6ee03ccdf4
-ap-northeast-2: ami-0179e2707f709f813
-ap-northeast-3: ami-0c9b72bae5efc9f61
-ap-south-1: ami-0f21d1eb3339ebd6a
-ap-southeast-1: ami-01899e9a659eb2267
-ap-southeast-2: ami-049c81a79d55b2c8a
-ca-central-1: ami-0b8928a1f643684eb
-cn-north-1: ami-0ae967dc97d5eb57a
-cn-northwest-1: ami-0ba0b1ed49ce7b1b1
-eu-central-1: ami-002422c65a5bb1af8
-eu-north-1: ami-0d3c7ce730c73ab00
-eu-west-1: ami-00328873639859269
-eu-west-2: ami-0c1de72c6acf4b187
-eu-west-3: ami-090d577bb6d08e95b
-sa-east-1: ami-08df8912b098a3f42
-us-east-1: ami-08e1d33a6a64499de
-us-east-2: ami-0219fdb6f47395d88
-us-gov-east-1: ami-0af2c8e5bf3c334b0
-us-gov-west-1: ami-7b85fd1a
-us-west-1: ami-066818f6a6be06fb5
-us-west-2: ami-07122cb5a96b7fee9
+ap-east-1: ami-0eacbde87adcd79a4
+ap-northeast-1: ami-082d16fe36ad64a5d
+ap-northeast-2: ami-0603f6bfdaf0520b9
+ap-south-1: ami-0e4af994480d249c6
+ap-southeast-1: ami-0e51553a3f083c9d3
+ap-southeast-2: ami-0404f148f2106e206
+ca-central-1: ami-079729e8f44ea4e33
+cn-north-1: ami-0f71072a6c2f049fc
+cn-northwest-1: ami-0625b09f99c971e40
+eu-central-1: ami-054fd0d64e09a12d5
+eu-north-1: ami-08984e346e48bc46f
+eu-west-1: ami-0c5c2481e10335e90
+eu-west-2: ami-047a75cbcc2756dda
+eu-west-3: ami-0b26c8b0857c0722d
+sa-east-1: ami-0e654e24368bf23f5
+us-east-1: ami-0c535eb8c5a80b962
+us-east-2: ami-00fb6e36bb37b662e
+us-gov-east-1: ami-0a4c82eb37facd766
+us-gov-west-1: ami-777b3b16
+us-west-1: ami-0bb311ef404c8a54b
+us-west-2: ami-097b7aae68846a39a
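
Note on the file layout above: `amis.txt` groups `region: ami-id` pairs under `# <os>` comment headers. A small sketch of reading that layout into a nested dictionary follows, assuming the file contains only those two kinds of lines plus optional blanks. The helper name `parse_amis` is hypothetical and is not taken from the ParallelCluster code base; the sample text reuses a few of the new entries from this hunk.

```python
def parse_amis(text):
    """Turn '# os' headers and 'region: ami-id' lines into {os: {region: ami_id}}."""
    amis = {}
    current_os = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # A comment header starts a new OS section.
            current_os = line.lstrip("#").strip()
            amis[current_os] = {}
        elif current_os is not None:
            region, ami_id = (part.strip() for part in line.split(":", 1))
            amis[current_os][region] = ami_id
    return amis


sample = """# alinux
ap-east-1: ami-0548157406b20efd7
us-east-1: ami-0fd18b144da8357b7
# centos6
ap-east-1: ami-08de06c8c25c4e483
"""
amis = parse_amis(sample)
```

With a mapping like this, a check that every OS section lists the newly added `ap-east-1` region becomes a simple membership test per section.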

cli/pcluster/cfnconfig.py

Lines changed: 0 additions & 8 deletions
@@ -360,12 +360,6 @@ def __init_vpc_parameters(self):
                 "VPC section [%s] used in [%s] section is not defined" % (vpc_section, self.__cluster_section)
             )
 
-        # Check that cidr and public ips are not both set
-        cidr_value = self.__config.get(vpc_section, "compute_subnet_cidr", fallback=None)
-        public_ips = self.__config.getboolean(vpc_section, "use_public_ips", fallback=True)
-        if self.__sanity_check:
-            ResourceValidator.validate_vpc_coherence(cidr_value, public_ips)
-
     def __check_account_capacity(self):
         """Try to launch the requested number of instances to verify Account limits."""
         if self.parameters.get("Scheduler") == "awsbatch" or self.parameters.get("ClusterType", "ondemand") == "spot":
@@ -514,14 +508,12 @@ def __init_cluster_parameters(self):
             post_install_args=("PostInstallArgs", None),
             s3_read_resource=("S3ReadResource", None),
             s3_read_write_resource=("S3ReadWriteResource", None),
-            tenancy=("Tenancy", None),
             master_root_volume_size=("MasterRootVolumeSize", None),
             compute_root_volume_size=("ComputeRootVolumeSize", None),
             base_os=("BaseOS", None),
             ec2_iam_role=("EC2IAMRoleName", "EC2IAMRoleName"),
             extra_json=("ExtraJson", None),
             custom_chef_cookbook=("CustomChefCookbook", None),
-            custom_chef_runlist=("CustomChefRunList", None),
             additional_cfn_template=("AdditionalCfnTemplate", None),
             custom_awsbatch_template_url=("CustomAWSBatchTemplateURL", None),
         )

cli/pcluster/cli.py

Lines changed: 7 additions & 0 deletions
@@ -313,6 +313,13 @@ def _get_parser():
         help="Specifies the OS of the base AMI. "
         "Valid options are: alinux, ubuntu1404, ubuntu1604, centos6, centos7.",
     )
+    pami.add_argument(
+        "-i",
+        "--instance-type",
+        dest="instance_type",
+        default="t2.xlarge",
+        help="Sets instance type to build the ami on. Defaults to t2.xlarge.",
+    )
     pami.add_argument(
         "-ap",
         "--ami-name-prefix",

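Note on the new `createami` option: the hunk above registers `-i` / `--instance-type` with a default of `t2.xlarge`. The sketch below rebuilds only that one option on a stripped-down parser to show how the default and an override behave. The function `build_parser` and the sample value `c5.xlarge` are illustrative assumptions, and the real `pcluster createami` subcommand defines other arguments that are omitted here.

```python
import argparse


def build_parser():
    """Build a reduced parser that carries only the new createami option."""
    parser = argparse.ArgumentParser(prog="pcluster")
    subparsers = parser.add_subparsers(dest="command")
    pami = subparsers.add_parser("createami")
    pami.add_argument(
        "-i",
        "--instance-type",
        dest="instance_type",
        default="t2.xlarge",
        help="Sets instance type to build the ami on. Defaults to t2.xlarge.",
    )
    return parser


parser = build_parser()
default_args = parser.parse_args(["createami"])
custom_args = parser.parse_args(["createami", "-i", "c5.xlarge"])
```

Because the option carries a default, existing invocations that omit `-i` would keep building on `t2.xlarge`.
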
cli/pcluster/config_sanity.py

Lines changed: 3 additions & 13 deletions
@@ -46,17 +46,6 @@ def __get_partition(self):
             return "aws-us-gov"
         return "aws"
 
-    @staticmethod
-    def validate_vpc_coherence(cidr_value, public_ip):
-        """
-        Check that cidr_value and public_ip parameters are not conflicting.
-
-        :param cidr_value: the value of compute_subnet_cidr set by the user (default should be None)
-        :param public_ip: the value of use_public_ips set by the user (default should be True)
-        """
-        if cidr_value and public_ip is False:
-            ResourceValidator.__fail("VPC COHERENCE", "compute_subnet_cidr needs use_public_ips to be true")
-
     @staticmethod
     def __check_sg_rules_for_port(rule, port_to_check):
         """
@@ -355,7 +344,8 @@ def validate(self, resource_type, resource_value): # noqa: C901 FIXME
                     ),
                     (
                         ["cloudformation:DescribeStacks"],
-                        "arn:%s:cloudformation:%s:%s:stack/parallelcluster-*" % (partition, self.region, account_id),
+                        ["cloudformation:DescribeStackResource"],
+                        "arn:%s:cloudformation:%s:%s:stack/parallelcluster-*/*" % (partition, self.region, account_id),
                     ),
                     (["s3:GetObject"], "arn:%s:s3:::%s-aws-parallelcluster/*" % (partition, self.region)),
                     (["sqs:ListQueues"], "*"),
@@ -438,7 +428,7 @@ def validate(self, resource_type, resource_value): # noqa: C901 FIXME
                 self.__fail(resource_type, e.response.get("Error").get("Message"))
         # EC2 Placement Group
         elif resource_type == "EC2PlacementGroup":
-            if resource_value == "DYNAMIC":
+            if resource_value == "DYNAMIC" or resource_value == "NONE":
                 pass
             else:
                 try:

cli/pcluster/easyconfig.py

Lines changed: 2 additions & 1 deletion
@@ -228,7 +228,8 @@ def configure(args): # noqa: C901 FIXME!!!
     # ensure that the directory for the config file exists (because
     # ~/.parallelcluster is likely not to exist on first usage)
     try:
-        os.makedirs(os.path.dirname(config_file))
+        config_folder = os.path.dirname(config_file) or "."
+        os.makedirs(config_folder)
     except OSError as e:
         if e.errno != errno.EEXIST:
             raise  # can safely ignore EEXISTS for this purpose...
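
Note on why the `or "."` fallback matters: for a bare relative path such as `pcluster.config`, `os.path.dirname` returns an empty string, and `os.makedirs("")` fails with an error other than `EEXIST`, so the original handler re-raised it. The sketch below wraps the patched logic in a hypothetical helper, `ensure_config_dir`, to show both the relative and the nested case. The helper name and the temporary paths are assumptions made for illustration.

```python
import errno
import os
import tempfile


def ensure_config_dir(config_file):
    """Create the directory that holds config_file, tolerating an existing one."""
    # dirname() is "" for a bare file name; fall back to the current directory.
    config_folder = os.path.dirname(config_file) or "."
    try:
        os.makedirs(config_folder)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise  # only "already exists" is safe to ignore
    return config_folder


# Relative file name with no directory component: resolves to the current directory.
relative_folder = ensure_config_dir("pcluster.config")

# Nested path under a temporary directory: the missing folder gets created.
tmp_root = tempfile.mkdtemp()
nested_folder = ensure_config_dir(os.path.join(tmp_root, "conf", "pcluster.config"))
```

This matches the changelog entry "Fix pcluster configure when relative config path is used".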
