|
4 | 4 | AWS ParallelCluster Auto Scaling |
5 | 5 | ================================ |
6 | 6 |
|
7 | | -.. image:: images/as-basic-diagram.png |
| 7 | +The auto scaling strategy described here applies to HPC clusters deployed with one of the |
| 8 | +supported traditional job schedulers: SGE, Slurm or Torque. |
| 9 | +In these cases AWS ParallelCluster directly implements the scaling capabilities by managing |
| 10 | +the `Auto Scaling Group`_ (ASG) of the compute nodes and changing the scheduler configuration |
| 11 | +accordingly. |
| 12 | +For HPC clusters based on AWS Batch, ParallelCluster relies on the elastic scaling capabilities |
| 13 | +provided by the AWS-managed job scheduler. |
8 | 14 |
|
9 | 15 | Clusters deployed with AWS ParallelCluster are elastic in several ways. The first is by |
10 | 16 | simply setting the ``initial_queue_size`` and ``max_queue_size`` parameters of the cluster |
11 | | -settings. The ``initial_queue_size`` sets minimum size value of the ComputeFleet |
12 | | -`Auto Scaling Group`_ (ASG) and also the desired capacity value . The ``max_queue_size`` |
13 | | -sets maximum size value of the ComputeFleet ASG. |
| 17 | +settings. The ``initial_queue_size`` sets the minimum size value of the ComputeFleet ASG and also |
| 18 | +the desired capacity value. |
| 19 | +The ``max_queue_size`` sets the maximum size value of the ComputeFleet ASG. |
| 20 | + |
| 21 | +.. image:: images/as-basic-diagram.png |
14 | 22 |
|
15 | 23 | Scaling Up |
16 | 24 | ========== |
17 | 25 |
|
18 | 26 | Every minute, a process called jobwatcher_ runs on the master instance and evaluates |
19 | | -the current number of instances requested in the queue. If this number is greater than the |
20 | | -current autoscaling desired, it adds more instances. If you submit more jobs, |
21 | | -the queue will get re-evaluated and the ASG updated up to the ``max_queue_size``. |
| 27 | +the current number of instances required by the pending jobs in the queue. |
| 28 | +If the sum of busy nodes and requested nodes is greater than the current desired capacity of the ASG, |
| 29 | +it adds more instances. |
| 30 | +If you submit more jobs, the queue is re-evaluated and the ASG is updated, up to ``max_queue_size``. |
| 31 | + |
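The scale-up check described above can be sketched as follows. This is a minimal illustration, not the jobwatcher source: the function name and its arguments are hypothetical, and capping the result at ``max_queue_size`` is assumed from the sentence about the ASG being updated up to that value.

```python
def new_desired_capacity(busy_nodes, requested_nodes, current_desired, max_queue_size):
    """Return the desired capacity the ASG should have after one evaluation.

    Hypothetical helper: the names are illustrative and do not come from
    the jobwatcher code.
    """
    needed = busy_nodes + requested_nodes
    if needed > current_desired:
        # Scale up, but never beyond the configured maximum.
        return min(needed, max_queue_size)
    # Nothing to add; scaling down is handled separately on each compute node.
    return current_desired
```

With 2 busy nodes, 3 requested nodes and a desired capacity of 2, the sketch returns 5 when ``max_queue_size`` is 10 and 4 when it is 4.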
| 32 | +With SGE, each job requires a number of slots to run (one slot corresponds to one processing unit, e.g. a vCPU). |
| 33 | +When evaluating the number of instances required to serve the currently pending jobs, the jobwatcher |
| 34 | +divides the total number of requested slots by the capacity of a single compute node. |
| 35 | +The capacity of a compute node, that is, the number of available vCPUs, depends on the EC2 instance type selected |
| 36 | +in the cluster configuration. |
| 37 | + |
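As a rough sketch of the SGE calculation, the total number of requested slots is divided by the per-node capacity. Rounding up to a whole instance is an assumption here (the text only says "divides"), and the function name is hypothetical.

```python
def sge_instances_required(requested_slots, slots_per_node):
    """Estimate the instances needed to serve pending SGE jobs.

    requested_slots: slots requested by each pending job.
    slots_per_node: vCPUs of the selected EC2 instance type.
    Assumption: a fractional result is rounded up to a whole instance.
    """
    if slots_per_node <= 0:
        raise ValueError("slots_per_node must be positive")
    total = sum(requested_slots)
    # Integer ceiling division, avoids floating point rounding.
    return -(-total // slots_per_node)
```

For instance, three pending jobs requesting 4, 4 and 2 slots on 8-vCPU nodes total 10 slots, which the sketch rounds up to 2 instances.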
| 38 | +With the Slurm and Torque schedulers, each job can require both a number of nodes and a number of slots per node. |
| 39 | +The jobwatcher takes into account the request of each job and determines the number of compute nodes needed to fulfill |
| 40 | +the new computational requirements. |
| 41 | +For example, assume a cluster with c5.2xlarge (8 vCPUs) as the compute instance type and three pending jobs |
| 42 | +with the following requirements: job1 needs 2 nodes with 4 slots each, job2 needs 3 nodes with 2 slots each and job3 needs 1 node with 4 slots. |
| 43 | +The jobwatcher will request three new compute instances from the ASG to serve the three jobs. |
| 44 | + |
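The example above can be reproduced with a simple first-fit packing. The text does not specify which algorithm the jobwatcher actually uses, so the sketch below is only one plausible way to arrive at the same count; the function name is hypothetical, and it assumes that each node requested by a job must be a distinct instance and that new instances start empty.

```python
def nodes_required(jobs, slots_per_node):
    """Count the new nodes needed for pending jobs using first-fit packing.

    jobs: list of (nodes, slots_per_job_node) tuples, one per pending job.
    slots_per_node: vCPUs of the selected EC2 instance type.
    Assumption: first-fit is illustrative, not the jobwatcher's real logic.
    """
    free = []  # free slots left on each new node allocated so far
    for nodes, slots in jobs:
        if slots > slots_per_node:
            raise ValueError("job requests more slots per node than a node has")
        placed = 0
        # Reuse already allocated nodes, each at most once per job.
        for i in range(len(free)):
            if placed == nodes:
                break
            if free[i] >= slots:
                free[i] -= slots
                placed += 1
        # Allocate fresh nodes for the part of the job that did not fit.
        for _ in range(nodes - placed):
            free.append(slots_per_node - slots)
    return len(free)


# job1: 2 nodes x 4 slots, job2: 3 nodes x 2 slots, job3: 1 node x 4 slots
print(nodes_required([(2, 4), (3, 2), (1, 4)], 8))  # -> 3
```

Because the *Current limitation* below applies, the sketch packs jobs only onto new nodes and ignores free slots on nodes that are already running jobs.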
| 45 | +*Current limitation*: the auto scale-up logic does not consider partially loaded busy nodes, i.e. each node running |
| 46 | +a job is considered busy even if there are empty slots. |
22 | 47 |
|
23 | 48 | Scaling Down |
24 | 49 | ============ |
25 | 50 |
|
26 | | -On each compute node, a process called nodewatcher_ runs and evaluates the |
27 | | -work left in the queue. If an instance has had no jobs for longer than ``scaledown_idletime`` |
28 | | -(which defaults to 10 minutes), the instance is terminated. |
| 51 | +On each compute node, a process called nodewatcher_ runs and evaluates the idle time of |
| 52 | +the node. If an instance has had no jobs for longer than ``scaledown_idletime`` |
| 53 | +(which defaults to 10 minutes) and there are currently no pending jobs in the cluster, |
| 54 | +the instance is terminated. |
29 | 55 |
|
30 | 56 | It specifically calls the TerminateInstanceInAutoScalingGroup_ API, |
31 | 57 | which will remove an instance as long as the size of the ASG is at least the |
|