
Commit 0e310a6

Update per feedback
1 parent 6402f01 commit 0e310a6

File tree

1 file changed: +9 -8 lines changed


articles/batch/batch-automatic-scaling.md

Lines changed: 9 additions & 8 deletions
@@ -12,7 +12,7 @@ ms.service: batch
 ms.topic: article
 ms.tgt_pltfrm:
 ms.workload: multiple
-ms.date: 10/04/2019
+ms.date: 10/08/2019
 ms.author: lahugh
 ms.custom: H1Hack27Feb2017

@@ -67,6 +67,7 @@ maxNumberofVMs = 25;
 pendingTaskSamplePercent = $PendingTasks.GetSamplePercent(180 * TimeInterval_Second);
 pendingTaskSamples = pendingTaskSamplePercent < 70 ? startingNumberOfVMs : avg($PendingTasks.GetSample(180 * TimeInterval_Second));
 $TargetDedicatedNodes=min(maxNumberofVMs, pendingTaskSamples);
+$NodeDeallocationOption = taskcompletion;
 ```
 
 With this autoscale formula, the pool is initially created with a single VM. The `$PendingTasks` metric defines the number of tasks that are running or queued. The formula finds the average number of pending tasks in the last 180 seconds and sets the `$TargetDedicatedNodes` variable accordingly. The formula ensures that the target number of dedicated nodes never exceeds 25 VMs. As new tasks are submitted, the pool automatically grows. As tasks complete, VMs become free one by one and the autoscaling formula shrinks the pool.

@@ -79,6 +80,7 @@ This formula scales dedicated nodes, but can be modified to apply to scale low-p
 maxNumberofVMs = 25;
 $TargetDedicatedNodes = min(maxNumberofVMs, $PreemptedNodeCount.GetSample(180 * TimeInterval_Second));
 $TargetLowPriorityNodes = min(maxNumberofVMs , maxNumberofVMs - $TargetDedicatedNodes);
+$NodeDeallocationOption = taskcompletion;
 ```
 
 This example creates a pool that starts with 25 low-priority nodes. Every time a low-priority node is preempted, it is replaced with a dedicated node. As with the first example, the `maxNumberofVMs` variable prevents the pool from exceeding 25 VMs. This example is useful for taking advantage of low-priority VMs while also ensuring that only a fixed number of preemptions will occur for the lifetime of the pool.

@@ -100,7 +102,7 @@ You can get and set the values of these service-defined variables to manage the
 | --- | --- |
 | $TargetDedicatedNodes |The target number of dedicated compute nodes for the pool. The number of dedicated nodes is specified as a target because a pool may not always achieve the desired number of nodes. For example, if the target number of dedicated nodes is modified by an autoscale evaluation before the pool has reached the initial target, then the pool may not reach the target. <br /><br /> A pool in an account created with the Batch Service configuration may not achieve its target if the target exceeds a Batch account node or core quota. A pool in an account created with the User Subscription configuration may not achieve its target if the target exceeds the shared core quota for the subscription.|
 | $TargetLowPriorityNodes |The target number of low-priority compute nodes for the pool. The number of low-priority nodes is specified as a target because a pool may not always achieve the desired number of nodes. For example, if the target number of low-priority nodes is modified by an autoscale evaluation before the pool has reached the initial target, then the pool may not reach the target. A pool may also not achieve its target if the target exceeds a Batch account node or core quota. <br /><br /> For more information on low-priority compute nodes, see [Use low-priority VMs with Batch (Preview)](batch-low-pri-vms.md). |
-| $NodeDeallocationOption |The action that occurs when compute nodes are removed from a pool. Possible values are:<ul><li>**requeue**-- The default value. Terminates tasks immediately and puts them back on the job queue so that they are rescheduled.<li>**terminate**--Terminates tasks immediately and removes them from the job queue.<li>**taskcompletion**--Waits for currently running tasks to finish and then removes the node from the pool. This is the **recommended value** in most cases. <li>**retaineddata**--Waits for all the local task-retained data on the node to be cleaned up before removing the node from the pool.</ul> |
+| $NodeDeallocationOption |The action that occurs when compute nodes are removed from a pool. Possible values are:<ul><li>**requeue**--The default value. Terminates tasks immediately and puts them back on the job queue so that they are rescheduled. This action ensures the target number of nodes is reached as quickly as possible, but may be less efficient, because any running tasks are interrupted and must be restarted, wasting work they have already done.<li>**terminate**--Terminates tasks immediately and removes them from the job queue.<li>**taskcompletion**--Waits for currently running tasks to finish and then removes the node from the pool. Use this option to avoid tasks being interrupted and requeued, wasting work they have already done.<li>**retaineddata**--Waits for all the local task-retained data on the node to be cleaned up before removing the node from the pool.</ul> |
 
 You can get the value of these service-defined variables to make adjustments that are based on metrics from the Batch service:

@@ -331,8 +333,9 @@ You build an autoscale formula by forming statements that use the above componen
 First, let's define the requirements for our new autoscale formula. The formula should:
 
 1. Increase the target number of dedicated compute nodes in a pool if CPU usage is high.
-2. Decrease the target number of dedicated compute nodes in a pool when CPU usage is low.
-3. Always restrict the maximum number of dedicated nodes to 400.
+1. Decrease the target number of dedicated compute nodes in a pool when CPU usage is low.
+1. Always restrict the maximum number of dedicated nodes to 400.
+1. When reducing the number of nodes, do not remove nodes that are running tasks; if necessary, wait until tasks have finished to remove nodes.
 
 To increase the number of nodes during high CPU usage, define the statement that populates a user-defined variable (`$totalDedicatedNodes`) with a value that is 110 percent of the current target number of dedicated nodes, but only if the minimum average CPU usage during the last 10 minutes was above 70 percent. Otherwise, use the value for the current number of dedicated nodes.
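
Not part of this change, but for context: assembled from these requirements and the scale-up statement described above, the finished formula could look something like the sketch below. The scale-up values (110 percent of the current target when the minimum average CPU usage over the last 10 minutes exceeds 70 percent), the 400-node cap, and the `taskcompletion` option come from the text above; the scale-down values (90 percent of the current target when average CPU usage over 60 minutes falls below 20 percent) are illustrative assumptions rather than values defined by this commit.

```
// Scale up by 10 percent when the minimum average CPU usage
// over the last 10 minutes is above 70 percent.
$totalDedicatedNodes =
    (min($CPUPercent.GetSample(TimeInterval_Minute * 10)) > 0.7) ?
    ($CurrentDedicatedNodes * 1.1) : $CurrentDedicatedNodes;
// Scale down by 10 percent when the average CPU usage
// over the last 60 minutes is below 20 percent (illustrative thresholds).
$totalDedicatedNodes =
    (avg($CPUPercent.GetSample(TimeInterval_Minute * 60)) < 0.2) ?
    ($CurrentDedicatedNodes * 0.9) : $totalDedicatedNodes;
// Never exceed 400 dedicated nodes.
$TargetDedicatedNodes = min(400, $totalDedicatedNodes);
// When shrinking the pool, let running tasks finish before a node is removed.
$NodeDeallocationOption = taskcompletion;
```

Because the statements are evaluated in order, the 400-node cap and the deallocation option are applied after the scale-up and scale-down adjustments.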

@@ -654,6 +657,7 @@ $workHours = $curTime.hour >= 8 && $curTime.hour < 18;
 $isWeekday = $curTime.weekday >= 1 && $curTime.weekday <= 5;
 $isWorkingWeekdayHour = $workHours && $isWeekday;
 $TargetDedicatedNodes = $isWorkingWeekdayHour ? 20:10;
+$NodeDeallocationOption = taskcompletion;
 ```
 
 ### Example 2: Task-based adjustment

@@ -672,13 +676,10 @@ $targetVMs = $tasks > 0? $tasks:max(0, $TargetDedicatedNodes/2);
 // The pool size is capped at 20, if target VM value is more than that, set it
 // to 20. This value should be adjusted according to your use case.
 $TargetDedicatedNodes = max(0, min($targetVMs, 20));
-// Set node deallocation mode - keep nodes active only until tasks finish
+// Set node deallocation mode - let running tasks finish before removing a node
 $NodeDeallocationOption = taskcompletion;
 ```
 
-> [!NOTE]
-> It's highly recommended to use `$NodeDeallocationOption = taskcompletion`, which waits for currently running tasks to finish, and then removes the node from the pool. If you use the default value, `$NodeDeallocationOption = requeue`, the task terminates immediately, wasting what has been run so far, and puts nodes back on the job queue so that they are rescheduled. SFor more information, see the [Variables](#variables) table.
-
 ### Example 3: Accounting for parallel tasks
 
 This example adjusts the pool size based on the number of tasks. This formula also takes into account the [MaxTasksPerComputeNode][net_maxtasks] value that has been set for the pool. This approach is useful in situations where [parallel task execution](batch-parallel-node-tasks.md) has been enabled on your pool.
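
Not part of this change, but for context: a formula for this scenario could look something like the sketch below. It assumes, purely for illustration, that the pool's MaxTasksPerComputeNode is 4 and that the pool is capped at 20 dedicated nodes; adjust both for your own pool. The sample-availability check mirrors the `GetSamplePercent` pattern used in the first example formula above.

```
// Use the average $ActiveTasks count over the last 15 minutes if at least
// 70 percent of the samples are available; otherwise use the latest sample.
$samplePercent = $ActiveTasks.GetSamplePercent(TimeInterval_Minute * 15);
$tasks = $samplePercent < 70 ?
    max(0, $ActiveTasks.GetSample(1)) :
    avg($ActiveTasks.GetSample(TimeInterval_Minute * 15));
// Each node supplies 4 task slots (assumes MaxTasksPerComputeNode = 4).
$taskSlots = $TargetDedicatedNodes * 4;
// Add roughly one node for every 4 tasks that don't currently have a slot.
$extraVMs = (($tasks - $taskSlots) + 3) / 4;
$targetVMs = $TargetDedicatedNodes + $extraVMs;
// Cap the pool at 20 dedicated nodes; adjust for your use case.
$TargetDedicatedNodes = max(0, min($targetVMs, 20));
// Let running tasks finish before a node is removed from the pool.
$NodeDeallocationOption = taskcompletion;
```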
