Commit 3d7f870

Merge pull request #49602 from dlepow/manytasks
Batch: Submit many tasks
2 parents ee737f9 + 091d025 commit 3d7f870

2 files changed

+214
-0
lines changed

articles/batch/TOC.yml

Lines changed: 2 additions & 0 deletions
@@ -93,6 +93,8 @@
     href: batch-task-dependencies.md
   - name: User accounts for running tasks
     href: batch-user-accounts.md
+  - name: Submit a large number of tasks
+    href: large-number-tasks.md
   - name: Persist job and task output
     href: batch-task-output.md
     items:

articles/batch/large-number-tasks.md

Lines changed: 212 additions & 0 deletions
@@ -0,0 +1,212 @@
1+
---
2+
title: Submit a large number of tasks - Azure Batch | Microsoft Docs
3+
description: How to efficiently submit a very large number of tasks in a single Azure Batch job
4+
services: batch
5+
documentationcenter:
6+
author: dlepow
7+
manager: jeconnoc
8+
editor: ''
9+
10+
ms.assetid:
11+
ms.service: batch
12+
ms.devlang: multiple
13+
ms.topic: article
14+
ms.tgt_pltfrm:
15+
ms.workload: big-compute
16+
ms.date: 08/24/2018
17+
ms.author: danlep
18+
ms.custom:
19+
20+
---
21+
# Submit a large number of tasks to a Batch job
22+
23+
When you run large-scale Azure Batch workloads, you might want to submit tens of thousands, hundreds of thousands, or even more tasks to a single job.
24+
25+
This article gives guidance and some code examples to submit large numbers of tasks with substantially increased throughput to a single Batch job. After tasks are submitted, they enter the Batch queue for processing on the pool you specify for the job.
26+
27+
## Use task collections
28+
29+
The Batch APIs provide methods to efficiently add tasks to a job as a *collection*, in addition to one at a time. When adding a large number of tasks, you should use the appropriate methods or overloads to add tasks as a collection. Generally, you construct a task collection by defining tasks as you iterate over a set of input files or parameters for your job.
30+
31+
The maximum size of the task collection that you can add in a single call depends on the Batch API you use:
32+
33+
* The following Batch APIs limit the collection to **100 tasks**. The limit could be smaller depending on the size of the tasks - for example, if the tasks have a large number of resource files or environment variables.
34+
35+
* [REST API](/rest/api/batchservice/task/addcollection)
36+
* [Python API](/python/api/azure-batch/azure.batch.operations.TaskOperations?view=azure-python#azure_batch_operations_TaskOperations_add_collection)
37+
* [Node.js API](/javascript/api/azure-batch/task?view=azure-node-latest#addcollection)
38+
39+
When using these APIs, you need to provide logic that splits the tasks into collections that fit within the limit, and that handles errors and retries if adding tasks fails. If a task collection is too large to add, the request generates an error and should be retried with fewer tasks.
40+
41+
* The following APIs support much larger task collections - limited only by RAM availability on the submitting client. These APIs transparently handle dividing the task collection into "chunks" for the lower-level APIs and retries if addition of tasks fails.
42+
43+
* [.NET API](/dotnet/api/microsoft.azure.batch.cloudjob.addtaskasync?view=azure-dotnet)
44+
* [Java API](/java/api/com.microsoft.azure.batch.protocol._tasks.addcollectionasync?view=azure-java-stable)
45+
* [Azure Batch CLI extension](batch-cli-templates.md) with Batch CLI templates
46+
* [Python SDK extension](https://pypi.org/project/azure-batch-extensions/)
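
For the APIs limited to 100 tasks per call, the splitting and retry logic described above might look like the following minimal sketch. It does not call the Batch SDK: `add_collection` stands in for whatever function sends one collection to the service, and `RequestTooLargeError` is a hypothetical exception used here to represent a "collection too large" response. Both names are assumptions for illustration only.

```python
from typing import Callable, List, Sequence

MAX_TASKS_PER_CALL = 100  # collection limit stated above for the REST, Python, and Node.js APIs


class RequestTooLargeError(Exception):
    """Hypothetical error representing a rejected, oversized task collection."""


def add_tasks_in_chunks(
    tasks: Sequence[dict],
    add_collection: Callable[[List[dict]], None],
    chunk_size: int = MAX_TASKS_PER_CALL,
) -> int:
    """Submit tasks chunk by chunk and return the number of calls made.

    When a chunk is rejected as too large, the chunk size is halved and
    the same tasks are retried in smaller collections.
    """
    calls = 0
    pending = list(tasks)
    while pending:
        chunk = pending[:chunk_size]
        calls += 1
        try:
            add_collection(chunk)
        except RequestTooLargeError:
            if len(chunk) == 1:
                raise  # a single task is too large; splitting cannot help
            chunk_size = max(1, chunk_size // 2)
            continue  # retry the same tasks with fewer per call
        pending = pending[len(chunk):]
    return calls
```

A production version would also need to inspect per-task results and retry transient failures, which this sketch leaves out.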
47+
48+
## Increase throughput of task submission
49+
50+
It can take some time to add a large collection of tasks to a job - for example, up to 1 minute to add 20,000 tasks via the .NET API. Depending on the Batch API and your workload, you can improve the task throughput by modifying one or more of the following:
51+
52+
* **Task size** - Adding large tasks takes longer than adding smaller ones. To reduce the size of each task in a collection, you can simplify the task command line, reduce the number of environment variables, or handle requirements for task execution more efficiently. For example, instead of using a large number of resource files, install task dependencies using a [start task](batch-api-basics.md#start-task) on the pool or use an [application package](batch-application-packages.md) or [Docker container](batch-docker-container-workloads.md).
53+
54+
* **Number of parallel operations** - Depending on the Batch API, increase throughput by raising the maximum number of concurrent operations performed by the Batch client. Configure this setting using the [BatchClientParallelOptions.MaxDegreeOfParallelism](/dotnet/api/microsoft.azure.batch.batchclientparalleloptions.maxdegreeofparallelism) property in the .NET API, or the `threads` parameter of methods such as [TaskOperations.add_collection](/python/api/azure-batch/azure.batch.operations.TaskOperations?view=azure-python#add-collection) in the Batch Python SDK extension. (This parameter is not available in the native Batch Python SDK.) By default, the setting is 1; set it higher to improve throughput. The tradeoff is that higher throughput consumes more network bandwidth and some CPU on the submitting client. Task throughput increases by up to 100 times the value of `MaxDegreeOfParallelism` or `threads`. In practice, you should set the number of concurrent operations below 100.
55+
56+
The Azure Batch CLI extension with Batch templates increases the number of concurrent operations automatically based on the number of available cores, but this property is not configurable in the CLI.
57+
58+
* **HTTP connection limits** - The number of concurrent HTTP connections can throttle the performance of the Batch client when it is adding large numbers of tasks. The number of HTTP connections is limited with certain APIs. When developing with the .NET API, for example, the [ServicePointManager.DefaultConnectionLimit](/dotnet/api/system.net.servicepointmanager.defaultconnectionlimit) property is set to 2 by default. We recommend that you increase the value to a number close to or greater than the number of parallel operations.
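
To illustrate the effect of the number of parallel operations, the following sketch submits 100-task collections from a thread pool. It is only an approximation of the idea and not how the .NET API or the Python SDK extension is implemented. `add_collection` is a placeholder for a function that sends one collection to the service, and the `threads` argument here simply sets the pool size.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def chunked(tasks: Sequence, size: int = 100) -> List[list]:
    """Split tasks into consecutive collections of at most `size` items."""
    return [list(tasks[i:i + size]) for i in range(0, len(tasks), size)]


def add_tasks_parallel(
    tasks: Sequence,
    add_collection: Callable[[list], None],
    threads: int = 4,
    chunk_size: int = 100,
) -> int:
    """Submit collections concurrently; return the number of requests made."""
    chunks = chunked(tasks, chunk_size)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # list() waits for every request and re-raises the first failure, if any
        list(pool.map(add_collection, chunks))
    return len(chunks)
```

With 20,000 tasks and the default chunk size, this makes 200 requests, spread across the worker threads. The number of HTTP connections the client allows still caps how many of those requests can actually run at once.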
59+
60+
## Example: Batch .NET
61+
62+
The following C# snippets show settings to configure when adding a large number of tasks using the Batch .NET API.
63+
64+
To increase task throughput, increase the value of the [MaxDegreeOfParallelism](/dotnet/api/microsoft.azure.batch.batchclientparalleloptions.maxdegreeofparallelism) property of the [BatchClient](/dotnet/api/microsoft.azure.batch.batchclient?view=azure-dotnet). For example:
65+
66+
```csharp
BatchClientParallelOptions parallelOptions = new BatchClientParallelOptions()
{
    MaxDegreeOfParallelism = 15
};
...
```
73+
Add a task collection to the job using the appropriate overload of the [AddTaskAsync](/dotnet/api/microsoft.azure.batch.cloudjob.addtaskasync?view=azure-dotnet) or [AddTask](/dotnet/api/microsoft.azure.batch.cloudjob.addtask?view=azure-dotnet) method. For example:
75+
76+
```csharp
// Add a list of tasks as a collection
List<CloudTask> tasksToAdd = new List<CloudTask>(); // Populate with your tasks
...
await batchClient.JobOperations.AddTaskAsync(jobId, tasksToAdd, parallelOptions);
```
82+
83+
84+
## Example: Batch CLI extension
85+
86+
Using the Azure Batch CLI extension with [Batch CLI templates](batch-cli-templates.md), create a job template JSON file that includes a [task factory](https://github.com/Azure/azure-batch-cli-extensions/blob/master/doc/taskFactories.md). The task factory configures a collection of related tasks for a job from a single task definition.
87+
88+
The following is a sample job template for a one-dimensional parametric sweep job with a large number of tasks - in this case, 250,000. The task command line is a simple `echo` command.
89+
90+
```json
{
    "job": {
        "type": "Microsoft.Batch/batchAccounts/jobs",
        "apiVersion": "2016-12-01",
        "properties": {
            "id": "myjob",
            "constraints": {
                "maxWallClockTime": "PT5H",
                "maxTaskRetryCount": 1
            },
            "poolInfo": {
                "poolId": "mypool"
            },
            "taskFactory": {
                "type": "parametricSweep",
                "parameterSets": [
                    {
                        "start": 1,
                        "end": 250000,
                        "step": 1
                    }
                ],
                "repeatTask": {
                    "commandLine": "/bin/bash -c 'echo Hello world from task {0}'",
                    "constraints": {
                        "retentionTime":"PT1H"
                    }
                }
            },
            "onAllTasksComplete": "terminatejob"
        }
    }
}
```
125+
To run a job with the template, see [Use Azure Batch CLI templates and file transfer](batch-cli-templates.md).
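
The CLI extension expands the task factory itself, so you never write the expansion. Purely to illustrate what the `parametricSweep` settings in the template mean, here is a sketch that turns a task factory dictionary into the list of command lines. It assumes that `end` is inclusive (consistent with the 250,000 tasks stated above), that `step` is positive, and that `{0}`, `{1}`, and so on are replaced by the value from the corresponding parameter set. The function name `expand_parametric_sweep` is hypothetical.

```python
from itertools import product
from typing import List


def expand_parametric_sweep(task_factory: dict) -> List[str]:
    """Return one command line per combination of parameter values."""
    ranges = [
        # "end" is treated as inclusive, hence the + 1 (assumes a positive step)
        range(p["start"], p["end"] + 1, p.get("step", 1))
        for p in task_factory["parameterSets"]
    ]
    template = task_factory["repeatTask"]["commandLine"]
    commands = []
    for values in product(*ranges):
        command = template
        for index, value in enumerate(values):
            # Replace only the numbered placeholders; other braces are left alone
            command = command.replace("{%d}" % index, str(value))
        commands.append(command)
    return commands


factory = {
    "type": "parametricSweep",
    "parameterSets": [{"start": 1, "end": 5, "step": 2}],
    "repeatTask": {"commandLine": "/bin/bash -c 'echo Hello world from task {0}'"},
}
for line in expand_parametric_sweep(factory):
    print(line)
```

Running the small example prints three command lines, for task values 1, 3, and 5.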
126+
127+
## Example: Batch Python SDK extension
128+
129+
To use the Azure Batch Python SDK extension, first install the Python SDK and the extension:
130+
131+
```
pip install azure-batch
pip install azure-batch-extensions
```
135+
136+
Set up a `BatchExtensionsClient` that uses the SDK extension:
137+
138+
```python
import azext.batch as batch  # module provided by the azure-batch-extensions package

client = batch.BatchExtensionsClient(
    base_url=BATCH_ACCOUNT_URL, resource_group=RESOURCE_GROUP_NAME, batch_account=BATCH_ACCOUNT_NAME)
...
```
143+
144+
Create a collection of tasks to add to a job. For example:
145+
146+
147+
```python
tasks = list()
# Populate the list with your tasks
...
```
153+
154+
Add the task collection using [task.add_collection](/python/api/azure-batch/azure.batch.operations.TaskOperations?view=azure-python#add-collection). Set the `threads` parameter to increase the number of concurrent operations:
155+
156+
```python
try:
    # Pass the task list created above; threads sets the number of concurrent operations
    client.task.add_collection(job_id, tasks, threads=100)
except Exception as e:
    raise e
```
162+
163+
The Batch Python SDK extension also supports adding task parameters to a job using a JSON specification for a task factory. For example, configure job parameters for a parametric sweep similar to the one in the preceding [Batch CLI extension](#example-batch-cli-extension) example:
164+
165+
```python
parameter_sweep = {
    "job": {
        "type": "Microsoft.Batch/batchAccounts/jobs",
        "apiVersion": "2016-12-01",
        "properties": {
            "id": "myjob",
            "poolInfo": {
                "poolId": "mypool"
            },
            "taskFactory": {
                "type": "parametricSweep",
                "parameterSets": [
                    {
                        "start": 1,
                        "end": 250000,
                        "step": 1
                    }
                ],
                "repeatTask": {
                    "commandLine": "/bin/bash -c 'echo Hello world from task {0}'",
                    "constraints": {
                        "retentionTime": "PT1H"
                    }
                }
            },
            "onAllTasksComplete": "terminatejob"
        }
    }
}
...
job_json = client.job.expand_template(parameter_sweep)
job_parameter = client.job.jobparameter_from_json(job_json)
```
199+
200+
Add the job parameters to the job. Set the `threads` parameter to increase the number of concurrent operations:
201+
202+
```python
try:
    client.job.add(job_parameter, threads=50)
except Exception as e:
    raise e
```
208+
209+
## Next steps
210+
211+
* Learn more about using the Azure Batch CLI extension with [Batch CLI templates](batch-cli-templates.md).
212+
* Learn more about the [Batch Python SDK extension](https://pypi.org/project/azure-batch-extensions/).
