# Quickstart: Use Python API to run an Azure Batch job
11
11
12
-
Get started with Azure Batch by using the Python API to run an Azure Batch job from an app. The app uploads input data files to Azure Storage and creates a pool of Batch compute nodes (virtual machines). It then creates a job that runs tasks to process each input file in the pool using a basic command.
12
+
This quickstart shows you how to get started with Azure Batch by using the [Azure Batch libraries for Python](/python/api/overview/azure/batch). The quickstart uses a Python app to do the following actions:
13
13
14
-
After completing this quickstart, you'll understand key concepts of the Batch service and be ready to try Batch with more realistic workloads at larger scale.
14
+
> [!div class="checklist"]
15
+
> - Upload three text files to a blob container in Azure Storage as inputs for Batch task processing.
16
+
> - Create a *pool* of two compute *nodes* running Ubuntu 20.04 LTS OS.
17
+
> - Create a *job* and three *tasks* to run on the nodes. Each task processes one of the input files by using a Bash shell command line.
18
+
> - Display the output files that the tasks return.
15
19
16
-

20
+
After you complete this quickstart, you understand the [key concepts of the Batch service](batch-service-workflow-features.md) and are ready to use Batch with more realistic, larger-scale workloads.
17
21
18
22
## Prerequisites
19
23
20
-
- An Azure account with an active subscription. [Create an account for free](https://azure.microsoft.com/free/?WT.mc_id=A261C142F).
24
+
- An Azure account with an active subscription. If you don't have one, [create an account for free](https://azure.microsoft.com/free/?WT.mc_id=A261C142F).
21
25
22
-
- A Batch account and a linked Azure Storage account. To create these accounts, see the Batch quickstarts using the [Azure portal](quick-create-portal.md) or [Azure CLI](quick-create-cli.md).
26
+
- A Batch account with a linked Azure Storage account. You can create the accounts by using any of the following methods:<br>[Azure CLI](quick-create-cli.md) | [Azure portal](quick-create-portal.md) | [Bicep](quick-create-bicep.md) | [ARM template](quick-create-template.md) | [Terraform](quick-create-terraform.md)
23
27
24
-
- [Python](https://python.org/downloads) version 3.6 or later, including the [pip](https://pip.pypa.io/en/stable/installing/) package manager.
28
+
- [Python](https://python.org/downloads) version 3.6 or later, which includes the [pip](https://pip.pypa.io/en/stable/installing) package manager.
25
29
26
-
## Sign in to Azure
30
+
## Run the app
27
31
28
-
Sign in to the Azure portal at [https://portal.azure.com](https://portal.azure.com).
32
+
To run the Python app, you need to provide your Batch and Storage account names, account keys, and Batch account endpoint. You can get this information from the Azure portal, Azure APIs, or command-line tools. To get these values from the Azure portal:
33
+
34
+
1. From the Azure search bar, search for and select your Batch account name.
35
+
1. On your Batch account page, select **Keys** from the left navigation.
36
+
1. On the **Keys** page, copy the following values:
1. Download or clone the [Azure Batch Python Quickstart](https://github.com/Azure-Samples/batch-python-quickstart) app from GitHub. Use the following command to clone the app repo with a Git client:
33
47
34
-
[Download or clone the sample app](https://github.com/Azure-Samples/batch-python-quickstart) from GitHub. To clone the sample app repo with a Git client, use the following command:
1. Run the script to see the Batch workflow in action.
57
69
58
-
## Run the app
70
+
```bash
71
+
python python_quickstart_client.py
72
+
```
59
73
60
-
To see the Batch workflow in action, run the script:
74
+
Typical execution time is approximately three minutes. Initial pool node setup takes the most time.
61
75
62
-
```bash
63
-
python python_quickstart_client.py
64
-
```
76
+
### App output
65
77
66
-
After running the script, review the code to learn what each part of the application does.
78
+
During app execution, there's a pause at `Monitoring all tasks for 'Completed' state, timeout in 00:30:00...` while the pool's compute nodes are started. Tasks are queued to run as soon as the first compute node is running. You can monitor node, task, and job status from your Batch account page in the Azure portal.
67
79
68
-
When you run the sample application, the console output is similar to the following. During execution, you experience a pause at `Monitoring all tasks for 'Completed' state, timeout in 00:30:00...` while the pool's compute nodes are started. Tasks are queued to run as soon as the first compute node is running. Go to your Batch account in the [Azure portal](https://portal.azure.com) to monitor the pool, compute nodes, job, and tasks in your Batch account.
80
+
The sample application returns output similar to the following example:
69
81
70
82
```output
71
-
Sample start: 11/26/2018 4:02:54 PM
83
+
Sample start: 11/26/2012 4:02:54 PM
72
84
73
85
Uploading file taskdata0.txt to container [input]...
74
86
Uploading file taskdata1.txt to container [input]...
@@ -79,7 +91,7 @@ Adding 3 tasks to job [PythonQuickstartJob]...
79
91
Monitoring all tasks for 'Completed' state, timeout in 00:30:00...
80
92
```
81
93
82
-
After tasks complete, you see output similar to the following for each task:
94
+
After each task completes, you see output similar to the following example:
83
95
84
96
```output
85
97
Printing task output...
@@ -90,58 +102,49 @@ Batch processing began with mainframe computers and punch cards. Today it still
90
102
...
91
103
```
92
104
93
-
Typical execution time is approximately 3 minutes when you run the application in its default configuration. Initial pool setup takes the most time.
94
-
95
105
## Review the code
96
106
97
-
The Python app in this quickstart does the following:
98
-
99
-
- Uploads three small text files to a blob container in your Azure storage account. These files are inputs for processing by Batch tasks.
100
-
- Creates a pool of two compute nodes running Ubuntu 20.04 LTS.
101
-
- Creates a job and three tasks to run on the nodes. Each task processes one of the input files using a Bash shell command line.
102
-
- Displays files returned by the tasks.
103
-
104
-
See the file `python_quickstart_client.py` and the following sections for details.
107
+
The Python app in this quickstart takes the following steps:
105
108
106
-
### Preliminaries
109
+
### Upload resource files
107
110
108
-
To interact with a storage account, the app creates a [BlobServiceClient](/python/api/azure-storage-blob/azure.storage.blob.blobserviceclient) object.
111
+
1. To interact with the Storage account, the app creates a [BlobServiceClient](/python/api/azure-storage-blob/azure.storage.blob.blobserviceclient) object.
The app uses the `blob_service_client` reference to create a container in the storage account and to upload data files to the container. The files in storage are defined as Batch [ResourceFile](/python/api/azure-batch/azure.batch.models.resourcefile) objects that Batch can later download to compute nodes.
1. The app uses the `blob_service_client` reference to create a container in the Storage account and upload data files to the container. The files in storage are defined as Batch [ResourceFile](/python/api/azure-batch/azure.batch.models.resourcefile) objects that Batch can later download to compute nodes.
128
121
129
-
The app creates a [BatchServiceClient](/python/api/azure.batch.batchserviceclient) object to create and manage pools, jobs, and tasks in the Batch service. The Batch client in the sample uses shared key authentication. Batch also supports Azure Active Directory authentication.
1. The app creates a [BatchServiceClient](/python/api/azure.batch.batchserviceclient) object to create and manage pools, jobs, and tasks in the Batch account. The Batch client uses shared key authentication. Batch also supports Azure Active Directory (Azure AD) authentication.
To create a Batch pool, the app uses the [PoolAddParameter](/python/api/azure-batch/azure.batch.models.pooladdparameter) class to set the number of nodes, VM size, and a pool configuration. Here, a [VirtualMachineConfiguration](/python/api/azure-batch/azure.batch.models.virtualmachineconfiguration) object specifies an [ImageReference](/python/api/azure-batch/azure.batch.models.imagereference) to an Ubuntu Server 20.04 LTS image published in the Azure Marketplace. Batch supports a wide range of Linux and Windows Server images in the Azure Marketplace, as well as custom VM images.
145
+
To create a Batch pool, the app uses the [PoolAddParameter](/python/api/azure-batch/azure.batch.models.pooladdparameter) class to set the number of nodes, virtual machine (VM) size, and pool configuration. The following [VirtualMachineConfiguration](/python/api/azure-batch/azure.batch.models.virtualmachineconfiguration) object specifies an [ImageReference](/python/api/azure-batch/azure.batch.models.imagereference) to an Ubuntu Server 20.04 LTS Azure Marketplace image. Batch supports a wide range of Linux and Windows Server Marketplace images as well as custom VM images.
143
146
144
-
The number of nodes (`POOL_NODE_COUNT`) and VM size (`POOL_VM_SIZE`) are defined constants. The sample by default creates a pool of 2 size *Standard_DS1_v2* nodes. The size suggested offers a good balance of performance versus cost for this quick example.
147
+
The number of nodes (`POOL_NODE_COUNT`) and VM size (`POOL_VM_SIZE`) are defined constants. By default, the app creates a pool of two nodes of size *Standard_DS1_v2*. This size offers a good balance of performance versus cost for this quickstart.
145
148
146
149
The [pool.add](/python/api/azure-batch/azure.batch.operations.pooloperations) method submits the pool to the Batch service.
A Batch job is a logical grouping of one or more tasks. A job includes settings common to the tasks, such as priority and the pool to run tasks on. The app uses the [JobAddParameter](/python/api/azure-batch/azure.batch.models.jobaddparameter) class to create a job on your pool. The [job.add](/python/api/azure-batch/azure.batch.operations.joboperations) method adds a job to the specified Batch account. Initially the job has no tasks.
170
+
A Batch job is a logical grouping of one or more tasks. A job includes settings common to the tasks, such as priority and the pool to run tasks on.
171
+
172
+
The app uses the [JobAddParameter](/python/api/azure-batch/azure.batch.models.jobaddparameter) class to create a job on the pool. The [job.add](/python/api/azure-batch/azure.batch.operations.joboperations) method adds a job to the specified Batch account. Initially the job has no tasks.
The app creates a list of task objects using the [TaskAddParameter](/python/api/azure-batch/azure.batch.models.taskaddparameter) class. Each task processes an input `resource_files` object using a `command_line` parameter. In the sample, the command line runs the Bash shell `cat` command to display the text file. This command is a simple example for demonstration purposes. When you use Batch, the command line is where you specify your app or script. Batch provides a number of ways to deploy apps and scripts to compute nodes.
184
+
The app creates a list of task objects by using the [TaskAddParameter](/python/api/azure-batch/azure.batch.models.taskaddparameter) class. Each task uses a `command_line` parameter to process an input `resource_files` object. The command line is where you specify your app or script. Batch provides a number of ways to deploy apps and scripts to compute nodes.
180
185
181
-
Then, the app adds tasks to the job with the [task.add_collection](/python/api/azure-batch/azure.batch.operations.taskoperations) method, which queues them to run on the compute nodes.
186
+
In the current example, the command line runs the Bash shell `cat` command to display the text file. Then, the app adds tasks to the job with the [task.add_collection](/python/api/azure-batch/azure.batch.operations.taskoperations) method, which queues the tasks to run on the compute nodes.
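As a rough illustration of the per-task command lines described above, the following sketch builds one Bash `cat` command per input file. It's a standalone sketch that uses only the Python standard library and doesn't call the Batch service. The file name `taskdata2.txt`, the task IDs, and the helper `build_task_specs` are assumptions made for illustration, not names confirmed by the sample app.

```python
import shlex

# Input file names. taskdata0.txt and taskdata1.txt appear in the sample
# output; taskdata2.txt is assumed here to complete the set of three files.
input_files = ["taskdata0.txt", "taskdata1.txt", "taskdata2.txt"]


def build_task_specs(file_names):
    """Return one (task_id, command_line) pair per input file.

    Hypothetical helper: it only builds strings and doesn't submit anything
    to the Batch service.
    """
    specs = []
    for index, file_name in enumerate(file_names):
        # Quote the inner command so Bash receives it as a single -c argument.
        inner_command = "cat " + shlex.quote(file_name)
        command_line = "/bin/bash -c " + shlex.quote(inner_command)
        specs.append((f"Task{index}", command_line))
    return specs


task_specs = build_task_specs(input_files)
for task_id, command_line in task_specs:
    print(task_id, command_line)
```

In the app itself, each command line string would be passed as the `command_line` parameter of a `TaskAddParameter` object, together with the matching `resource_files` entry.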
The app monitors task state to make sure the tasks complete. Then, the app displays the `stdout.txt` file generated by each completed task. When the task runs successfully, the output of the task command is written to `stdout.txt`:
205
+
The app monitors task state to make sure the tasks complete. When a task runs successfully, the output of the task command is written to the *stdout.txt* file. The app then displays the *stdout.txt* file that each completed task generates.
201
206
202
207
```python
203
208
tasks = batch_service_client.task.list(job_id)
@@ -233,7 +238,7 @@ When no longer needed, delete the resource group, Batch account, and storage acc
233
238
234
239
## Next steps
235
240
236
-
In this quickstart, you ran a small app built using the Batch Python API to create a Batch pool and a Batch job. The job ran sample tasks, and downloaded output created on the nodes. Now that you understand the key concepts of the Batch service, you are ready to try Batch with more realistic workloads at larger scale. To learn more about Azure Batch, and walk through a parallel workload with a real-world application, continue to the Batch Python tutorial.
241
+
In this quickstart, you ran an app that uses the Batch Python API to create a Batch pool, nodes, job, and tasks. The app uploaded resource files to a storage container, ran tasks on the nodes, and downloaded output created on the nodes. Now that you understand the key concepts of the Batch service, you're ready to use Batch with more realistic, larger-scale workloads. To learn more about Azure Batch and walk through a parallel workload with a real-world application, continue to the Batch Python tutorial.
237
242
238
243
> [!div class="nextstepaction"]
239
244
> [Process a parallel workload with Python](tutorial-parallel-python.md)