Commit 033d884

Merge pull request #58008 from laurenhughes/lahugh-outputfiles
Output file article cleanup
2 parents 2f1e02a + da75631 commit 033d884

3 files changed: 53 additions & 92 deletions

articles/batch/batch-task-output-file-conventions.md

Lines changed: 13 additions & 21 deletions
Original file line number | Diff line number | Diff line change
@@ -13,23 +13,21 @@ ms.devlang: multiple
1313
ms.topic: article
1414
ms.tgt_pltfrm:
1515
ms.workload: big-compute
16-
ms.date: 06/16/2017
16+
ms.date: 11/14/2018
1717
ms.author: danlep
1818
ms.custom: H1Hack27Feb2017
1919

2020
---
21-
# Persist job and task data to Azure Storage with the Batch File Conventions library for .NET
21+
# Persist job and task data to Azure Storage with the Batch File Conventions library for .NET
2222

2323
[!INCLUDE [batch-task-output-include](../../includes/batch-task-output-include.md)]
2424

25-
One way to persist task data is to use the [Azure Batch File Conventions library for .NET][nuget_package]. The File Conventions library simplifies the process of storing task output data to Azure Storage and retrieving it. You can use the File Conventions library in both task and client code — in task code for persisting files, and in client code to list and retrieve them. Your task code can also use the library to retrieve the output of upstream tasks, such as in a [task dependencies](batch-task-dependencies.md) scenario.
25+
One way to persist task data is to use the [Azure Batch File Conventions library for .NET][nuget_package]. The File Conventions library simplifies the process of storing task output data to Azure Storage and retrieving it. You can use the File Conventions library in both task and client code — in task code for persisting files, and in client code to list and retrieve them. Your task code can also use the library to retrieve the output of upstream tasks, such as in a [task dependencies](batch-task-dependencies.md) scenario.
2626

2727
To retrieve output files with the File Conventions library, you can locate the files for a given job or task by listing them by ID and purpose. You don't need to know the names or locations of the files. For example, you can use the File Conventions library to list all intermediate files for a given task, or get a preview file for a given job.
2828

2929
> [!TIP]
3030
> Starting with version 2017-05-01, the Batch service API supports persisting output data to Azure Storage for tasks and job manager tasks that run on pools created with the virtual machine configuration. The Batch service API provides a simple way to persist output from within the code that creates a task and serves as an alternative to the File Conventions library. You can modify your Batch client applications to persist output without needing to update the application that your task is running. For more information, see [Persist task data to Azure Storage with the Batch service API](batch-task-output-files.md).
31-
>
32-
>
3331
3432
## When do I use the File Conventions library to persist task output?
3533

@@ -38,27 +36,27 @@ Azure Batch provides more than one way to persist task output. The File Conventi
3836
- You can easily modify the code for the application that your task is running to persist files using the File Conventions library.
3937
- You want to stream data to Azure Storage while the task is still running.
4038
- You want to persist data from pools created with either the cloud service configuration or the virtual machine configuration.
41-
- Your client application or other tasks in the job need to locate and download task output files by ID or by purpose.
39+
- Your client application or other tasks in the job need to locate and download task output files by ID or by purpose.
4240
- You want to view task output in the Azure portal.
4341

44-
If your scenario differs from those listed above, you may need to consider a different approach. For more information on other options for persisting task output, see [Persist job and task output to Azure Storage](batch-task-output.md).
42+
If your scenario differs from those listed above, you may need to consider a different approach. For more information on other options for persisting task output, see [Persist job and task output to Azure Storage](batch-task-output.md).
4543

4644
## What is the Batch File Conventions standard?
4745

4846
The [Batch File Conventions standard](https://github.com/Azure/azure-sdk-for-net/tree/psSdkJson6/src/SDKs/Batch/Support/FileConventions#conventions) provides a naming scheme for the destination containers and blob paths to which your output files are written. Files persisted to Azure Storage that adhere to the File Conventions standard are automatically available for viewing in the Azure portal. The portal is aware of the naming convention and so can display files that adhere to it.
4947

50-
The File Conventions library for .NET automatically names your storage containers and task output files according to the File Conventions standard. The File Conventions library also provides methods to query output files in Azure Storage according to job ID, task ID, or purpose.
48+
The File Conventions library for .NET automatically names your storage containers and task output files according to the File Conventions standard. The File Conventions library also provides methods to query output files in Azure Storage according to job ID, task ID, or purpose.
5149

52-
If you are developing with a language other than .NET, you can implement the File Conventions standard yourself in your application. For more information, see [About the Batch File Conventions standard](batch-task-output.md#about-the-batch-file-conventions-standard).
50+
If you are developing with a language other than .NET, you can implement the File Conventions standard yourself in your application. For more information, see [Implement the Batch File Conventions standard](batch-task-output.md#implement-the-batch-file-conventions-standard).
5351

5452
## Link an Azure Storage account to your Batch account
5553

5654
To persist output data to Azure Storage using the File Conventions library, you must first link an Azure Storage account to your Batch account. If you haven't done so already, link a Storage account to your Batch account by using the [Azure portal](https://portal.azure.com):
5755

58-
1. Navigate to your Batch account in the Azure portal.
59-
2. Under **Settings**, select **Storage Account**.
60-
3. If you do not already have a Storage account associated with your Batch account, click **Storage Account (None)**.
61-
4. Select a Storage account from the list for your subscription. For best performance, use an Azure Storage account that is in the same region as the Batch account where your tasks are running.
56+
1. Navigate to your Batch account in the Azure portal.
57+
1. Under **Settings**, select **Storage Account**.
58+
1. If you do not already have a Storage account associated with your Batch account, click **Storage Account (None)**.
59+
1. Select a Storage account from the list for your subscription. For best performance, use an Azure Storage account that is in the same region as the Batch account where your tasks are running.
6260

6361
## Persist output data
6462

@@ -68,12 +66,10 @@ For more information about working with containers and blobs in Azure Storage, s
6866

6967
> [!WARNING]
7068
> All job and task outputs persisted with the File Conventions library are stored in the same container. If a large number of tasks try to persist files at the same time, [storage throttling limits](../storage/common/storage-performance-checklist.md#blobs) may be enforced.
71-
>
72-
>
7369
7470
### Create storage container
7571

76-
To persist task output to Azure Storage, first create a container by calling [CloudJob][net_cloudjob].[PrepareOutputStorageAsync][net_prepareoutputasync]. This extension method takes a [CloudStorageAccount][net_cloudstorageaccount] object as a parameter. It creates a container named according to the File Conventions standard,so that its contents are discoverable by the Azure portal and the retrieval methods discussed later in the article.
72+
To persist task output to Azure Storage, first create a container by calling [CloudJob][net_cloudjob].[PrepareOutputStorageAsync][net_prepareoutputasync]. This extension method takes a [CloudStorageAccount][net_cloudstorageaccount] object as a parameter. It creates a container named according to the File Conventions standard, so that its contents are discoverable by the Azure portal and the retrieval methods discussed later in the article.
7773

7874
You typically place the code to create a container in your client application — the application that creates your pools, jobs, and tasks.
7975

@@ -116,8 +112,6 @@ These output types allow you to specify which type of outputs to list when you l
116112

117113
> [!TIP]
118114
> The output kind also determines where in the Azure portal a particular file appears: *TaskOutput*-categorized files appear under **Task output files**, and *TaskLog* files appear under **Task logs**.
119-
>
120-
>
121115
122116
### Store job outputs
123117

@@ -170,8 +164,6 @@ The node agent is a program that runs on each node in the pool and provides the
170164

171165
> [!NOTE]
172166
> When you enable file tracking with **SaveTrackedAsync**, only *appends* to the tracked file are persisted to Azure Storage. Use this method only for tracking non-rotating log files or other files that are written to with append operations to the end of the file.
173-
>
174-
>
175167
176168
## Retrieve output data
177169

@@ -202,7 +194,7 @@ The Azure portal displays task output files and logs that are persisted to a lin
202194
To enable the display of your output files in the portal, you must satisfy the following requirements:
203195

204196
1. [Link an Azure Storage account](#requirement-linked-storage-account) to your Batch account.
205-
2. Adhere to the predefined naming conventions for Storage containers and files when persisting outputs. You can find the definition of these conventions in the File Conventions library [README][github_file_conventions_readme]. If you use the [Azure Batch File Conventions][nuget_package] library to persist your output, your files are persisted according to the File Conventions standard.
197+
1. Adhere to the predefined naming conventions for Storage containers and files when persisting outputs. You can find the definition of these conventions in the File Conventions library [README][github_file_conventions_readme]. If you use the [Azure Batch File Conventions][nuget_package] library to persist your output, your files are persisted according to the File Conventions standard.
206198

207199
To view task output files and logs in the Azure portal, navigate to the task whose output you are interested in, then click either **Saved output files** or **Saved logs**. This image shows the **Saved output files** for the task with ID "007":
208200

articles/batch/batch-task-output-files.md

Lines changed: 13 additions & 12 deletions
Original file line number | Diff line number | Diff line change
@@ -11,19 +11,18 @@ ms.devlang: multiple
1111
ms.topic: article
1212
ms.tgt_pltfrm:
1313
ms.workload: big-compute
14-
ms.date: 06/16/2017
14+
ms.date: 11/14/2018
1515
ms.author: danlep
1616

1717
---
1818

19-
2019
# Persist task data to Azure Storage with the Batch service API
2120

2221
[!INCLUDE [batch-task-output-include](../../includes/batch-task-output-include.md)]
2322

24-
Starting with version 2017-05-01, the Batch service API supports persisting output data to Azure Storage for tasks and job manager tasks that run on pools with the virtual machine configuration. When you add a task, you can specify a container in Azure Storage as the destination for the task's output. The Batch service then writes any output data to that container when the task is complete.
23+
The Batch service API supports persisting output data to Azure Storage for tasks and job manager tasks that run on pools with the virtual machine configuration. When you add a task, you can specify a container in Azure Storage as the destination for the task's output. The Batch service then writes any output data to that container when the task is complete.
2524

26-
An advantage to using the Batch service API to persist task output is that you do not need to modify the application that the task is running. Instead, with a few simple modifications to your client application, you can persist the task's output from within the code that creates the task.
25+
An advantage to using the Batch service API to persist task output is that you do not need to modify the application that the task is running. Instead, with a few modifications to your client application, you can persist the task's output from within the same code that creates the task.
2726

2827
## When do I use the Batch service API to persist task output?
2928

@@ -34,7 +33,10 @@ Azure Batch provides more than one way to persist task output. Using the Batch s
3433
- You want to persist output to an Azure Storage container with an arbitrary name.
3534
- You want to persist output to an Azure Storage container named according to the [Batch File Conventions standard](https://github.com/Azure/azure-sdk-for-net/tree/psSdkJson6/src/SDKs/Batch/Support/FileConventions#conventions).
3635

37-
If your scenario differs from those listed above, you may need to consider a different approach. For example, the Batch service API does not currently support streaming output to Azure Storage while the task is running. To stream output, consider using the Batch File Conventions library, available for .NET. For other languages, you'll need to implement your own solution. For more information on other options for persisting task output, see [Persist job and task output to Azure Storage](batch-task-output.md).
36+
> [!NOTE]
37+
> The Batch service API does not support persisting data from tasks running in pools created with the cloud service configuration. For information about persisting task output from pools running the cloud service configuration, see [Persist job and task data to Azure Storage with the Batch File Conventions library for .NET](batch-task-output-file-conventions.md).
38+
39+
If your scenario differs from those listed above, you may need to consider a different approach. For example, the Batch service API does not currently support streaming output to Azure Storage while the task is running. To stream output, consider using the Batch File Conventions library, available for .NET. For other languages, you'll need to implement your own solution. For more information on other options for persisting task output, see [Persist job and task output to Azure Storage](batch-task-output.md).
3840

3941
## Create a container in Azure Storage
4042

@@ -62,14 +64,14 @@ string containerSasToken = container.GetSharedAccessSignature(new SharedAccessBl
6264
Permissions = SharedAccessBlobPermissions.Write
6365
});
6466

65-
string containerSasUrl = container.Uri.AbsoluteUri + containerSasToken;
67+
string containerSasUrl = container.Uri.AbsoluteUri + containerSasToken;
6668
```
6769

6870
## Specify output files for task output
6971

70-
To specify output files for a task, create a collection of [OutputFile](https://docs.microsoft.com/dotnet/api/microsoft.azure.batch.outputfile) objects and assign it to the [CloudTask.OutputFiles](https://docs.microsoft.com/dotnet/api/microsoft.azure.batch.cloudtask.outputfiles#Microsoft_Azure_Batch_CloudTask_OutputFiles) property when you create the task.
72+
To specify output files for a task, create a collection of [OutputFile](https://docs.microsoft.com/dotnet/api/microsoft.azure.batch.outputfile) objects and assign it to the [CloudTask.OutputFiles](https://docs.microsoft.com/dotnet/api/microsoft.azure.batch.cloudtask.outputfiles#Microsoft_Azure_Batch_CloudTask_OutputFiles) property when you create the task.
7173

72-
The following .NET code example creates a task that writes random numbers to a file named `output.txt`. The example creates an output file for `output.txt` to be written to the container. The example also creates output files for any log files that match the file pattern `std*.txt` (_e.g._, `stdout.txt` and `stderr.txt`). The container URL requires the SAS that was created previously for the container. The Batch service uses the SAS to authenticate access to the container:
74+
The following C# code example creates a task that writes random numbers to a file named `output.txt`. The example creates an output file for `output.txt` to be written to the container. The example also creates output files for any log files that match the file pattern `std*.txt` (_e.g._, `stdout.txt` and `stderr.txt`). The container URL requires the SAS that was created previously for the container. The Batch service uses the SAS to authenticate access to the container:
7375

7476
```csharp
7577
new CloudTask(taskId, "cmd /v:ON /c \"echo off && set && (FOR /L %i IN (1,1,100000) DO (ECHO !RANDOM!)) > output.txt\"")
@@ -99,7 +101,7 @@ new CloudTask(taskId, "cmd /v:ON /c \"echo off && set && (FOR /L %i IN (1,1,1000
99101

100102
When you specify an output file, you can use the [OutputFile.FilePattern](https://docs.microsoft.com/dotnet/api/microsoft.azure.batch.outputfile.filepattern#Microsoft_Azure_Batch_OutputFile_FilePattern) property to specify a file pattern for matching. The file pattern may match zero files, a single file, or a set of files that are created by the task.
101103
102-
The **FilePattern** property supports standard filesystem wildcards such as `*` (for non-recursive matches) and `**` (for recursive matches). For example, the code sample above specifies the file pattern to match `std*.txt` non-recursively:
104+
The **FilePattern** property supports standard filesystem wildcards such as `*` (for non-recursive matches) and `**` (for recursive matches). For example, the code sample above specifies the file pattern to match `std*.txt` non-recursively:
103105

104106
`filePattern: @"..\std*.txt"`
105107

@@ -111,7 +113,7 @@ To upload a single file, specify a file pattern with no wildcards. For example,
111113

112114
The [OutputFileUploadOptions.UploadCondition](https://docs.microsoft.com/dotnet/api/microsoft.azure.batch.outputfileuploadoptions.uploadcondition#Microsoft_Azure_Batch_OutputFileUploadOptions_UploadCondition) property permits conditional uploading of output files. A common scenario is to upload one set of files if the task succeeds, and a different set of files if it fails. For example, you may want to upload verbose log files only when the task fails and exits with a nonzero exit code. Similarly, you may want to upload result files only if the task succeeds, as those files may be missing or incomplete if the task fails.
113115
114-
The code sample above sets the **UploadCondition** property to **TaskCompletion**. This setting specifies that the file is to be uploaded after the task completes, regardless of the value of the exit code.
116+
The code sample above sets the **UploadCondition** property to **TaskCompletion**. This setting specifies that the file is to be uploaded after the task completes, regardless of the value of the exit code.
115117

116118
`uploadCondition: OutputFileUploadCondition.TaskCompletion`
117119

@@ -143,10 +145,9 @@ https://myaccount.blob.core.windows.net/mycontainer/task2/output.txt
143145

144146
For more information about virtual directories in Azure Storage, see [List the blobs in a container](../storage/blobs/storage-quickstart-blobs-dotnet.md#list-the-blobs-in-a-container).
145147

146-
147148
## Diagnose file upload errors
148149

149-
If uploading output files to Azure Storage fails, then the task moves to the **Completed** state and the [TaskExecutionInformation.FailureInformation](https://docs.microsoft.com/dotnet/api/microsoft.azure.batch.taskexecutioninformation.failureinformation#Microsoft_Azure_Batch_TaskExecutionInformation_FailureInformation) property is set. Examine the **FailureInformation** property to determine what error occurred. For example, here is an error that occurs on file upload if the container cannot be found:
150+
If uploading output files to Azure Storage fails, then the task moves to the **Completed** state and the [TaskExecutionInformation.FailureInformation](https://docs.microsoft.com/dotnet/api/microsoft.azure.batch.taskexecutioninformation.failureinformation#Microsoft_Azure_Batch_TaskExecutionInformation_FailureInformation) property is set. Examine the **FailureInformation** property to determine what error occurred. For example, here is an error that occurs on file upload if the container cannot be found:
150151
151152
```
152153
Category: UserError
