Commit c1244cd

Merge pull request #78080 from laurenhughes/batch-node-errors
Container section and more examples
2 parents 474afc3 + d7d2f14 commit c1244cd

1 file changed: +11 -2 lines changed

articles/batch/batch-pool-node-error-checking.md

Lines changed: 11 additions & 2 deletions
@@ -5,7 +5,7 @@ services: batch
 ms.service: batch
 author: mscurrell
 ms.author: markscu
-ms.date: 9/25/2018
+ms.date: 05/28/2019
 ms.topic: conceptual
 ---
 
@@ -79,19 +79,28 @@ You can specify one or more application packages for a pool. Batch downloads the
 
 The node [errors](https://docs.microsoft.com/rest/api/batchservice/computenode/get#computenodeerror) property reports a failure to download and uncompress an application package. Batch sets the node state to **unusable**.
 
+### Container download failure
+
+You can specify one or more container references on a pool. Batch downloads the specified containers to each node. The node [errors](https://docs.microsoft.com/rest/api/batchservice/computenode/get#computenodeerror) property reports a failure to download a container, and Batch sets the node state to **unusable**.
+
 ### Node in unusable state
 
 Azure Batch might set the [node state](https://docs.microsoft.com/rest/api/batchservice/computenode/get#computenodestate) to **unusable** for many reasons. With the node state set to **unusable**, tasks can't be scheduled to the node, but it still incurs charges.
 
-Batch always tries to recover unusable nodes, but recovery may or may not be possible depending on the cause.
+A node in an **unusable** state, but without [errors](https://docs.microsoft.com/rest/api/batchservice/computenode/get#computenodeerror), means that Batch is unable to communicate with the VM. In this case, Batch always tries to recover the VM. Batch will not automatically attempt to recover VMs that failed to install application packages or containers, even though their state is **unusable**.
 
 If Batch can determine the cause, the node [errors](https://docs.microsoft.com/rest/api/batchservice/computenode/get#computenodeerror) property reports it.
 
 Additional examples of causes for **unusable** nodes include:
 
 - A custom VM image is invalid. For example, an image that's not properly prepared.
+
 - A VM is moved because of an infrastructure failure or a low-level upgrade. Batch recovers the node.
 
+- A VM image has been deployed on hardware that doesn't support it. For example, trying to run a CentOS HPC image on a [Standard_D1_v2](../virtual-machines/linux/sizes-general.md#dv2-series) VM.
+
+- The VMs are in an [Azure virtual network](batch-virtual-network.md), and traffic has been blocked to key ports.
+
 ### Node agent log files
 
 The Batch agent process that runs on each pool node can provide log files which might be helpful if you need to contact support about a pool node issue. Log files for a node can be uploaded via the Azure portal, Batch Explorer, or an [API](https://docs.microsoft.com/rest/api/batchservice/computenode/uploadbatchservicelogs). It's useful to upload and save the log files. Afterward, you can delete the node or pool to save the cost of the running nodes.
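
Illustration (not part of the commit): the recovery rule this change documents for **unusable** nodes could be sketched in Python roughly as follows. This is a minimal sketch that assumes a node is represented as a plain dictionary with the `state` and `errors` fields from the compute node REST response. The function name `triage_node`, the returned labels, and the error codes in `ASSUMED_INSTALL_FAILURE_CODES` are hypothetical placeholders, not values taken from the Batch documentation.

```python
from typing import Any, Dict, FrozenSet, List

# Hypothetical placeholder codes, NOT taken from the Batch documentation.
# Replace them with the codes your nodes actually report.
ASSUMED_INSTALL_FAILURE_CODES: FrozenSet[str] = frozenset(
    {"ExampleApplicationPackageError", "ExampleContainerError"}
)


def triage_node(
    node: Dict[str, Any],
    install_failure_codes: FrozenSet[str] = ASSUMED_INSTALL_FAILURE_CODES,
) -> str:
    """Return a rough triage label for a node dictionary."""
    if node.get("state") != "unusable":
        return "not unusable"

    # A missing or empty errors list is treated the same way.
    errors: List[Dict[str, Any]] = node.get("errors") or []
    if not errors:
        # Unusable without errors: Batch cannot reach the VM and tries to recover it.
        return "unreachable: Batch attempts recovery"

    codes = {error.get("code") for error in errors}
    if codes & install_failure_codes:
        # Application package or container install failures are not recovered automatically.
        return "install failure: no automatic recovery"

    # Some other cause was reported in the errors property.
    return "other reported error: inspect errors"


if __name__ == "__main__":
    sample = {"state": "unusable", "errors": [{"code": "ExampleContainerError"}]}
    print(triage_node(sample))
```

Keeping the install-failure codes as a parameter avoids hard-coding values that this page does not confirm; a real check would read the node from the Batch service and pass in the codes actually observed.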
