Commit f7f9430

dkmiller authored and SyntaxC4 committed
Add clarification on use of AZ_BATCH_MASTER_NODE
The official Azure ML sample code (https://github.com/microsoft/AzureML-BERT/blob/master/pretrain/PyTorch/azureml_adapter.py) is out of date and incorrectly uses the port from AZ_BATCH_MASTER_NODE when setting up the NCCL backend. This change adds a note to the documentation warning against that.
1 parent 7520c61 commit f7f9430

1 file changed: +1 -1 lines changed


articles/batch/batch-compute-node-environment-variables.md

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ The command lines executed by tasks on compute nodes do not run under a shell. T
 | AZ_BATCH_JOB_ID | The ID of the job that the task belongs to. | All tasks except start task. | batchjob001 |
 | AZ_BATCH_JOB_PREP_DIR | The full path of the job preparation [task directory][files_dirs] on the node. | All tasks except start task and job preparation task. Only available if the job is configured with a job preparation task. | C:\user\tasks\workitems\jobprepreleasesamplejob\job-1\jobpreparation |
 | AZ_BATCH_JOB_PREP_WORKING_DIR | The full path of the job preparation [task working directory][files_dirs] on the node. | All tasks except start task and job preparation task. Only available if the job is configured with a job preparation task. | C:\user\tasks\workitems\jobprepreleasesamplejob\job-1\jobpreparation\wd |
-| AZ_BATCH_MASTER_NODE | The IP address and port of the compute node on which the primary task of a [multi-instance task][multi_instance] runs. | Multi-instance primary and subtasks. | `10.0.0.4:6000` |
+| AZ_BATCH_MASTER_NODE | The IP address and port of the compute node on which the primary task of a [multi-instance task][multi_instance] runs. Do not use the port specified here for MPI or NCCL communication - it is reserved for the Azure Batch service. Use the variable MASTER_PORT instead. | Multi-instance primary and subtasks. | `10.0.0.4:6000` |
 | AZ_BATCH_NODE_ID | The ID of the node that the task is assigned to. | All tasks. | tvm-1219235766_3-20160919t172711z |
 | AZ_BATCH_NODE_IS_DEDICATED | If `true`, the current node is a dedicated node. If `false`, it is a [low-priority node](batch-low-pri-vms.md). | All tasks. | `true` |
 | AZ_BATCH_NODE_LIST | The list of nodes that are allocated to a [multi-instance task][multi_instance] in the format `nodeIP;nodeIP`. | Multi-instance primary and subtasks. | `10.0.0.4;10.0.0.5` |
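
To illustrate the guidance in the changed row, here is a minimal Python sketch of how a task might read these variables. It keeps only the IP address from `AZ_BATCH_MASTER_NODE` and takes the port from `MASTER_PORT`. The helper names `resolve_master_endpoint` and `parse_node_list` are hypothetical, and the fallback port `29500` is an assumption for the case where `MASTER_PORT` is unset. Neither comes from this commit.

```python
import os


def resolve_master_endpoint(env=None, default_port=29500):
    """Return (address, port) for rendezvous, ignoring the Batch-reserved port.

    Hypothetical helper: `default_port` is an assumed fallback, not an Azure value.
    """
    env = os.environ if env is None else env
    # AZ_BATCH_MASTER_NODE looks like "10.0.0.4:6000". Keep only the IP part,
    # because the port in this variable is reserved for the Azure Batch service.
    address = env["AZ_BATCH_MASTER_NODE"].rsplit(":", 1)[0]
    # Take the communication port from MASTER_PORT, as the table row recommends.
    port = int(env.get("MASTER_PORT", default_port))
    return address, port


def parse_node_list(env=None):
    """Split AZ_BATCH_NODE_LIST ("nodeIP;nodeIP") into a list of IP strings."""
    env = os.environ if env is None else env
    return [ip for ip in env.get("AZ_BATCH_NODE_LIST", "").split(";") if ip]
```

One possible use, assuming PyTorch is the framework in play, is to build `init_method=f"tcp://{address}:{port}"` for `torch.distributed.init_process_group(backend="nccl", ...)`. The rank and world size must still be supplied separately.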
