Commit 5ba3fb3

updates
1 parent 322d813 commit 5ba3fb3

5 files changed, 13 additions and 12 deletions

articles/machine-learning/how-to-azure-container-for-pytorch-environment.md

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 ---
-title: Create Azure Container for PyTorch Custom Curated Environment
+title: How to create Azure Container for PyTorch Custom Curated Environment
 titleSuffix: Azure Machine Learning
 description: Create custom curated Azure Container for PyTorch Environments in Azure Machine Learning studio to run your machine learning models and reuse it in different scenarios.
 services: machine-learning
@@ -20,7 +20,7 @@ If you're looking to extend curated environment and add Hugging Face (HF) transf
 
 Before following the steps in this article, make sure you have the following prerequisites:
 
-- An Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning](https://azure.microsoft.com/en-us/free/).
+- An Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning](https://azure.microsoft.com/free/).
 
 - An Azure Machine Learning workspace. If you don't have one, use the steps in the [Quickstart: Create workspace resources](quickstart-create-resources.md) article to create one.

articles/machine-learning/how-to-train-distributed-gpu.md

Lines changed: 1 addition & 1 deletion
@@ -128,7 +128,7 @@ Azure Machine Learning will set the `MASTER_ADDR`, `MASTER_PORT`, `WORLD_SIZE`,
 
 `DeepSpeed` can be enabled using either Pytorch distribution or MPI for running distributed training. Azure Machine Learning supports the `DeepSpeed` launcher to launch distributed training as well as autotuning to get optimal `ds` configuration.
 
-You can use a [curated environment](resource-curated-environments.md#azure-container-for-pytorch-acpt-preview) for an out of the box environment with the latest state of art technologies including `DeepSpeed`, `ORT`, `MSSCCL`, and `Pytorch` for your DeepSpeed training jobs.
+You can use a [curated environment](resource-curated-environments.md#azure-container-for-pytorch-acpt) for an out of the box environment with the latest state of art technologies including `DeepSpeed`, `ORT`, `MSSCCL`, and `Pytorch` for your DeepSpeed training jobs.
 
 ### DeepSpeed example

articles/machine-learning/reference-checkpoint-performance-for-large-models.md

Lines changed: 2 additions & 2 deletions
@@ -78,7 +78,7 @@ Nebula can
 * An Azure subscription and an Azure Machine Learning workspace. See [Create workspace resources](./quickstart-create-resources.md) for more information about workspace resource creation
 * An Azure Machine Learning compute target. See [Manage training & deploy computes](./how-to-create-attach-compute-studio.md) to learn more about compute target creation
 * A training script that uses **PyTorch**.
-* ACPT-curated (Azure Container for Pytorch) environment. See [Curated environments](resource-curated-environments.md#azure-container-for-pytorch-acpt-preview) to obtain the ACPT image. Learn how to use the curated environment [here](./how-to-use-environments.md)
+* ACPT-curated (Azure Container for Pytorch) environment. See [Curated environments](resource-curated-environments.md#azure-container-for-pytorch-acpt) to obtain the ACPT image. Learn how to [use the curated environment](./how-to-use-environments.md)
 * An Azure Machine Learning script run configuration file. If you don’t have one, you can follow [this resource](./how-to-set-up-training-targets.md)
 
 ## How to Use Nebula
@@ -90,7 +90,7 @@ Nebula use involves:
 - [API calls to save and load checkpoints](#call-apis-to-save-and-load-checkpoints)
 
 ### Using ACPT environment
-[Azure Container for PyTorch (ACPT)](how-to-manage-environments-v2.md?tabs=cli#curated-environments), a curated environment for PyTorch model training, includes Nebula as a preinstalled, dependent Python package. See [Azure Container for PyTorch (ACPT)](resource-curated-environments.md#azure-container-for-pytorch-acpt-preview) to view the curated environment, and [Enabling Deep Learning with Azure Container for PyTorch in Azure Machine Learning](https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/enabling-deep-learning-with-azure-container-for-pytorch-in-azure/ba-p/3650489) to learn more about the ACPT image.
+[Azure Container for PyTorch (ACPT)](how-to-manage-environments-v2.md?tabs=cli#curated-environments), a curated environment for PyTorch model training, includes Nebula as a preinstalled, dependent Python package. See [Azure Container for PyTorch (ACPT)](resource-curated-environments.md#azure-container-for-pytorch-acpt) to view the curated environment, and [Enabling Deep Learning with Azure Container for PyTorch in Azure Machine Learning](https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/enabling-deep-learning-with-azure-container-for-pytorch-in-azure/ba-p/3650489) to learn more about the ACPT image.
 
 ### Initializing Nebula

articles/machine-learning/resource-azure-container-for-pytorch.md

Lines changed: 7 additions & 6 deletions
@@ -45,15 +45,16 @@ The following configurations are supported:
 | Environment Name | OS | GPU Version| Python Version | PyTorch Version | ORT-training Version | DeepSpeed Version | torch-ort Version |
 | --- | --- | --- | --- | --- | --- | --- | --- |
 |acpt-pytorch-2.0-cuda11.7|Ubuntu 20.04|cu117|3.8|2.0|1.14.1|0.8.2 |0.14.0|
-|acpt-pytorch-1.13-cuda11.7|Ubuntu 20.04|cu117|3.8|1.13.1|1.14.1|0.8.2| 1.14.0|
-|acpt-pytorch-1.12-py39-cuda11.6|Ubuntu 20.04|cu116|3.9 |1.12.1|1.14.1| 0.8.2|1.14.0|
-|acpt-pytorch-1.12-cuda11.6|Ubuntu 20.04|cu116|3.8|1.12.1|1.14.1|0.8.2| 1.14.0|
-|acpt-pytorch-1.11-cuda11.5|Ubuntu 20.04|cu115|3.8|1.11.0|1.11.1|0.7.3| 1.11.0| 
-|acpt-pytorch-1.11-cuda11.3|Ubuntu 20.04|cu113|3.8|1.11.0|1.14.1|0.8.2| 1.14.0| 
-
+|acpt-pytorch-1.13-cuda11.7|Ubuntu 20.04|cu117|3.8|1.13.1|1.14.1|0.8.2|1.14.0|
+|acpt-pytorch-1.12-py39-cuda11.6|Ubuntu 20.04|cu116|3.9|1.12.1|1.14.1|0.8.2|1.14.0|
+|acpt-pytorch-1.12-cuda11.6|Ubuntu 20.04|cu116|3.8|1.12.1|1.14.1|0.8.2|1.14.0|
+|acpt-pytorch-1.11-cuda11.5|Ubuntu 20.04|cu115|3.8|1.11.0|1.11.1|0.7.3|1.11.0|
+|acpt-pytorch-1.11-cuda11.3|Ubuntu 20.04|cu113|3.8|1.11.0|1.14.1|0.8.2|1.14.0|
 
 Other packages like fairscale, horovod, msccl, protobuf, pyspark, pytest, pytorch-lightning, tensorboard, NebulaML, torchvision, torchmetrics to support all training needs
 
+To learn more, see [Create custom ACPT curated environments](how-to-azure-container-for-pytorch-environment.md).
+
 [!NOTE]
 > Currently, due to underlying cuda and cluster incompatibilities, on [NC series](../virtual-machines/nc-series.md) only acpt-pytorch-1.11-cuda11.3 with cuda 11.3 and torch 1.11 can be used.

articles/machine-learning/resource-curated-environments.md

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ This article lists the curated environments with latest framework versions in Az
 **Name**: AzureML-ACPT-pytorch-1.12-py39-cuda11.6-gpu
 **Description**: The Azure Curated Environment for PyTorch is our latest PyTorch curated environment. It is optimized for large, distributed deep learning workloads and comes pre-packaged with the best of Microsoft technologies for accelerated training, e.g., OnnxRuntime Training (ORT), DeepSpeed, MSCCL, etc.
 
-The learn more, see [Azure Container for PyTorch (ACPT)](reference-azure-container-for-pytorch.md).
+The learn more, see [Azure Container for PyTorch (ACPT)](resource-azure-container-for-pytorch.md).
 
 [!NOTE]
 > Currently, due to underlying cuda and cluster incompatibilities, on [NC series](../virtual-machines/nc-series.md) only acpt-pytorch-1.11-cuda11.3 with cuda 11.3 and torch 1.11 can be used.
