Commit f7ace6c

Merge pull request #231258 from lgayhardt/amlacpt0323
Azure Container for PyTorch
2 parents 3e9605b + 69036f7 commit f7ace6c

13 files changed: +155 -26 lines changed
Lines changed: 72 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,72 @@
1+
---
2+
title: How to create Azure Container for PyTorch Custom Curated environment
3+
titleSuffix: Azure Machine Learning
4+
description: Create custom curated Azure Container for PyTorch environments in Azure Machine Learning studio to run your machine learning models, and reuse them in different scenarios.
5+
services: machine-learning
6+
author: sheetalarkadam
7+
ms.author: parinitarahi
8+
ms.reviewer: ssalgado
9+
ms.service: machine-learning
10+
ms.subservice: core
11+
ms.topic: how-to
12+
ms.date: 03/20/2023
13+
---
14+
15+
# Create custom curated Azure Container for PyTorch (ACPT) environments in Azure Machine Learning studio
16+
17+
To extend a curated environment with Hugging Face (HF) transformers, datasets, or any other external packages, create a new environment in Azure Machine Learning from a Docker context that uses the ACPT curated environment as the base image and installs the additional packages on top of it. The following sections walk through the steps.
18+
19+
## Prerequisites
20+
21+
Before following the steps in this article, make sure you have the following prerequisites:
22+
23+
- An Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning](https://azure.microsoft.com/free/).
24+
25+
- An Azure Machine Learning workspace. If you don't have one, use the steps in the [Quickstart: Create workspace resources](quickstart-create-resources.md) article to create one.
26+
27+
## Navigate to environments
28+
29+
In the [Azure Machine Learning studio](https://ml.azure.com/registries/environments), navigate to the "Environments" section by selecting the "Environments" option.
30+
31+
:::image type="content" source="./media/how-to-azure-container-for-pytorch-environment/navigate-to-environments.png" alt-text="Screenshot of navigating to environments from Azure Machine Learning studio." lightbox= "./media/how-to-azure-container-for-pytorch-environment/navigate-to-environments.png":::
32+
33+
## Navigate to curated environments
34+
35+
Navigate to curated environments and search for "acpt" to list all the available ACPT curated environments. Select an environment to view its details.
36+
37+
:::image type="content" source="./media/how-to-azure-container-for-pytorch-environment/navigate-to-curated-environments.png" alt-text="Screenshot of navigating to curated environments." lightbox= "./media/how-to-azure-container-for-pytorch-environment/navigate-to-curated-environments.png":::
38+
39+
40+
## Get details of the curated environments
41+
42+
To create a custom environment, you need the base Docker image repository, which is listed in the "Description" section as "Azure Container Registry". Copy the "Azure Container Registry" name; you use it later when you create the new custom environment.
43+
44+
:::image type="content" source="./media/how-to-azure-container-for-pytorch-environment/get-details-curated-environments.png" alt-text="Screenshot of getting container registry name." lightbox= "./media/how-to-azure-container-for-pytorch-environment/get-details-curated-environments.png":::
45+
46+
## Navigate to custom environments
47+
48+
Go back and select the "Custom Environments" tab.
49+
50+
:::image type="content" source="./media/how-to-azure-container-for-pytorch-environment/navigate-to-custom-environment.png" alt-text="Screenshot of navigating to custom environments." lightbox= "./media/how-to-azure-container-for-pytorch-environment/navigate-to-custom-environment.png":::
51+
52+
## Create custom environments
53+
54+
Select **+ Create**. In the "Create Environment" window, enter a name and description for the environment, and then select "Create a new docker context" in the "Select environment type" section.
55+
56+
:::image type="content" source="./media/how-to-azure-container-for-pytorch-environment/create-environment-window.png" alt-text="Screenshot of creating custom environment." lightbox= "./media/how-to-azure-container-for-pytorch-environment/create-environment-window.png":::
57+
58+
Paste the Docker image name that you copied previously. Configure your environment by declaring the base image, and then add any environment variables you want to use and the packages that you want to include.
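
The Docker context configured in this step amounts to a short Dockerfile: a `FROM` line with the ACPT base image, optional `ENV` lines, and a `RUN pip install` line for the extra packages. The following Python sketch renders such a file so you can review it before pasting it into the studio. The helper name `render_dockerfile`, the placeholder image reference, and the `EXAMPLE_SETTING` variable are hypothetical and used only for illustration. Replace the image reference with the "Azure Container Registry" value that you copied from the curated environment.

```python
from typing import Mapping, Optional, Sequence


def render_dockerfile(
    base_image: str,
    pip_packages: Sequence[str] = (),
    env_vars: Optional[Mapping[str, str]] = None,
) -> str:
    """Return Dockerfile text that layers extra packages on a base image.

    Hypothetical helper for illustration; it is not part of any Azure SDK.
    """
    if not base_image.strip():
        raise ValueError("base_image must not be empty")

    # The curated ACPT image is the starting point for the custom environment.
    lines = [f"FROM {base_image.strip()}"]

    # One ENV instruction per variable keeps the file easy to read and diff.
    for name, value in (env_vars or {}).items():
        lines.append(f'ENV {name}="{value}"')

    # A single RUN instruction installs all additional packages in one layer.
    if pip_packages:
        lines.append("RUN pip install --no-cache-dir " + " ".join(pip_packages))

    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder image reference (an assumption): paste the value copied
    # from the curated environment's "Description" section instead.
    print(
        render_dockerfile(
            "<registry>/<acpt-image>:<tag>",
            pip_packages=["transformers", "datasets"],
            env_vars={"EXAMPLE_SETTING": "1"},
        )
    )
```

Keeping all packages in one `RUN` instruction is a design choice that produces a single image layer. Split the installs across several instructions if you prefer finer-grained build caching.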
59+
60+
:::image type="content" source="./media/how-to-azure-container-for-pytorch-environment/configure-environment.png" alt-text="Screenshot of configuring the environment with name, packages with docker context." lightbox= "./media/how-to-azure-container-for-pytorch-environment/configure-environment.png":::
61+
62+
Review your environment settings, add any tags if needed, and select **Create** to create your custom environment.
63+
64+
That's it! You've now created a custom environment in Azure Machine Learning studio and can use it to run your machine learning models.
65+
66+
## Next steps
67+
68+
- Learn more about environment objects:
69+
- [What are Azure Machine Learning environments?](concept-environments.md)
70+
- Learn more about [curated environments](concept-environments.md).
71+
- Learn more about [training models in Azure Machine Learning](concept-train-machine-learning-model.md).
72+
- [Azure Container for PyTorch (ACPT) reference](resource-azure-container-for-pytorch.md)

articles/machine-learning/how-to-train-distributed-gpu.md

Lines changed: 2 additions & 2 deletions
@@ -121,14 +121,14 @@ Azure Machine Learning will set the `MASTER_ADDR`, `MASTER_PORT`, `WORLD_SIZE`,
121121

122122
## DeepSpeed
123123

124-
[DeepSpeed](https://www.deepspeed.ai/tutorials/azure/) is supported as a first-class citizen within Azure Machine Learning to run distributed jobs with near linear scalabibility in terms of 
124+
[DeepSpeed](https://www.deepspeed.ai/tutorials/azure/) is supported as a first-class citizen within Azure Machine Learning to run distributed jobs with near linear scalability in terms of 
125125

126126
* Increase in model size
127127
* Increase in number of GPUs
128128

129129
`DeepSpeed` can be enabled using either Pytorch distribution or MPI for running distributed training. Azure Machine Learning supports the `DeepSpeed` launcher to launch distributed training as well as autotuning to get optimal `ds` configuration.
130130

131-
You can use a [curated environment](resource-curated-environments.md#azure-container-for-pytorch-acpt-preview) for an out of the box environment with the latest state of art technologies including `DeepSpeed`, `ORT`, `MSSCCL`, and `Pytorch` for your DeepSpeed training jobs.
131+
You can use a [curated environment](resource-curated-environments.md#azure-container-for-pytorch-acpt) for an out of the box environment with the latest state of art technologies including `DeepSpeed`, `ORT`, `MSSCCL`, and `Pytorch` for your DeepSpeed training jobs.
132132

133133
### DeepSpeed example
134134

articles/machine-learning/reference-checkpoint-performance-for-large-models.md

Lines changed: 2 additions & 2 deletions
@@ -78,7 +78,7 @@ Nebula can
7878
* An Azure subscription and an Azure Machine Learning workspace. See [Create workspace resources](./quickstart-create-resources.md) for more information about workspace resource creation
7979
* An Azure Machine Learning compute target. See [Manage training & deploy computes](./how-to-create-attach-compute-studio.md) to learn more about compute target creation
8080
* A training script that uses **PyTorch**.
81-
* ACPT-curated (Azure Container for Pytorch) environment. See [Curated environments](resource-curated-environments.md#azure-container-for-pytorch-acpt-preview) to obtain the ACPT image. Learn how to use the curated environment [here](./how-to-use-environments.md)
81+
* ACPT-curated (Azure Container for Pytorch) environment. See [Curated environments](resource-curated-environments.md#azure-container-for-pytorch-acpt) to obtain the ACPT image. Learn how to [use the curated environment](./how-to-use-environments.md)
8282
* An Azure Machine Learning script run configuration file. If you don’t have one, you can follow [this resource](./how-to-set-up-training-targets.md)
8383

8484
## How to Use Nebula
@@ -90,7 +90,7 @@ Nebula use involves:
9090
- [API calls to save and load checkpoints](#call-apis-to-save-and-load-checkpoints)
9191

9292
### Using ACPT environment
93-
[Azure Container for PyTorch (ACPT)](how-to-manage-environments-v2.md?tabs=cli#curated-environments), a curated environment for PyTorch model training, includes Nebula as a preinstalled, dependent Python package. See [Azure Container for PyTorch (ACPT)](resource-curated-environments.md#azure-container-for-pytorch-acpt-preview) to view the curated environment, and [Enabling Deep Learning with Azure Container for PyTorch in Azure Machine Learning](https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/enabling-deep-learning-with-azure-container-for-pytorch-in-azure/ba-p/3650489) to learn more about the ACPT image.
93+
[Azure Container for PyTorch (ACPT)](how-to-manage-environments-v2.md?tabs=cli#curated-environments), a curated environment for PyTorch model training, includes Nebula as a preinstalled, dependent Python package. See [Azure Container for PyTorch (ACPT)](resource-curated-environments.md#azure-container-for-pytorch-acpt) to view the curated environment, and [Enabling Deep Learning with Azure Container for PyTorch in Azure Machine Learning](https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/enabling-deep-learning-with-azure-container-for-pytorch-in-azure/ba-p/3650489) to learn more about the ACPT image.
9494

9595
### Initializing Nebula
9696
