Skip to content

Commit ba7e9f0

Browse files
authored
Merge pull request #126280 from sdgilley/sdg-setup-computes
Refactor Use compute targets for model training
2 parents 4b69d1d + ec6fd04 commit ba7e9f0

32 files changed

+888
-753
lines changed

articles/machine-learning/concept-azure-machine-learning-architecture.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -103,7 +103,7 @@ A run configuration is a set of instructions that defines how a script should be
103103

104104
A run configuration can be persisted into a file inside the directory that contains your training script. Or it can be constructed as an in-memory object and used to submit a run.
105105

106-
For example run configurations, see [Select and use a compute target to train your model](how-to-set-up-training-targets.md).
106+
For example run configurations, see [Use a compute target to train your model](how-to-set-up-training-targets.md).
107107

108108
### Estimators
109109

articles/machine-learning/concept-compute-instance.md

Lines changed: 3 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -20,7 +20,7 @@ Compute instances make it easy to get started with Azure Machine Learning develo
2020

2121
Use a compute instance as your fully configured and managed development environment in the cloud for machine learning. They can also be used as a compute target for training and inferencing for development and testing purposes.
2222

23-
For production grade model training use an [Azure Machine Learning compute cluster](how-to-set-up-training-targets.md#amlcompute) with multi-node scaling capabilities. For production grade model deployment, use [Azure Kubernetes Service cluster](how-to-deploy-azure-kubernetes-service.md).
23+
For production grade model training use an [Azure Machine Learning compute cluster](how-to-create-attach-compute-sdk.md#amlcompute) with multi-node scaling capabilities. For production grade model deployment, use [Azure Kubernetes Service cluster](how-to-deploy-azure-kubernetes-service.md).
2424

2525
## Why use a compute instance?
2626

@@ -135,18 +135,7 @@ These actions can be controlled by RBAC:
135135

136136
### <a name="create"></a>Create a compute instance
137137

138-
In your workspace in Azure Machine Learning studio, create a new compute instance from either the **Compute** section or in the **Notebooks** section when you are ready to run one of your notebooks.
139-
140-
:::image type="content" source="media/concept-compute-instance/create-compute-instance.png" alt-text="Create a new compute instance":::
141-
142-
143-
|Field |Description |
144-
|---------|---------|
145-
|Compute name | <li>Name is required and must be between 3 to 24 characters long.</li><li>Valid characters are upper and lower case letters, digits, and the **-** character.</li><li>Name must start with a letter</li><li>Name needs to be unique across all existing computes within an Azure region. You will see an alert if the name you choose is not unique</li><li>If **-** character is used, then it needs to be followed by at least one letter later in the name</li> |
146-
|Virtual machine type | Choose CPU or GPU. This type cannot be changed after creation |
147-
|Virtual machine size | Supported virtual machine sizes might be restricted in your region. Check the [availability list](https://azure.microsoft.com/global-infrastructure/services/?products=virtual-machines) |
148-
|Enable/disable SSH access | SSH access is disabled by default. SSH access cannot be. changed after creation. Make sure to enable access if you plan to debug interactively with [VS Code Remote](how-to-set-up-vs-code-remote.md) |
149-
|Advanced settings | Optional. Configure a virtual network. Specify the **Resource group**, **Virtual network**, and **Subnet** to create the compute instance inside an Azure Virtual Network (vnet). For more information, see these [network requirements](how-to-enable-virtual-network.md#compute-instance) for vnet . |
138+
In your workspace in Azure Machine Learning studio, [create a new compute instance](how-to-create-attach-compute-studio.md#compute-instance) from either the **Compute** section or in the **Notebooks** section when you are ready to run one of your notebooks.
150139

151140
You can also create an instance
152141
* Directly from the [integrated notebooks experience](tutorial-1st-experiment-sdk-setup.md#azure)
@@ -155,7 +144,7 @@ You can also create an instance
155144
* With Azure Machine Learning SDK
156145
* From the [CLI extension for Azure Machine Learning](reference-azure-machine-learning-cli.md#computeinstance)
157146

158-
The dedicated cores per region per VM family quota and total regional quota, which applies to compute instance creation. is unified and shared with Azure Machine Learning training compute cluster quota. Stopping the compute instance does not release quota to ensure you will be able to restart the compute instance.
147+
The dedicated cores per region per VM family quota and total regional quota, which applies to compute instance creation, is unified and shared with Azure Machine Learning training compute cluster quota. Stopping the compute instance does not release quota to ensure you will be able to restart the compute instance.
159148

160149
## Compute target
161150

articles/machine-learning/concept-compute-target.md

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -29,7 +29,7 @@ Azure Machine Learning has varying support across different compute resources.
2929

3030
[!INCLUDE [aml-compute-target-train](../../includes/aml-compute-target-train.md)]
3131

32-
Learn more about [setting up and using a compute target for model training](how-to-set-up-training-targets.md).
32+
Learn more about [using a compute target for model training](how-to-set-up-training-targets.md).
3333

3434
## <a name="deploy"></a>Deployment targets
3535

@@ -42,10 +42,10 @@ Learn [where and how to deploy your model to a compute target](how-to-deploy-and
4242
<a name="amlcompute"></a>
4343
## Azure Machine Learning compute (managed)
4444

45-
A managed compute resource is created and managed by Azure Machine Learning. This compute is optimized for machine learning workloads. Azure Machine Learning compute clusters and [compute instances](concept-compute-instance.md) are the only managed computes. Additional managed compute resources may be added in the future.
45+
A managed compute resource is created and managed by Azure Machine Learning. This compute is optimized for machine learning workloads. Azure Machine Learning compute clusters and [compute instances](concept-compute-instance.md) are the only managed computes.
4646

4747
You can create Azure Machine Learning compute instances or compute clusters from:
48-
* Azure Machine Learning studio
48+
* [Azure Machine Learning studio](how-to-create-attach-compute-studio.md)
4949
* Azure portal
5050
* Python SDK [ComputeInstance](https://docs.microsoft.com/python/api/azureml-core/azureml.core.compute.computeinstance(class)?view=azure-ml-py) and [AmlCompute](https://docs.microsoft.com/python/api/azureml-core/azureml.core.compute.amlcompute(class)?view=azure-ml-py) classes
5151
* [R SDK](https://azure.github.io/azureml-sdk-for-r/reference/index.html#section-compute-targets) (preview)
@@ -64,7 +64,7 @@ When created these compute resources are automatically part of your workspace, u
6464

6565

6666
> [!NOTE]
67-
> When a compute cluster is idle, it autoscales to 0 nodes, so you don't pay when it's not in use. A compute *instance*, however, is always on and does not autoscale. You should [stop the compute instance](tutorial-1st-experiment-sdk-train.md#stop-the-compute-instance) when you are not using it to avoid extra cost.
67+
> When a compute cluster is idle, it autoscales to 0 nodes, so you don't pay when it's not in use. A compute *instance*, however, is always on and does not autoscale. You should [stop the compute instance](tutorial-1st-experiment-sdk-train.md#stop-the-compute-instance) when you are not using it to avoid extra cost.
6868
6969
### Supported VM series and sizes
7070

@@ -103,5 +103,5 @@ An unmanaged compute target is *not* managed by Azure Machine Learning. You crea
103103
## Next steps
104104

105105
Learn how to:
106-
* [Set up a compute target to train your model](how-to-set-up-training-targets.md)
106+
* [Use a compute target to train your model](how-to-set-up-training-targets.md)
107107
* [Deploy your model to a compute target](how-to-deploy-and-where.md)

articles/machine-learning/concept-distributed-training.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -45,7 +45,7 @@ In model parallelism, worker nodes only need to synchronize the shared parameter
4545

4646
## Next steps
4747

48-
* Learn how to [set up training environments](how-to-set-up-training-targets.md) with the Python SDK.
48+
* Learn how to [use compute targets for model training](how-to-set-up-training-targets.md) with the Python SDK.
4949
* For a technical example, see the [reference architecture scenario](https://docs.microsoft.com/azure/architecture/reference-architectures/ai/training-deep-learning).
5050
* [Train ML models with TensorFlow](how-to-train-tensorflow.md).
5151
* [Train ML models with PyTorch](how-to-train-pytorch.md).

articles/machine-learning/concept-plan-manage-cost.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -64,7 +64,7 @@ With constantly changing data, you need fast and streamlined model training and
6464

6565
Azure Machine Learning users can use the managed Azure Machine Learning compute cluster, also called AmlCompute. AmlCompute supports a variety of GPU and CPU options. The AmlCompute is internally hosted on behalf of your subscription by Azure Machine Learning. It provides the same enterprise grade security, compliance and governance at Azure IaaS cloud scale.
6666

67-
Because these compute pools are inside of Azure's IaaS infrastructure, you can deploy, scale, and manage your training with the same security and compliance requirements as the rest of your infrastructure. These deployments occur in your subscription and obey your governance rules. Learn more about [Azure Machine Learning Compute](how-to-set-up-training-targets.md#amlcompute).
67+
Because these compute pools are inside of Azure's IaaS infrastructure, you can deploy, scale, and manage your training with the same security and compliance requirements as the rest of your infrastructure. These deployments occur in your subscription and obey your governance rules. Learn more about [Azure Machine Learning compute](how-to-create-attach-compute-sdk.md#amlcompute).
6868

6969
## Configure training clusters for autoscaling
7070

@@ -122,4 +122,4 @@ Azure Machine Learning Compute supports reserved instances inherently. If you pu
122122
Learn more about:
123123
* [Manage and increase resource quotas](how-to-manage-quotas.md)
124124
* [Managing costs with cost analysis](../cost-management-billing/costs/quick-acm-cost-analysis.md).
125-
* [Azure Machine Learning compute](how-to-set-up-training-targets.md#amlcompute).
125+
* Create Azure Machine Learning compute with [SDK](how-to-create-attach-compute-sdk.md#amlcompute) or in [studio](how-to-create-attach-compute-studio.md#amlcompute).

articles/machine-learning/concept-train-machine-learning-model.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -50,7 +50,7 @@ You may start with a run configuration for your local computer, and then switch
5050
* [What is a run configuration?](concept-azure-machine-learning-architecture.md#run-configurations)
5151
* [Tutorial: Train your first ML model](tutorial-1st-experiment-sdk-train.md)
5252
* [Examples: Jupyter Notebook examples of training models](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/training)
53-
* [How to: Set up and use compute targets for model training](how-to-set-up-training-targets.md)
53+
* [How to: Use compute targets for model training](how-to-set-up-training-targets.md)
5454

5555
### Automated Machine Learning
5656

@@ -155,4 +155,4 @@ You can use the VS Code extension to run and manage your training jobs. See the
155155

156156
## Next steps
157157

158-
Learn how to [Set up training environments](how-to-set-up-training-targets.md).
158+
Learn how to [Use compute targets for model training](how-to-set-up-training-targets.md).

articles/machine-learning/concept-train-model-git-integration.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -110,4 +110,4 @@ For more information, see the [az ml run](https://docs.microsoft.com/cli/azure/e
110110

111111
## Next steps
112112

113-
* [Set up and use compute targets for model training](how-to-set-up-training-targets.md)
113+
* [Use compute targets for model training](how-to-set-up-training-targets.md)

articles/machine-learning/data-science-virtual-machine/dsvm-pools.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@ ms.date: 12/10/2018
1717

1818
In this article, you'll learn how to create a shared pool of Data Science Virtual Machines (DSVMs) for a team. The benefits of using a shared pool include better resource utilization, easier sharing and collaboration, and more effective management of DSVM resources.
1919

20-
You can use many methods and technologies to create a pool of DSVMs. This article focuses on pools for interactive virtual machines (VMs). An alternative managed compute infrastructure is Azure Machine Learning Compute. For more information, see [Set up compute targets](../how-to-set-up-training-targets.md#amlcompute).
20+
You can use many methods and technologies to create a pool of DSVMs. This article focuses on pools for interactive virtual machines (VMs). An alternative managed compute infrastructure is Azure Machine Learning Compute. For more information, see [Create compute targets with Python SDK](../how-to-create-attach-compute-sdk.md).
2121

2222
## Interactive VM pool
2323

articles/machine-learning/how-to-auto-train-forecast.md

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -252,7 +252,6 @@ To enable DNN for an AutoML experiment created in the Azure Machine Learning stu
252252

253253
Automated ML provides users with both native time-series and deep learning models as part of the recommendation system.
254254

255-
256255
Models| Description | Benefits
257256
----|----|---
258257
Prophet (Preview)|Prophet works best with time series that have strong seasonal effects and several seasons of historical data. To leverage this model, install it locally using `pip install fbprophet`. | Accurate & fast, robust to outliers, missing data, and dramatic changes in your time series.

articles/machine-learning/how-to-configure-environment.md

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -22,7 +22,7 @@ The following table shows each development environment covered in this article,
2222

2323
| Environment | Pros | Cons |
2424
| --- | --- | --- |
25-
| [Cloud-based Azure Machine Learning compute instance (preview)](#compute-instance) | Easiest way to get started. The entire SDK is already installed in your workspace VM, and notebook tutorials are pre-cloned and ready to run. | Lack of control over your development environment and dependencies. Additional cost incurred for Linux VM (VM can be stopped when not in use to avoid charges). See [pricing details](https://azure.microsoft.com/pricing/details/virtual-machines/linux/). |
25+
| [Cloud-based Azure Machine Learning compute instance](#compute-instance) | Easiest way to get started. The entire SDK is already installed in your workspace VM, and notebook tutorials are pre-cloned and ready to run. | Lack of control over your development environment and dependencies. Additional cost incurred for Linux VM (VM can be stopped when not in use to avoid charges). See [pricing details](https://azure.microsoft.com/pricing/details/virtual-machines/linux/). |
2626
| [Local environment](#local) | Full control of your development environment and dependencies. Run with any build tool, environment, or IDE of your choice. | Takes longer to get started. Necessary SDK packages must be installed, and an environment must also be installed if you don't already have one. |
2727
| [Azure Databricks](#aml-databricks) | Ideal for running large-scale intensive machine learning workflows on the scalable Apache Spark platform. | Overkill for experimental machine learning, or smaller-scale experiments and workflows. Additional cost incurred for Azure Databricks. See [pricing details](https://azure.microsoft.com/pricing/details/databricks/). |
2828
| [The Data Science Virtual Machine (DSVM)](#dsvm) | Similar to the cloud-based compute instance (Python and the SDK are pre-installed), but with additional popular data science and machine learning tools pre-installed. Easy to scale and combine with other custom tools and workflows. | A slower getting started experience compared to the cloud-based compute instance. |
@@ -50,7 +50,7 @@ To install the SDK environment for your [local computer](#local), [Jupyter Noteb
5050

5151
## <a id="compute-instance"></a>Your own cloud-based compute instance
5252

53-
The Azure Machine Learning [compute instance (preview)](concept-compute-instance.md) is a secure, cloud-based Azure workstation that provides data scientists with a Jupyter notebook server, JupyterLab, and a fully prepared ML environment.
53+
The Azure Machine Learning [compute instance](concept-compute-instance.md) is a secure, cloud-based Azure workstation that provides data scientists with a Jupyter notebook server, JupyterLab, and a fully prepared ML environment.
5454

5555
There is nothing to install or configure for a compute instance. Create one anytime from within your Azure Machine Learning workspace. Provide just a name and specify an Azure VM type. Try it now with this [Tutorial: Setup environment and workspace](tutorial-1st-experiment-sdk-setup.md).
5656

@@ -151,7 +151,7 @@ When you're using a local computer (which might also be a remote virtual machine
151151
152152
This example creates an environment using python 3.7.7, but any specific subversions can be chosen. SDK compatibility may not be guaranteed with certain major versions (3.5+ is recommended), and it's recommended to try a different version/subversion in your Anaconda environment if you run into errors. It will take several minutes to create the environment while components and packages are downloaded.
153153
154-
1. Run the following commands in your new environment to enable environment-specific IPython kernels. This will ensure expected kernel and package import behavior when working with Jupyter Notebooks within Anaconda environments:
154+
1. Run the following commands in your new environment to enable environment-specific I Python kernels. This will ensure expected kernel and package import behavior when working with Jupyter Notebooks within Anaconda environments:
155155
156156
```bash
157157
conda install notebook ipykernel
@@ -301,10 +301,10 @@ Once the cluster is running, [create a library](https://docs.databricks.com/user
301301
|SDK&nbsp;package&nbsp;extras|Source|PyPi&nbsp;Name&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|
302302
|----|---|---|
303303
|For Databricks| Upload Python Egg or PyPI | azureml-sdk[databricks]|
304-
|For Databricks -with-<br> automated ML capabilities| Upload Python Egg or PyPI | azureml-sdk[automl]|
304+
|For Databricks -with-<br> automated ML capabilities| Upload Python Egg or PyPI | `azureml-sdk[automl]`|
305305
306306
> [!Warning]
307-
> No other SDK extras can be installed. Choose only one of the preceding options [databricks] or [automl].
307+
> No other SDK extras can be installed. Choose only one of the preceding options [`databricks`] or [`automl`].
308308
309309
* Do not select **Attach automatically to all clusters**.
310310
* Select **Attach** next to your cluster name.

0 commit comments

Comments
 (0)