# Create an Azure Machine Learning compute cluster
@@ -58,7 +58,7 @@ Compute clusters can run jobs securely in a [virtual network environment](how-to
 
 * Some of the scenarios listed in this document are marked as __preview__. Preview functionality is provided without a service level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
 
-* Compute clusters can be created in a different region than your workspace. This functionality is in __preview__, and is only available for __compute clusters__, not compute instances. This preview is not available if you are using a private endpoint-enabled workspace.
+* Compute clusters can be created in a different region than your workspace. This functionality is in __preview__, and is only available for __compute clusters__, not compute instances. This preview isn't available if you're using a private endpoint-enabled workspace.
 
 > [!WARNING]
 > When using a compute cluster in a different region than your workspace or datastores, you may see increased network latency and data transfer costs. The latency and costs can occur when creating the cluster, and when running jobs on it.
@@ -67,7 +67,7 @@ Compute clusters can run jobs securely in a [virtual network environment](how-to
 
 * Azure Machine Learning Compute has default limits, such as the number of cores that can be allocated. For more information, see [Manage and request quotas for Azure resources](how-to-manage-quotas.md).
 
-* Azure allows you to place _locks_ on resources, so that they cannot be deleted or are read only. __Do not apply resource locks to the resource group that contains your workspace__. Applying a lock to the resource group that contains your workspace will prevent scaling operations for Azure ML compute clusters. For more information on locking resources, see [Lock resources to prevent unexpected changes](../azure-resource-manager/management/lock-resources.md).
+* Azure allows you to place _locks_ on resources, so that they can't be deleted or are read only. __Do not apply resource locks to the resource group that contains your workspace__. Applying a lock to the resource group that contains your workspace will prevent scaling operations for Azure ML compute clusters. For more information on locking resources, see [Lock resources to prevent unexpected changes](../azure-resource-manager/management/lock-resources.md).
 
 > [!TIP]
 > Clusters can generally scale up to 100 nodes as long as you have enough quota for the number of cores required. By default clusters are setup with inter-node communication enabled between the nodes of the cluster to support MPI jobs for example. However you can scale your clusters to 1000s of nodes by simply [raising a support ticket](https://portal.azure.com/#blade/Microsoft_Azure_Support/HelpAndSupportBlade/newsupportrequest), and requesting to allow list your subscription, or workspace, or a specific cluster for disabling inter-node communication.
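
The quota condition in the tip above comes down to simple arithmetic: a cluster needs its node count multiplied by the cores per node. The following is a minimal sketch in Python, not part of the changed file. The helper names (`cores_required`, `fits_quota`, `max_nodes_for_quota`) are hypothetical, and the numbers in the usage note are illustrative assumptions rather than real quota values.

```python
def cores_required(node_count: int, cores_per_node: int) -> int:
    """Total cores consumed by a cluster of `node_count` identical nodes."""
    if node_count < 0 or cores_per_node <= 0:
        raise ValueError("node_count must be >= 0 and cores_per_node must be > 0")
    return node_count * cores_per_node


def fits_quota(node_count: int, cores_per_node: int, core_quota: int) -> bool:
    """True if the cluster's total core demand stays within the core quota."""
    return cores_required(node_count, cores_per_node) <= core_quota


def max_nodes_for_quota(cores_per_node: int, core_quota: int) -> int:
    """Largest node count whose total cores do not exceed the quota."""
    if cores_per_node <= 0:
        raise ValueError("cores_per_node must be > 0")
    return core_quota // cores_per_node
```

For example, assuming a VM size with 4 cores per node and a quota of 200 cores, `fits_quota(100, 4, 200)` is `False` because 100 nodes would need 400 cores, while `max_nodes_for_quota(4, 200)` gives 50.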
@@ -86,7 +86,6 @@ The compute autoscales down to zero nodes when it isn't used. Dedicated VMs ar
 
 # [Python](#tab/python)
 
-
 To create a persistent Azure Machine Learning Compute resource in Python, specify the **vm_size** and **max_nodes** properties. Azure Machine Learning then uses smart defaults for the other properties.
 
 * **vm_size**: The VM family of the nodes created by Azure Machine Learning Compute.
@@ -120,13 +119,59 @@ Where the file *create-cluster.yml* is:
 
 # [Studio](#tab/azure-studio)
 
-For information on creating a compute cluster in the studio, see [Create compute targets in Azure Machine Learning studio](how-to-create-attach-compute-studio.md#amlcompute).
+Create a single- or multi-node compute cluster for your training, batch inferencing or reinforcement learning workloads.
+
+1. Navigate to [Azure Machine Learning studio](https://ml.azure.com).
+
+1. Under __Manage__, select __Compute__.
+1. If you have no compute resources, select **Create** in the middle of the page.
+
+:::image type="content" source="media/how-to-create-attach-studio/create-compute-target.png" alt-text="Screenshot that shows creating a compute target":::
+
+1. If you see a list of compute resources, select **+New** above the list.
+1. In the tabs at the top, select __Compute cluster__
+
+1. Fill out the form as follows:
+
+|Field |Description |
+|---------|---------|
+| Location | The Azure region where the compute cluster will be created. By default, this is the same location as the workspace. Setting the location to a different region than the workspace is in __preview__, and is only available for __compute clusters__, not compute instances.</br>When using a different region than your workspace or datastores, you may see increased network latency and data transfer costs. The latency and costs can occur when creating the cluster, and when running jobs on it. |
+|Virtual machine type| Choose CPU or GPU. This type can't be changed after creation |
+|Virtual machine priority | Choose **Dedicated** or **Low priority**. Low priority virtual machines are cheaper but don't guarantee the compute nodes. Your job may be preempted.
+|Virtual machine size | Supported virtual machine sizes might be restricted in your region. Check the [availability list](https://azure.microsoft.com/global-infrastructure/services/?products=virtual-machines) |
+
+1. Select **Next** to proceed to **Advanced Settings** and fill out the form as follows:
+
+|Field |Description |
+|---------|---------|
+|Compute name |* Name is required and must be between 3 to 24 characters long.<br><br>* Valid characters are upper and lower case letters, digits, and the **-** character.<br><br>* Name must start with a letter<br><br>* Name needs to be unique across all existing computes within an Azure region. You'll see an alert if the name you choose isn't unique<br><br>* If **-** character is used, then it needs to be followed by at least one letter later in the name |
+|Minimum number of nodes | Minimum number of nodes that you want to provision. If you want a dedicated number of nodes, set that count here. Save money by setting the minimum to 0, so you won't pay for any nodes when the cluster is idle. |
+|Maximum number of nodes | Maximum number of nodes that you want to provision. The compute will autoscale to a maximum of this node count when a job is submitted. |
+| Idle seconds before scale down | Idle time before scaling the cluster down to the minimum node count. |
+| Enable SSH access | Use the same instructions as [Enable SSH access](#enable-ssh-access) for a compute instance (above). |
+|Advanced settings | Optional. Configure a virtual network. Specify the **Resource group**, **Virtual network**, and **Subnet** to create the compute instance inside an Azure Virtual Network (vnet). For more information, see these [network requirements](./how-to-secure-training-vnet.md) for vnet. Also attach [managed identities](#set-up-managed-identity) to grant access to resources.
+
+1. Select __Create__.
+
+
+### Enable SSH access
+
+SSH access is disabled by default. SSH access can't be changed after creation. Make sure to enable access if you plan to debug interactively with [VS Code Remote](how-to-set-up-vs-code-remote.md).
-## <a id="low-pri-vm"></a> Lower your compute cluster cost
+## Lower your compute cluster cost
 
-You may also choose to use [low-priority VMs](how-to-manage-optimize-cost.md#low-pri-vm) to run some or all of your workloads. These VMs do not have guaranteed availability and may be preempted while in use. You will have to restart a preempted job.
+You may also choose to use [low-priority VMs](how-to-manage-optimize-cost.md#low-pri-vm) to run some or all of your workloads. These VMs don't have guaranteed availability and may be preempted while in use. You'll have to restart a preempted job.
 
 Use any of these ways to specify a low-priority VM:
 
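
The **Compute name** rules in the Studio form added by this hunk can be checked locally before submitting the form. The following is a minimal sketch in Python, not part of the changed file. The function name `is_valid_compute_name` is hypothetical. It reads the hyphen rule as "at least one letter must appear somewhere after the last hyphen", which is an assumption about the wording. The uniqueness rule is not covered, because it requires a lookup against the Azure region.

```python
import re

# A leading letter followed by 2 to 23 letters, digits, or hyphens
# gives a total length of 3 to 24 characters.
_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9-]{2,23}")


def is_valid_compute_name(name: str) -> bool:
    """Check the length, character set, first-letter, and hyphen rules."""
    if not _NAME_PATTERN.fullmatch(name):
        return False
    last_hyphen = name.rfind("-")
    if last_hyphen != -1:
        # Assumed reading of the rule: a letter must follow the last hyphen.
        tail = name[last_hyphen + 1:]
        if not any(ch.isalpha() for ch in tail):
            return False
    return True
```

Under these assumptions, `is_valid_compute_name("cpu-cluster")` passes, while `"1cluster"` (starts with a digit), `"gpu_cluster"` (underscore), and `"cluster-1"` (no letter after the hyphen) are rejected.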
@@ -249,7 +294,7 @@ To update an existing cluster:
 
 # [Studio](#tab/azure-studio)
 
-See [Set up managed identity in studio](how-to-create-attach-compute-studio.md#managed-identity).
+During cluster creation or when editing compute cluster details, in the **Advanced settings**, toggle **Assign a managed identity** and specify a system-assigned identity or user-assigned identity.
 
 ---
 
@@ -261,7 +306,7 @@ See [Set up managed identity in studio](how-to-create-attach-compute-studio.md#m
 
 ## Troubleshooting
 
-There is a chance that some users who created their Azure Machine Learning workspace from the Azure portal before the GA release might not be able to create AmlCompute in that workspace. You can either raise a support request against the service or create a new workspace through the portal or the SDK to unblock yourself immediately.
+There's a chance that some users who created their Azure Machine Learning workspace from the Azure portal before the GA release might not be able to create AmlCompute in that workspace. You can either raise a support request against the service or create a new workspace through the portal or the SDK to unblock yourself immediately.