You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
# Create and manage instance types for efficient utilization of compute resources
16
16
17
-
Instance types are an Azure Machine Learning concept that allows targeting certain types of compute nodes for training and inference workloads. For an Azure virtual machine (VM), an example of an instance type is `STANDARD_D2_V3`.
17
+
Instance types are an Azure Machine Learning concept that allows targeting certain types of compute nodes for training and inference workloads. For an Azure virtual machine, an example of an instance type is `STANDARD_D2_V3`.
18
18
19
19
In Kubernetes clusters, instance types are represented in a custom resource definition (CRD) that's installed with the Azure Machine Learning extension. Two elements in the Azure Machine Learning extension represent the instance types:
20
20
@@ -24,7 +24,7 @@ In Kubernetes clusters, instance types are represented in a custom resource defi
24
24
If you [specify a nodeSelector field when deploying the Azure Machine Learning extension](./how-to-deploy-kubernetes-extension.md#review-azure-machine-learning-extension-configuration-settings), the `nodeSelector` field will be applied to all instance types. This means that:
25
25
26
26
- For each instance type that you create, the specified `nodeSelector` field should be a subset of the extension-specified `nodeSelector` field.
27
-
- If you use an instance type with `nodeSelector`, the workload will run on any node that matches both the extension-specified `nodeSelector` field and the instancetype-specified `nodeSelector` field.
27
+
- If you use an instance type with `nodeSelector`, the workload will run on any node that matches both the extension-specified `nodeSelector` field and the instance-type-specified `nodeSelector` field.
28
28
- If you use an instance type without a `nodeSelector` field, the workload will run on any node that matches the extension-specified `nodeSelector` field.
29
29
30
30
## Create a default instance type
@@ -44,12 +44,12 @@ resources:
44
44
45
45
If you don't apply a `nodeSelector` field, the pod can be scheduled on any node. The workload's pods are assigned default resources with 0.1 CPU cores, 2 GB of memory, and 0 GPUs for the request. The resources that the workload's pods use are limited to 2 CPU cores and 8 GB of memory.
46
46
47
-
The default instance type purposefully uses few resources. To ensure that all machine learning workloads run with appropriate resources (for example, GPU resource), we highly recommend that you create custom instance types.
47
+
The default instance type purposefully uses few resources. To ensure that all machine learning workloads run with appropriate resources (for example, GPU resource), we highly recommend that you [create custom instance types](#create-a-custom-instance-type).
48
48
49
49
Keep in mind the following points about the default instance type:
50
50
51
51
- `defaultinstancetype`doesn't appear as an `InstanceType` custom resource in the cluster when you're running the command ```kubectl get instancetype```, but it does appear in all clients (UI, Azure CLI, SDK).
52
-
- `defaultinstancetype`can be overridden with a [custom instance type](#create-a-custom-instance-type) definition that has the same name.
52
+
- `defaultinstancetype`can be overridden with the definition of a custominstance type that has the same name.
53
53
54
54
## Create a custom instance type
55
55
@@ -94,7 +94,7 @@ Creation of custom instance types must meet the following parameters and definit
94
94
| `Memory request` | Required | String values, which can't be zero or empty. <br>You can specify the memory as a full number + suffix; for example, `1024Mi` for 1,024 mebibytes (MiB).|
95
95
| `CPU limit` | Required | String values, which can't be zero or empty. <br>You can specify the CPU in millicores; for example, `100m`. You can also specify it as full numbers. For example, `"1"` is equivalent to `1000m`.|
96
96
| `Memory limit` | Required | String values, which can't be zero or empty. <br>You can specify the memory as a full number + suffix; for example, `1024Mi` for 1024 MiB.|
97
-
| `GPU` | Optional | Integer values, which can only be specified in the `limits` section. <br>For more information, see the [Kubernetes documentation](https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#using-device-plugins). |
97
+
| `GPU` | Optional | Integer values, which can be specified only in the `limits` section. <br>For more information, see the [Kubernetes documentation](https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#using-device-plugins). |
98
98
| `nodeSelector` | Optional | Map of string keys and values. |
99
99
100
100
It's also possible to create multiple instance types at once:
@@ -142,7 +142,7 @@ If you submit a training or inference workload without an instance type, it uses
To select an instance type for a training job by using the SDK (V2), specify its name for the `instance_type` property in the `command` class. For example:
159
+
To select an instance type for a training job by using the SDK (v2), specify its name for the `instance_type` property in the `command` class. For example:
In the preceding example, replace `<Kubernetes-compute_target_name>` with the name of your Kubernetes compute target. Replace `<instance_type_name>` with the name of the instance type that you want to select. If you don't specify an `instance_type` property, the system uses `defaultinstancetype` to submit the job.
175
+
In the preceding example, replace `<Kubernetes-compute_target_name>` with the name of your Kubernetes compute target. Replace `<instance type name>` with the name of the instance type that you want to select. If you don't specify an `instance_type` property, the system uses `defaultinstancetype` to submit the job.
To select an instance type for a model deployment by using the Azure CLI (V2), specify its name for the `instance_type` property in the deployment YAML. For example:
181
+
To select an instance type for a model deployment by using the Azure CLI (v2), specify its name for the `instance_type` property in the deployment YAML. For example:
To select an instance type for a model deployment by using the SDK (V2), specify its name for the `instance_type` property in the `KubernetesOnlineDeployment` class. For example:
200
+
To select an instance type for a model deployment by using the SDK (v2), specify its name for the `instance_type` property in the `KubernetesOnlineDeployment` class. For example:
201
201
202
202
```python
203
203
from azure.ai.ml import KubernetesOnlineDeployment,Model,Environment,CodeConfiguration
In the preceding example, replace `<instance_type_name>` with the name of the instance type that you want to select. If you don't specify an `instance_type` property, the system uses `defaultinstancetype` to deploy the model.
227
+
In the preceding example, replace `<instance type name>` with the name of the instance type that you want to select. If you don't specify an `instance_type` property, the system uses `defaultinstancetype` to deploy the model.
228
228
229
229
> [!IMPORTANT]
230
230
> For MLflow model deployment, the resource request requires at least 2 CPU cores and 4 GB of memory. Otherwise, the deployment will fail.
0 commit comments