Merge pull request #6175 from s-polly/stp-k8s

prmerger-automator[bot] · web-flow · commit d99509d19f9c · 2025-07-23T17:20:55.000Z
Freshness check, k8s instance types
diff --git a/articles/machine-learning/how-to-manage-kubernetes-instance-types.md b/articles/machine-learning/how-to-manage-kubernetes-instance-types.md
@@ -1,31 +1,31 @@
 ---
 title: Create and manage instance types for efficient utilization of compute resources
-description: Learn about what instance types are, how to create and manage them, and what the benefits of using them are.
+description: Learn what instance types are, how to create and manage them, and the benefits of using them.
 titleSuffix: Azure Machine Learning
 author: s-polly
 ms.author: scottpolly
-ms.reviewer: bozhlin
+ms.reviewer: namanjoshi
 ms.service: azure-machine-learning
 ms.subservice: core
-ms.date: 01/09/2024
+ms.date: 07/23/2025
 ms.topic: how-to
 ms.custom: build-spring-2022, cliv2, sdkv2
 ---
 
 # Create and manage instance types for efficient utilization of compute resources
 
-Instance types are an Azure Machine Learning concept that allows targeting certain types of compute nodes for training and inference workloads. For example, in an Azure virtual machine, an instance type is `STANDARD_D2_V3`. This article teaches you how to create and manage instance types for your computation requirements. 
+Instance types are an Azure Machine Learning concept that allows targeting certain types of compute nodes for training and inference workloads. For example, in an Azure virtual machine, an instance type is `STANDARD_D2_V3`. This article shows you how to create and manage instance types for your computation requirements. 
 
-In Kubernetes clusters, instance types are represented in a custom resource definition (CRD) that's installed with the Azure Machine Learning extension. Two elements in the Azure Machine Learning extension represent the instance types:
+In Kubernetes clusters, instance types are represented as a custom resource definition (CRD) installed with the Azure Machine Learning extension. Two elements in the Azure Machine Learning extension represent instance types:
 
-- Use [nodeSelector](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector) to specify which node a pod should run on. The node must have a corresponding label.
-- In the [resources](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/) section, you can set the compute resources (CPU, memory, and NVIDIA GPU) for the pod.
+- **nodeSelector**: Use [nodeSelector](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector) to specify which node a pod should run on. The node must have a corresponding label.
+- **resources**: In the [resources](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/) section, you can set the compute resources (CPU, memory, and NVIDIA GPU) for the pod.
 
-If you [specify a nodeSelector field when deploying the Azure Machine Learning extension](./how-to-deploy-kubernetes-extension.md#review-azure-machine-learning-extension-configuration-settings), the `nodeSelector` field will be applied to all instance types. This means that:
+If you [specify a nodeSelector field when deploying the Azure Machine Learning extension](./how-to-deploy-kubernetes-extension.md#review-azure-machine-learning-extension-configuration-settings), the `nodeSelector` field applies to all instance types. This means:
 
 - For each instance type that you create, the specified `nodeSelector` field should be a subset of the extension-specified `nodeSelector` field.
-- If you use an instance type with `nodeSelector`, the workload will run on any node that matches both the extension-specified `nodeSelector` field and the instance-type-specified `nodeSelector` field.
-- If you use an instance type without a `nodeSelector` field, the workload will run on any node that matches the extension-specified `nodeSelector` field.
+- If you use an instance type with `nodeSelector`, the workload runs on any node that matches both the extension-specified `nodeSelector` field and the instance-type-specified `nodeSelector` field.
+- If you use an instance type without a `nodeSelector` field, the workload runs on any node that matches the extension-specified `nodeSelector` field.
 
 ## Create a default instance type
 
@@ -44,11 +44,11 @@ resources:
 
 If you don't apply a `nodeSelector` field, the pod can be scheduled on any node. The workload's pods are assigned default resources with 0.1 CPU cores, 2 GB of memory, and 0 GPUs for the request. The resources that the workload's pods use are limited to 2 CPU cores and 8 GB of memory.
 
-The default instance type purposefully uses few resources. To ensure that all machine learning workloads run with appropriate resources (for example, GPU resource), we highly recommend that you [create custom instance types](#create-a-custom-instance-type).
+The default instance type purposefully uses minimal resources. To ensure that all machine learning workloads run with appropriate resources (for example, GPU resources), we highly recommend that you [create custom instance types](#create-a-custom-instance-type).
 
 Keep in mind the following points about the default instance type:
 
-- `defaultinstancetype` doesn't appear as an `InstanceType` custom resource in the cluster when you're running the command ```kubectl get instancetype```, but it does appear in all clients (UI, Azure CLI, SDK).
+- `defaultinstancetype` doesn't appear as an `InstanceType` custom resource in the cluster when you run the command `kubectl get instancetype`, but it does appear in all clients (UI, Azure CLI, SDK).
 - `defaultinstancetype` can be overridden with the definition of a custom instance type that has the same name.
 
 ## Create a custom instance type
@@ -79,25 +79,25 @@ spec:
       memory: "1500Mi"
 ```
 
-The preceding code creates an instance type with the labeled behavior:
+The preceding code creates an instance type with the following behavior:
 
 - Pods are scheduled only on nodes that have the label `mylabel: mylabelvalue`.
 - Pods are assigned resource requests of `700m` for CPU and `1500Mi` for memory.
 - Pods are assigned resource limits of `1` for CPU, `2Gi` for memory, and `1` for NVIDIA GPU.
 
-Creation of custom instance types must meet the following parameters and definition rules, or it fails:
+Custom instance type creation must meet the following parameters and definition rules, or it fails:
 
 | Parameter | Required or optional | Description |
 | --- | --- | --- |
-| `name` | Required | String values, which must be unique in a cluster.|
-| `CPU request` | Required | String values, which can't be zero or empty. <br>You can specify the CPU in millicores; for example, `100m`. You can also specify it as full numbers. For example, `"1"` is equivalent to `1000m`.|
-| `Memory request` | Required | String values, which can't be zero or empty. <br>You can specify the memory as a full number + suffix; for example, `1024Mi` for 1,024 mebibytes (MiB).|
-| `CPU limit` | Required | String values, which can't be zero or empty. <br>You can specify the CPU in millicores; for example, `100m`. You can also specify it as full numbers. For example, `"1"` is equivalent to `1000m`.|
-| `Memory limit` | Required | String values, which can't be zero or empty. <br>You can specify the memory as a full number + suffix; for example, `1024Mi` for 1024 MiB.|
-| `GPU` | Optional | Integer values, which can be specified only in the `limits` section. <br>For more information, see the [Kubernetes documentation](https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#using-device-plugins). |
+| `name` | Required | String values that must be unique in a cluster.|
+| `CPU request` | Required | String values that can't be zero or empty. <br>You can specify the CPU in millicores; for example, `100m`. You can also specify it as full numbers. For example, `"1"` is equivalent to `1000m`.|
+| `Memory request` | Required | String values that can't be zero or empty. <br>You can specify the memory as a full number + suffix; for example, `1024Mi` for 1,024 mebibytes (MiB).|
+| `CPU limit` | Required | String values that can't be zero or empty. <br>You can specify the CPU in millicores; for example, `100m`. You can also specify it as full numbers. For example, `"1"` is equivalent to `1000m`.|
+| `Memory limit` | Required | String values that can't be zero or empty. <br>You can specify the memory as a full number + suffix; for example, `1024Mi` for 1,024 MiB.|
+| `GPU` | Optional | Integer values that can be specified only in the `limits` section. <br>For more information, see the [Kubernetes documentation](https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#using-device-plugins). |
 | `nodeSelector` | Optional | Map of string keys and values. |
 
-It's also possible to create multiple instance types at once:
+You can also create multiple instance types at once:
 
 ```bash
 kubectl apply -f my_instance_type_list.yaml
@@ -142,8 +142,7 @@ If you submit a training or inference workload without an instance type, it uses
 
 ### [Azure CLI](#tab/select-instancetype-to-trainingjob-with-cli)
 
-To select an instance type for a training job by using the Azure CLI (v2), specify its name as part of the
-`resources` properties section in the job YAML. For example:
+To select an instance type for a training job using the Azure CLI (v2), specify its name as part of the `resources` properties section in the job YAML. For example:
 
 ```yaml
 command: python -c "print('Hello world!')"
@@ -156,14 +155,14 @@ resources:
 
 ### [Python SDK](#tab/select-instancetype-to-trainingjob-with-sdk)
 
-To select an instance type for a training job by using the SDK (v2), specify its name for the `instance_type` property in the `command` class. For example:
+To select an instance type for a training job using the SDK (v2), specify its name for the `instance_type` property in the `command` class. For example:
 
 ```python
 from azure.ai.ml import command
 
 # define the command
 command_job = command(
-    command="python -c "print('Hello world!')"",
+    command="python -c \"print('Hello world!')\"",
     environment="AzureML-lightgbm-3.2-ubuntu18.04-py37-cpu@latest",
     compute="<Kubernetes-compute_target_name>",
     instance_type="<instance type name>"
@@ -178,7 +177,7 @@ In the preceding example, replace `<Kubernetes-compute_target_name>` with the na
 
 ### [Azure CLI](#tab/select-instancetype-to-modeldeployment-with-cli)
 
-To select an instance type for a model deployment by using the Azure CLI (v2), specify its name for the `instance_type` property in the deployment YAML. For example:
+To select an instance type for a model deployment using the Azure CLI (v2), specify its name for the `instance_type` property in the deployment YAML. For example:
 
 ```yaml
 name: blue
@@ -197,7 +196,7 @@ environment:
 
 ### [Python SDK](#tab/select-instancetype-to-modeldeployment-with-sdk)
 
-To select an instance type for a model deployment by using the SDK (v2), specify its name for the `instance_type` property in the `KubernetesOnlineDeployment` class. For example:
+To select an instance type for a model deployment using the SDK (v2), specify its name for the `instance_type` property in the `KubernetesOnlineDeployment` class. For example:
 
 ```python
 from azure.ai.ml.entities import KubernetesOnlineDeployment, Model, Environment, CodeConfiguration
@@ -227,11 +226,11 @@ blue_deployment = KubernetesOnlineDeployment(
 In the preceding example, replace `<instance type name>` with the name of the instance type that you want to select. If you don't specify an `instance_type` property, the system uses `defaultinstancetype` to deploy the model.
 
 > [!IMPORTANT]
-> For MLflow model deployment, the resource request requires at least 2 CPU cores and 4 GB of memory. Otherwise, the deployment will fail.
+> For MLflow model deployment, the resource request requires at least 2 CPU cores and 4 GB of memory. Otherwise, the deployment fails.
 
 ### Resource section validation
 
-You can use the `resources` section to define the resource request and limit of your model deployments. For example:
+Use the `resources` section to define the resource request and limit for your model deployments. For example:
 
 #### [Azure CLI](#tab/define-resource-to-modeldeployment-with-cli)
 
@@ -297,19 +296,19 @@ blue_deployment = KubernetesOnlineDeployment(
 
 ---
 
-If you use the `resources` section, a valid resource definition needs to meet the following rules. An invalid resource definition causes the model deployment to fail.
+When you use the `resources` section, a valid resource definition must meet the following rules. An invalid resource definition causes the model deployment to fail.
 
 | Parameter | Required or optional | Description |
 | --- | --- | --- |
-| `requests:`<br>`cpu:`| Required | String values, which can't be zero or empty. <br>You can specify the CPU in millicores; for example, `100m`. You can also specify it in full numbers. For example, `"1"` is equivalent to `1000m`.|
-| `requests:`<br>`memory:` | Required | String values, which can't be zero or empty. <br>You can specify the memory as a full number + suffix; for example, `1024Mi` for 1024 MiB. <br>Memory can't be less than 1 MB.|
-| `limits:`<br>`cpu:` | Optional <br>(required only when you need GPU) | String values, which can't be zero or empty. <br>You can specify the CPU in millicores; for example, `100m`. You can also specify it in full numbers. For example, `"1"` is equivalent to `1000m`. |
-| `limits:`<br>`memory:` | Optional <br>(required only when you need GPU) | String values, which can't be zero or empty. <br>You can specify the memory as a full number + suffix; for example, `1024Mi` for 1,024 MiB.|
-| `limits:`<br>`nvidia.com/gpu:` | Optional <br>(required only when you need GPU) | Integer values, which can't be empty and can be specified only in the `limits` section. <br>For more information, see the [Kubernetes documentation](https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#using-device-plugins). <br>If you require CPU only, you can omit the entire `limits` section.|
+| `requests:`<br>`cpu:`| Required | String values that can't be zero or empty. <br>You can specify the CPU in millicores; for example, `100m`. You can also specify it in full numbers. For example, `"1"` is equivalent to `1000m`.|
+| `requests:`<br>`memory:` | Required | String values that can't be zero or empty. <br>You can specify the memory as a full number + suffix; for example, `1024Mi` for 1,024 MiB. <br>Memory can't be less than 1 MB.|
+| `limits:`<br>`cpu:` | Optional <br>(required only when you need GPU) | String values that can't be zero or empty. <br>You can specify the CPU in millicores; for example, `100m`. You can also specify it in full numbers. For example, `"1"` is equivalent to `1000m`. |
+| `limits:`<br>`memory:` | Optional <br>(required only when you need GPU) | String values that can't be zero or empty. <br>You can specify the memory as a full number + suffix; for example, `1024Mi` for 1,024 MiB.|
+| `limits:`<br>`nvidia.com/gpu:` | Optional <br>(required only when you need GPU) | Integer values that can't be empty and can be specified only in the `limits` section. <br>For more information, see the [Kubernetes documentation](https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#using-device-plugins). <br>If you require CPU only, you can omit the entire `limits` section.|
 
-The instance type is *required* for model deployment. If you defined the `resources` section, and it will be validated against the instance type, the rules are as follows:
+An instance type is *required* for model deployment. If you define the `resources` section, it's validated against the instance type according to the following rules:
 
-- With a valid `resource` section definition, the resource limits must be less than the instance type limits. Otherwise, deployment will fail.
+- With a valid `resources` section definition, the resource limits must be less than the instance type limits. Otherwise, deployment fails.
 - If you don't define an instance type, the system uses `defaultinstancetype` for validation with the `resources` section.
 - If you don't define the `resources` section, the system uses the instance type to create the deployment.
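
Note on the validation rules above: the check that a deployment's `resources` limits fit within the instance type limits can be approximated locally before submitting. The following is a minimal sketch, not the service's actual validation logic. The helper names (`parse_cpu`, `parse_memory`, `limits_fit`) are hypothetical, only a subset of Kubernetes quantity suffixes is handled, and treating a limit equal to the instance type limit as acceptable is an assumption (the article says "less than").

```python
from decimal import Decimal

# Subset of Kubernetes memory suffixes (binary and decimal), for illustration only.
_MEMORY_UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_cpu(value: str) -> Decimal:
    """Return CPU cores: '100m' is 0.1 core, '1' is one core."""
    if value.endswith("m"):
        return Decimal(value[:-1]) / 1000
    return Decimal(value)


def parse_memory(value: str) -> Decimal:
    """Return bytes for quantities such as '1024Mi' or '2Gi'."""
    # Check two-letter suffixes first so 'Mi' is never mistaken for 'M'.
    for suffix in sorted(_MEMORY_UNITS, key=len, reverse=True):
        if value.endswith(suffix):
            return Decimal(value[: -len(suffix)]) * _MEMORY_UNITS[suffix]
    return Decimal(value)  # plain number of bytes


def limits_fit(resource_limits: dict, instance_type_limits: dict) -> bool:
    """Return True if no requested limit exceeds the instance type limit.

    Assumption: equality is accepted; the real service may be stricter.
    """
    parsers = {"cpu": parse_cpu, "memory": parse_memory, "nvidia.com/gpu": Decimal}
    for key, parse in parsers.items():
        if key not in resource_limits:
            continue
        if key not in instance_type_limits:
            return False
        if parse(str(resource_limits[key])) > parse(str(instance_type_limits[key])):
            return False
    return True


# Limits taken from the custom instance type example in this article.
instance_type = {"cpu": "1", "memory": "2Gi", "nvidia.com/gpu": 1}

fits = limits_fit({"cpu": "500m", "memory": "1024Mi"}, instance_type)
too_big = limits_fit({"cpu": "2", "memory": "1Gi"}, instance_type)
```

Here `fits` is `True` because 500 millicores and 1,024 MiB are both within the instance type limits, and `too_big` is `False` because 2 CPU cores exceed the limit of 1.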