Commit ad5633f

Fix duplicate and reorder sections
1 parent 32ce334 commit ad5633f

1 file changed, +8 -20 lines changed

articles/machine-learning/how-to-troubleshoot-online-endpoints.md

Lines changed: 8 additions & 20 deletions
@@ -265,10 +265,6 @@ This issue happens when the size of the model is larger than the available disk
 #### Memory quota
 This issue happens when the memory footprint of the model is larger than the available memory. Try a [SKU](reference-managed-online-endpoints-vm-sku-list.md) with more memory.

-#### Endpoint quota
-
-Try to delete some unused endpoints in this subscription. If all of your endpoints are actively in use, you can try [requesting an endpoint quota increase](how-to-manage-quotas.md#endpoint-quota-increases).
-
 #### Role assignment quota

 When you are creating a managed online endpoint, role assignment is required for the [managed identity](../active-directory/managed-identities-azure-resources/overview.md) to access workspace resources. If you've reached the [role assignment limit](../azure-resource-manager/management/azure-subscription-service-limits.md#azure-rbac-limits), try to delete some unused role assignments in this subscription. You can check all role assignments in the Azure portal by navigating to the Access Control menu.
@@ -281,22 +277,6 @@ Try to delete some unused endpoints in this subscription. If all of your endpoin

 Due to a lack of Azure Machine Learning capacity in the region, the service has failed to provision the specified VM size. Retry later or try deploying to a different region.

-#### Endpoint quota
-
-Try to delete some unused endpoints in this subscription. If all of your endpoints are actively in use, you can try [requesting an endpoint quota increase](how-to-manage-quotas.md#endpoint-quota-increases).
-
-#### Region-wide VM capacity
-
-Due to a lack of Azure Machine Learning capacity in the region, the service has failed to provision the specified VM size. Retry later or try deploying to a different region.
-
-#### Kubernetes quota
-
-This issue happens when the requested CPU or memory couldn't be satisfied due to all nodes are unschedulable for this deployment, such as nodes are cordoned or nodes are unavailable.
-
-The error message will typically indicate which resource you need more of. For instance, if you see an error message detailing `0/3 nodes are available: 3 Insufficient nvidia.com/gpu`, that means that the service requires GPUs and there are three nodes in the cluster that don't have sufficient GPUs. This can be addressed by adding more nodes if you're using a GPU SKU, switching to a GPU-enabled SKU if you aren't, or changing your environment to not require GPUs.
-
-You can also try adjusting your request in the cluster, you can directly [adjust the resource request of the instance type](how-to-manage-kubernetes-instance-types.md).
-
 #### Other quota

 To run the `score.py` provided as part of the deployment, Azure creates a container that includes all the resources that the `score.py` needs, and runs the scoring script on that container.
@@ -329,6 +309,14 @@ Use the **Endpoints** in the studio:
 1. Select the **Deployment logs** tab in the endpoint's details page.
 1. Use the dropdown to select the deployment whose log you want to see.

+#### Kubernetes quota
+
+This issue happens when the requested CPU or memory couldn't be satisfied due to all nodes are unschedulable for this deployment, such as nodes are cordoned or nodes are unavailable.
+
+The error message will typically indicate which resource you need more of. For instance, if you see an error message detailing `0/3 nodes are available: 3 Insufficient nvidia.com/gpu`, that means that the service requires GPUs and there are three nodes in the cluster that don't have sufficient GPUs. This can be addressed by adding more nodes if you're using a GPU SKU, switching to a GPU-enabled SKU if you aren't, or changing your environment to not require GPUs.
+
+You can also try adjusting your request in the cluster, you can directly [adjust the resource request of the instance type](how-to-manage-kubernetes-instance-types.md).
+
 ---

 ### ERROR: BadArgument
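
Note on the relocated Kubernetes quota text: it says the error message typically names the resource that is short. As an illustration only, and not part of this commit or of any Azure SDK, here is a minimal Python sketch that extracts the node counts and the short resource from a message shaped like the one quoted in the section. The helper name `parse_insufficient` and both regular expressions are assumptions; real scheduler messages can carry other reasons or wording that this sketch ignores.

```python
import re

# Assumed message shape, taken from the example quoted in the doc text:
#   "0/3 nodes are available: 3 Insufficient nvidia.com/gpu"
_HEADER = re.compile(r"(\d+)/(\d+) nodes are available")
# Resource names may contain dots and slashes (e.g. nvidia.com/gpu);
# the pattern avoids swallowing a trailing sentence period.
_REASON = re.compile(r"(\d+) Insufficient ([\w/-]+(?:\.[\w/-]+)*)")


def parse_insufficient(message):
    """Hypothetical helper: return node counts and per-resource shortages,
    or None if the message does not match the assumed shape."""
    header = _HEADER.search(message)
    if header is None:
        return None
    # Map each short resource to the number of nodes reporting it.
    shortages = {
        resource: int(count) for count, resource in _REASON.findall(message)
    }
    return {
        "available": int(header.group(1)),
        "total": int(header.group(2)),
        "shortages": shortages,
    }


if __name__ == "__main__":
    print(parse_insufficient(
        "0/3 nodes are available: 3 Insufficient nvidia.com/gpu"
    ))
```

A caller could use the `shortages` keys to decide between the remedies the section lists, for example adding GPU nodes when `nvidia.com/gpu` appears.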
