
Commit 469f70f

Merge pull request #32 from yuyue9284/fix-kubernetes-extension-tsg
Update Kubernetes extension troubleshooting guide
2 parents 17fc548 + 5d41a55 commit 469f70f

1 file changed: +3 -2 lines changed


articles/machine-learning/how-to-troubleshoot-kubernetes-extension.md

Lines changed: 3 additions & 2 deletions
@@ -257,7 +257,6 @@ volcano-scheduler.conf: |
 - plugins:
   - name: conformance
 - plugins:
-  - name: overcommit
   - name: drf
   - name: predicates
   - name: proportion
@@ -269,13 +268,15 @@ To use this config in your AKS cluster, you need to follow the following steps:
 1. Create a configmap file with the above config in the `azureml` namespace. This namespace will generally be created when you install the Azure Machine Learning extension.
 1. Set `volcanoScheduler.schedulerConfigMap=<configmap name>` in the extension config to apply this configmap. And you need to skip the resource validation when installing the extension by configuring `amloperator.skipResourceValidation=true`. For example:
 ```azurecli
-az k8s-extension update --name <extension-name> --extension-type Microsoft.AzureML.Kubernetes --config volcanoScheduler.schedulerConfigMap=<configmap name> amloperator.skipResourceValidation=true --cluster-type managedClusters --cluster-name <your-AKS-cluster-name> --resource-group <your-RG-name> --scope cluster
+az k8s-extension update --name <extension-name> --config volcanoScheduler.schedulerConfigMap=<configmap name> amloperator.skipResourceValidation=true --cluster-type managedClusters --cluster-name <your-AKS-cluster-name> --resource-group <your-RG-name>
 ```
 
 > [!NOTE]
 > Since the gang plugin is removed, there's potential that the deadlock happens when volcano schedules the job.
 >
 > * To avoid this situation, you can **use same instance type across the jobs**.
+>
+> Using a scheduler configuration other than the default provided by the Azure Machine Learning extension may not be fully supported. Proceed with caution.
 >
 > Note that you need to disable `job/validate` webhook in the volcano admission if your **volcano version is lower than 1.6**.
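
The first step of the updated guide (creating the configmap in the `azureml` namespace) could be scripted roughly as below. This is a minimal sketch, not part of the commit. It assumes the scheduler config body is saved in a local file and that `kubectl` already points at the target cluster. The function names, file path, and configmap name are hypothetical. The guard that refuses a file still listing `overcommit` only mirrors the plugin removal in this change and is an illustrative choice, not something the extension requires.

```shell
#!/usr/bin/env bash
# Sketch only: function names, file path, and configmap name are hypothetical.

# Return 0 if the scheduler config file lists the given plugin as "- name: <plugin>".
has_plugin() {
  local conf_file="$1" plugin="$2"
  grep -Eq "^[[:space:]]*-[[:space:]]*name:[[:space:]]*${plugin}[[:space:]]*$" "$conf_file"
}

# Create the configmap in the azureml namespace from a local config file.
# Assumption: refuse to proceed if the overcommit plugin is still listed,
# mirroring the plugin removal made in this commit.
create_scheduler_configmap() {
  local conf_file="$1" configmap_name="$2"
  if has_plugin "$conf_file" overcommit; then
    echo "overcommit plugin is still listed in $conf_file" >&2
    return 1
  fi
  kubectl create configmap "$configmap_name" \
    --namespace azureml \
    --from-file=volcano-scheduler.conf="$conf_file"
}

# Example (hypothetical names):
#   create_scheduler_configmap ./volcano-scheduler.conf my-volcano-config
```

The resulting configmap name is what would then be passed as `volcanoScheduler.schedulerConfigMap=<configmap name>` in the `az k8s-extension update` command shown in the diff.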
