Commit 63614fa

Freshness/editing pass for Use GPUs on AKS
1 parent 471ffb9 commit 63614fa

1 file changed: +15 -21 lines changed

articles/aks/gpu-cluster.md

Lines changed: 15 additions & 21 deletions
@@ -3,7 +3,7 @@ title: Use GPUs on Azure Kubernetes Service (AKS)
 description: Learn how to use GPUs for high performance compute or graphics-intensive workloads on Azure Kubernetes Service (AKS).
 ms.topic: article
 ms.custom: event-tier1-build-2022, devx-track-azurecli
-ms.date: 04/06/2023
+ms.date: 04/07/2023
 #Customer intent: As a cluster administrator or developer, I want to create an AKS cluster that can use high-performance GPU-based VMs for compute-intensive workloads.
 ---
 
@@ -18,7 +18,7 @@ This article helps you provision nodes with schedulable GPUs on new and existing
 
 ## Before you begin
 
-* This article assumes you have an existing AKS cluster. If you need to create one, you can do so using [Azure CLI][aks-quickstart-cli], [Azure PowerShell][aks-quickstart-powershell], or the [Azure portal][aks-quickstart-portal].
+* This article assumes you have an existing AKS cluster. If you don't have a cluster, create one using the [Azure CLI][aks-quickstart-cli], [Azure PowerShell][aks-quickstart-powershell], or the [Azure portal][aks-quickstart-portal].
 * You also need the Azure CLI version 2.0.64 or later installed and configured. Run `az --version` to find the version. If you need to install or upgrade, see [Install Azure CLI][install-azure-cli].
 
 ## Get the credentials for your cluster
@@ -347,7 +347,7 @@ To see the GPU in action, you can schedule a GPU-enabled workload with the appro
 
 ## Use Container Insights to monitor GPU usage
 
-The following metrics are available for [Container Insights with AKS][aks-container-insights] to monitor GPU usage.
+[Container Insights with AKS][aks-container-insights] monitors the following GPU usage metrics:
 
 | Metric name | Metric dimension (tags) | Description |
 |-------------|-------------------------|-------------|
@@ -361,30 +361,26 @@ The following metrics are available for [Container Insights with AKS][aks-contai
 
 ## Clean up resources
 
-To remove the associated Kubernetes objects created in this article, use the [kubectl delete job][kubectl delete] command as follows:
+* Remove the associated Kubernetes objects you created in this article using the [`kubectl delete job`][kubectl delete] command.
 
-```console
-kubectl delete jobs samples-tf-mnist-demo
-```
+```console
+kubectl delete jobs samples-tf-mnist-demo
+```
 
 ## Next steps
 
-To run Apache Spark jobs, see [Run Apache Spark jobs on AKS][aks-spark].
-
-For more information about running machine learning (ML) workloads on Kubernetes, see [Kubeflow Labs][kubeflow-labs].
-
-For more information on features of the Kubernetes scheduler, see [Best practices for advanced scheduler features in AKS][advanced-scheduler-aks].
-
-For information on using Azure Kubernetes Service with Azure Machine Learning, see the following articles:
-
-* [Configure a Kubernetes cluster for ML model training or deployment][azureml-aks].
-* [Deploy a model with an online endpoint][azureml-deploy].
-* [High-performance serving with Triton Inference Server][azureml-triton].
+* To run Apache Spark jobs, see [Run Apache Spark jobs on AKS][aks-spark].
+* For more information on features of the Kubernetes scheduler, see [Best practices for advanced scheduler features in AKS][advanced-scheduler-aks].
+* For more information on Azure Kubernetes Service and Azure Machine Learning, see:
+* [Configure a Kubernetes cluster for ML model training or deployment][azureml-aks].
+* [Deploy a model with an online endpoint][azureml-deploy].
+* [High-performance serving with Triton Inference Server][azureml-triton].
+* [Deploy machine learning models to AKS with Kubeflow][kubeflow].
 
 <!-- LINKS - external -->
 [kubectl-apply]: https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#apply
 [kubectl-get]: https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#get
-[kubeflow-labs]: https://github.com/Azure/kubeflow-labs
+[kubeflow]: ../../../architecture-center-pr/docs/solution-ideas/articles/machine-learning-model-deployment-aks-content.md
 [kubectl-describe]: https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#describe
 [kubectl-logs]: https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#logs
 [kubectl delete]: https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#delete
@@ -394,8 +390,6 @@ For information on using Azure Kubernetes Service with Azure Machine Learning, s
 [nvidia-github]: https://github.com/NVIDIA/k8s-device-plugin
 
 <!-- LINKS - internal -->
-[az-group-create]: /cli/azure/group#az_group_create
-[az-aks-create]: /cli/azure/aks#az_aks_create
 [az-aks-nodepool-add]: /cli/azure/aks/nodepool#az_aks_nodepool_add
 [az-aks-get-credentials]: /cli/azure/aks#az_aks_get_credentials
 [aks-quickstart-cli]: ./learn/quick-kubernetes-deploy-cli.md