RHOAIENG-26247 - AMD Grafana metrics (#873)

eturner24 · web-flow · commit bf9430a03f76 · 2025-07-29T09:31:31.000-04:00
diff --git a/modules/deploying-vllm-gpu-metrics-dashboard-grafana.adoc b/modules/deploying-vllm-gpu-metrics-dashboard-grafana.adoc
@@ -20,7 +20,9 @@ endif::[]
 .Procedure
 
 . Define a `GrafanaDashboard` object in a YAML file, similar to the following examples:
-.. To monitor accelerator metrics, see link:https://github.com/rh-aiservices-bu/rhoai-uwm/tree/main/rhoai-uwm-grafana/overlays/rhoai-uwm-user-grafana-app/nvidia-vllm-dashboard.yaml[`nvidia-vllm-dashboard.yaml`].
+.. To monitor NVIDIA accelerator metrics, see link:https://github.com/rh-aiservices-bu/rhoai-uwm/tree/main/rhoai-uwm-grafana/overlays/rhoai-uwm-user-grafana-app/nvidia-vllm-dashboard.yaml[`nvidia-vllm-dashboard.yaml`].
+.. To monitor AMD accelerator metrics, see link:https://github.com/rh-aiservices-bu/rhoai-uwm/tree/main/rhoai-uwm-grafana/overlays/rhoai-uwm-user-grafana-app/amd-vllm-dashboard.yaml[`amd-vllm-dashboard.yaml`].
+.. To monitor Intel Gaudi accelerator metrics, see link:https://github.com/rh-aiservices-bu/rhoai-uwm/tree/main/rhoai-uwm-grafana/overlays/rhoai-uwm-user-grafana-app/gaudi-vllm-dashboard.yaml[`gaudi-vllm-dashboard.yaml`].
 .. To monitor vLLM metrics, see link:https://github.com/rh-aiservices-bu/rhoai-uwm/tree/main/rhoai-uwm-grafana/overlays/rhoai-uwm-user-grafana-app/grafana-vllm-dashboard.yaml[`grafana-vllm-dashboard.yaml`].
 
 . Create an `inputs.env` file similar to the following example. Replace the `NAMESPACE` and `MODEL_NAME` parameters with your own values:
@@ -33,19 +35,20 @@ MODEL_NAME=<model-name> <2>
 <1> **NAMESPACE** is the target namespace where the model will be deployed.
 <2> **MODEL_NAME** is the model name as defined in your InferenceService. The model name is also used to filter the pod name in the Grafana dashboard.
 
-. Replace the `NAMESPACE` and `MODEL_NAME` parameters in your YAML file with the values from the `input.env` file by performing the following actions:
+. Replace the `NAMESPACE` and `MODEL_NAME` parameters in your YAML file with the values from the `inputs.env` file by performing the following actions:
 
 .. Export the parameters defined in the `inputs.env` file as environment variables:
 +
 [source]
 ----
 export $(cat inputs.env | xargs)
 ----
-.. Replace the  `$NAMESPACE` and `${MODEL_NAME)` variables in the YAML file with the values of the exported environment variables:
+
+.. Replace the `${NAMESPACE}` and `${MODEL_NAME}` variables in the YAML file with the values of the exported environment variables. In the following command, replace `dashboard_template.yaml` with the name of the `GrafanaDashboard` object YAML file that you created earlier:
 +
 [source]
 ----
-envsubst '${NAMESPACE} ${MODEL_NAME}' < nvidia-vllm-dashboard.yaml > nvidia-vllm-dashboard-replaced.yaml
+envsubst '${NAMESPACE} ${MODEL_NAME}' < dashboard_template.yaml > dashboard_template-replaced.yaml
 ----
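
Review note: the `export $(cat inputs.env | xargs)` form relies on shell word splitting, so a value that contains spaces or glob characters may not survive intact. A minimal alternative sketch follows, assuming `inputs.env` contains only plain `KEY=value` lines that are valid shell assignments. The sample values and the temporary directory are hypothetical and only make the sketch self-contained.

```shell
# Hypothetical sample file in a scratch directory; use your own inputs.env in practice.
workdir=$(mktemp -d)
cat > "$workdir/inputs.env" <<'EOF'
NAMESPACE=my-namespace
MODEL_NAME=my-model
EOF

# While allexport (-a) is active, every variable assigned is also exported.
set -a
. "$workdir/inputs.env"
set +a

# Show the values that envsubst would now see in its environment.
echo "NAMESPACE=${NAMESPACE} MODEL_NAME=${MODEL_NAME}"
```

Because the file is sourced rather than expanded on a command line, each assignment is parsed by the shell as written. The trade-off is that any other shell code in the file would also be executed.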
 
 . Confirm that your YAML file contains updated values.
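
Review note: the confirmation step does not say how to check the file. One possible check is sketched below, assuming the template uses the braced `${NAMESPACE}` and `${MODEL_NAME}` forms shown in the `envsubst` command. `check_placeholders` is a hypothetical helper name, not part of `oc` or `envsubst`.

```shell
# Hypothetical helper: succeeds only if no braced placeholder remains in the file.
check_placeholders() {
  # Refuse to report success for a file that cannot be read.
  if [ ! -r "$1" ]; then
    echo "cannot read $1" >&2
    return 2
  fi
  # -F matches the placeholders literally; -n prints the offending line numbers.
  if grep -n -F -e '${NAMESPACE}' -e '${MODEL_NAME}' "$1"; then
    echo "unreplaced placeholders found in $1" >&2
    return 1
  fi
  echo "no placeholders left in $1"
}
```

For example, `check_placeholders dashboard_template-replaced.yaml` prints any lines that still contain a placeholder and returns a non-zero status, so it can gate the `oc create` step in a script.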
@@ -54,7 +57,7 @@ envsubst '${NAMESPACE} ${MODEL_NAME}' < nvidia-vllm-dashboard.yaml > nvidia-vllm
 +
 [source]
 ----
-oc create -f nvidia-vllm-dashboard-replaced.yaml
+oc create -f dashboard_template-replaced.yaml
 ----
 
 .Verification