Commit 8659bbb

created mermaid diagram instead of ASCII diagram
1 parent a9d2ca2 commit 8659bbb

1 file changed: +24 -14 lines changed

ai/vllm-deployment/hpa/README.md

Lines changed: 24 additions & 14 deletions
@@ -19,21 +19,31 @@ The autoscaling solution works as follows:
 6. The **Horizontal Pod Autoscaler (HPA)** controller queries the custom metrics API for the metrics and compares them to the target values defined in the `HorizontalPodAutoscaler` resource.
 7. If the metrics exceed the target, the HPA scales up the `vllm-gemma-deployment`.
 
+
+```mermaid
+flowchart TD
+D("PrometheusRule (GPU Metric Only)")
+B("Prometheus Server")
+subgraph subGraph0["Metrics Collection"]
+C("ServiceMonitor")
+A["vLLM Server"]
+H["GPU DCGM Exporter"]
+end
+subgraph subGraph1["HPA Scaling Logic"]
+E("Prometheus Adapter")
+F("API Server")
+G("HPA Controller")
+end
+A -- Scrapes Raw Metrics --> C
+H -- Scrapes Raw Metrics --> C
+C -- Configures Scrape --> B
+B -- Processes Raw Metrics via --> D
+D -- Creates Clean Metric in --> B
+E -- Queries Clean Metric --> B
+F -- Queries Custom Metric --> E
+G -- Queries Custom Metric --> F
 ```
-┌──────────────┐ ┌────────────────┐ ┌──────────────────┐
-│ User Request │──>│ vLLM Server │──>│ ServiceMonitor │
-└──────────────┘ │ (or DCGM Exp.) │ └──────────────────┘
-└────────────────┘ │
-
-┌────────────────┐ ┌──────────────────┐ ┌──────────────────┐
-│ HPA Controller │<──│ Prometheus Adpt. │<──│ Prometheus Srv. │
-└────────────────┘ └──────────────────┘ └──────────────────┘
-│ (GPU Path Only)
-
-┌────────────────┐
-│ PrometheusRule │
-└────────────────┘
-```
+
 
 ## Prerequisites
 
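
Steps 6 and 7 of the README text in this hunk describe the HPA controller comparing a metric to its target and scaling up when the metric exceeds it. As a rough illustration, the sketch below applies the replica formula documented for the Kubernetes HPA, `ceil(currentReplicas * currentMetric / targetMetric)`, with the default 10% tolerance. The function name, the `min_replicas`/`max_replicas` defaults, and the example values are hypothetical and not taken from this repository; the real controller also handles stabilization windows, missing metrics, and pod readiness, all of which are omitted here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation (illustrative only)."""
    ratio = current_metric / target_metric
    # Within the tolerance band the controller leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the bounds that an HPA resource would set (assumed values here).
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Hypothetical numbers: 2 replicas, metric at 90 against a target of 60.
    print(desired_replicas(2, 90.0, 60.0))
```

With these assumed inputs the ratio is 1.5, so the sketch asks for three replicas; a metric of 62 against the same target falls inside the tolerance band and leaves the count unchanged.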
