Commit 1030c21

chore: Refine README, adjust image size (#88)
Signed-off-by: Ce Gao <cegao@tensorchord.ai>
1 parent c8cdf9d commit 1030c21

1 file changed: +9 -5 lines changed

README.md

Lines changed: 9 additions & 5 deletions
@@ -31,7 +31,9 @@ The stack is set up using [Helm](https://helm.sh/docs/), and contains the follow
 - **Request router**: Directs requests to appropriate backends based on routing keys or session IDs to maximize KV cache reuse.
 - **Observability stack**: monitors the metrics of the backends through [Prometheus](https://github.com/prometheus/prometheus) + [Grafana](https://grafana.com/)

-<img src="https://github.com/user-attachments/assets/8f05e7b9-0513-40a9-9ba9-2d3acca77c0c" alt="Architecture of the stack" width="800"/>
+<p align="center">
+<img src="https://github.com/user-attachments/assets/8f05e7b9-0513-40a9-9ba9-2d3acca77c0c" alt="Architecture of the stack" width="80%"/>
+</p>

 ## Roadmap

@@ -86,16 +88,16 @@ The Grafana dashboard provides the following insights:
 6. **GPU KV Usage Percent**: Monitors GPU KV cache usage.
 7. **GPU KV Cache Hit Rate**: Displays the hit rate for the GPU KV cache.

-<img src="https://github.com/user-attachments/assets/05766673-c449-4094-bdc8-dea6ac28cb79" alt="Grafana dashboard to monitor the deployment" width="500"/>
+<p align="center">
+<img src="https://github.com/user-attachments/assets/05766673-c449-4094-bdc8-dea6ac28cb79" alt="Grafana dashboard to monitor the deployment" width="80%"/>
+</p>

 ### Configuration

-See the details in `observability/README.md`
+See the details in [`observability/README.md`](./observability/README.md)

 ## Router

-### Overview
-
 The router ensures efficient request distribution among backends. It supports:

 - Routing to endpoints that run different models
@@ -106,6 +108,8 @@ The router ensures efficient request distribution among backends. It supports:
 - Session-ID based routing
 - (WIP) prefix-aware routing

+Please refer to the [router documentation](./router/README.md) for more details.
+
 ## Contributing

 Contributions are welcome! Please follow the standard GitHub flow: