Commit 5d90816

samzong, tao12345666333, and rootfs authored
docs(router.md): add error metrics and example queries for llm_request_errors_total (#156)
* docs(router): add error metrics and example queries for llm_request_errors_total

Signed-off-by: samzong <[email protected]>

* Update website/docs/api/router.md

Co-authored-by: Jintao Zhang <[email protected]>
Signed-off-by: samzong <[email protected]>

---------

Signed-off-by: samzong <[email protected]>
Co-authored-by: Jintao Zhang <[email protected]>
Co-authored-by: Huamin Chen <[email protected]>
1 parent 3f48e37 commit 5d90816

1 file changed, +35 -0 lines changed

website/docs/api/router.md

Lines changed: 35 additions & 0 deletions
@@ -173,6 +173,12 @@ semantic_router_cache_size 1247
# Security metrics
semantic_router_pii_detections_total{action="block"} 23
semantic_router_jailbreak_attempts_total{action="block"} 5
+
+# Error metrics
+llm_request_errors_total{model="gpt-4",reason="timeout"} 12
+llm_request_errors_total{model="claude-3",reason="upstream_5xx"} 3
+llm_request_errors_total{model="phi4",reason="upstream_4xx"} 5
+llm_request_errors_total{model="phi4",reason="pii_policy_denied"} 8
```

### Reasoning Mode Metrics
@@ -247,6 +253,35 @@ sum by (model) (increase(llm_model_cost_total{currency="USD"}[1h]))
sum by (reason_code) (increase(llm_routing_reason_codes_total[15m]))
```

+### Request Error Metrics
+
+The router tracks request-level failures by model and reason so you can monitor both absolute error throughput and the share of requests that fail.
+
+- `llm_request_errors_total{model, reason}`
+  - Description: Total number of request errors categorized by failure reason
+  - Labels:
+    - model: target model name for the failed request
+    - reason: error category (timeout, upstream_4xx, upstream_5xx, pii_policy_denied, jailbreak_block, parse_error, serialization_error, cancellation, classification_failed, unknown)
+
+Example PromQL queries:
+
+```prometheus
+# Total errors by reason over the last hour
+sum by (reason) (increase(llm_request_errors_total[1h]))
+
+# Error throughput (errors/sec) by model over the last 15 minutes.
+# Helpful for incident response because it shows how many failing requests are impacting users.
+sum by (model) (rate(llm_request_errors_total[15m]))
+
+# Error ratio (% of requests failing) by model over the last 15 minutes.
+# Use increase() to align numerator and denominator with the same lookback window.
+100 * sum by (model) (increase(llm_request_errors_total[15m])) /
+sum by (model) (increase(llm_model_requests_total[15m]))
+
+# PII policy blocks over the last 24 hours
+sum(increase(llm_request_errors_total{reason="pii_policy_denied"}[24h]))
+```
+
### Pricing Configuration

Provide per-1M pricing for your models so the router can compute request cost and emit metrics/logs.
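
As a way to sanity-check the aggregation queries added in this commit, the following Python sketch reproduces `sum by (reason)`, `sum by (model)`, and the error-ratio arithmetic offline, using the four sample `llm_request_errors_total` lines from the first hunk. It is an illustration under stated assumptions, not part of the router: the helper names (`parse_samples`, `sum_by`, `error_ratio_percent`) are made up for this example, the parser handles only simple label values, and the request totals are hypothetical because the diff shows no `llm_model_requests_total` samples. It also works on raw cumulative counter values, whereas `increase()` and `rate()` in PromQL operate over a time window.

```python
import re
from collections import defaultdict

# One exposition line: metric_name{label="value",...} number
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Parse simple Prometheus text-format lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser cannot read
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def sum_by(samples, metric, label):
    """Sum one metric's values grouped by a single label, like `sum by (label)`."""
    totals = defaultdict(float)
    for name, labels, value in samples:
        if name == metric:
            totals[labels.get(label, "")] += value
    return dict(totals)


def error_ratio_percent(errors_by_model, requests_by_model):
    """Percent of failed requests per model.

    Simplification: models with zero requests are skipped, and models with
    requests but no errors report 0.0. PromQL vector matching behaves
    differently in both cases.
    """
    return {
        model: 100.0 * errors_by_model.get(model, 0.0) / total
        for model, total in requests_by_model.items()
        if total > 0
    }


# Sample lines copied from the "Error metrics" block in the diff.
SCRAPE = '''
# Error metrics
llm_request_errors_total{model="gpt-4",reason="timeout"} 12
llm_request_errors_total{model="claude-3",reason="upstream_5xx"} 3
llm_request_errors_total{model="phi4",reason="upstream_4xx"} 5
llm_request_errors_total{model="phi4",reason="pii_policy_denied"} 8
'''

samples = parse_samples(SCRAPE)
errors_by_reason = sum_by(samples, "llm_request_errors_total", "reason")
errors_by_model = sum_by(samples, "llm_request_errors_total", "model")

# Hypothetical request totals, invented only to illustrate the ratio.
requests_by_model = {"gpt-4": 240.0, "claude-3": 60.0, "phi4": 130.0}
ratios = error_ratio_percent(errors_by_model, requests_by_model)

print(errors_by_reason)
print(ratios)
```

With these inputs, `phi4` accumulates 13 errors across its two reasons, so against the assumed 130 requests its ratio is 10%, while the other two models come out at 5% each. The grouping step is the part that mirrors the PromQL in the diff; the numbers themselves depend entirely on the invented denominators.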
