@@ -173,6 +173,12 @@ semantic_router_cache_size 1247
 # Security metrics
 semantic_router_pii_detections_total{action="block"} 23
 semantic_router_jailbreak_attempts_total{action="block"} 5
+
+# Error metrics
+llm_request_errors_total{model="gpt-4",reason="timeout"} 12
+llm_request_errors_total{model="claude-3",reason="upstream_5xx"} 3
+llm_request_errors_total{model="phi4",reason="upstream_4xx"} 5
+llm_request_errors_total{model="phi4",reason="pii_policy_denied"} 8
 ```
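+
+The security and error counters above are plain Prometheus counters, so the usual `rate()`/`increase()` patterns apply directly. For example, to count PII blocks over the last hour:
+
+```prometheus
+sum(increase(semantic_router_pii_detections_total{action="block"}[1h]))
+```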
 
 ### Reasoning Mode Metrics
@@ -247,6 +253,35 @@ sum by (model) (increase(llm_model_cost_total{currency="USD"}[1h]))
 sum by (reason_code) (increase(llm_routing_reason_codes_total[15m]))
 ```
 
+### Request Error Metrics
+
+The router tracks request-level failures by model and reason so you can monitor both absolute error throughput and the share of requests that fail.
+
+- `llm_request_errors_total{model, reason}`
+  - Description: Total number of request errors categorized by failure reason
+  - Labels:
+    - `model`: target model name for the failed request
+    - `reason`: error category (`timeout`, `upstream_4xx`, `upstream_5xx`, `pii_policy_denied`, `jailbreak_block`, `parse_error`, `serialization_error`, `cancellation`, `classification_failed`, `unknown`)
+
+Example PromQL queries:
+
+```prometheus
+# Total errors by reason over the last hour
+sum by (reason) (increase(llm_request_errors_total[1h]))
+
+# Error throughput (errors/sec) by model over the last 15 minutes.
+# Helpful for incident response because it shows how many failing requests are impacting users.
+sum by (model) (rate(llm_request_errors_total[15m]))
+
+# Error ratio (% of requests failing) by model over the last 15 minutes.
+# Use increase() to align numerator and denominator with the same lookback window.
+100 * sum by (model) (increase(llm_request_errors_total[15m])) /
+  sum by (model) (increase(llm_model_requests_total[15m]))
+
+# PII policy blocks over the last 24 hours
+sum(increase(llm_request_errors_total{reason="pii_policy_denied"}[24h]))
+```
+
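+The error-ratio query can also drive alerting. A minimal sketch of a Prometheus alerting rule, assuming a standard rule file; the 5% threshold, `for:` window, and `severity` label are illustrative choices, not shipped defaults:
+
+```yaml
+groups:
+  - name: llm-request-errors
+    rules:
+      - alert: LLMHighErrorRatio
+        # Fire when more than 5% of a model's requests failed over the last 15 minutes.
+        expr: |
+          100 * sum by (model) (increase(llm_request_errors_total[15m]))
+            / sum by (model) (increase(llm_model_requests_total[15m])) > 5
+        for: 10m
+        labels:
+          severity: warning
+        annotations:
+          summary: "{{ $labels.model }} error ratio above 5% for 15 minutes"
+```
+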
 ### Pricing Configuration
 
 Provide per-1M pricing for your models so the router can compute request cost and emit metrics/logs.
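+As an illustrative sketch only — the exact schema depends on your router's configuration file, and the `pricing`, `prompt_per_1m`, and `completion_per_1m` keys here are hypothetical names — a per-1M pricing block might look like:
+
+```yaml
+# Hypothetical pricing block: USD per 1M tokens, split by prompt vs. completion.
+pricing:
+  gpt-4:
+    currency: USD
+    prompt_per_1m: 30.00
+    completion_per_1m: 60.00
+  phi4:
+    currency: USD
+    prompt_per_1m: 0.07
+    completion_per_1m: 0.14
+```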