Commit a39a573

add container connectivity docs & add vsr-headers to sidebar
Signed-off-by: JaredforReal <[email protected]>
1 parent e54d751 commit a39a573

2 files changed

+192
-0
lines changed

Lines changed: 190 additions & 0 deletions
@@ -0,0 +1,190 @@
1+
---
2+
title: Container Connectivity Troubleshooting
3+
sidebar_label: Container Connectivity
4+
---
5+
6+
This guide summarizes common connectivity issues we hit when running the router with Docker Compose or Kubernetes and how we fixed them. It also covers the “No data” problem in Grafana and how to validate the full metrics chain.
7+
8+
## 1. Use IPv4 addresses for backend endpoints
9+
10+
Symptoms
11+
12+
- Router/Envoy timeouts, 5xx responses, or targets flapping between UP and DOWN in Prometheus. `curl` from inside containers/pods fails.
13+
14+
Root causes
15+
16+
- Backend bound only to 127.0.0.1 (not reachable from containers/pods).
17+
- Using IPv6 or hostnames that resolve to IPv6 where IPv6 is disabled/blocked.
18+
- Using localhost/127.0.0.1 in the router config, which refers to the container itself, not the host.
19+
20+
Fixes
21+
22+
- Ensure backends bind to all interfaces: 0.0.0.0.
23+
- In Docker Compose, configure the router to call the host via a reachable IPv4 address.
24+
  - On macOS, host.docker.internal usually works; if not, use the host’s LAN IPv4 address.
25+
  - On Linux or custom networks, use the Docker host gateway IPv4 for your network.
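
Before changing addresses, it can help to scan the router config for loopback targets, the third root cause above. The sketch below is a hypothetical helper (the name `is_loopback_url` is not part of the router) that flags URLs whose host is `localhost`, `127.x.x.x`, or `::1`. It only inspects the string and does not resolve DNS.

```bash
# Hypothetical helper: succeed (exit 0) when a backend URL points at loopback.
# Inside a container, such an address refers to the container itself.
is_loopback_url() {
  local url host
  url=$1
  host=${url#*://}        # drop the scheme, e.g. "http://"
  host=${host%%/*}        # drop any path
  case "$host" in
    \[*) host=${host#\[}; host=${host%%\]*} ;;   # bracketed IPv6 literal, e.g. [::1]:8000
    *)   host=${host%%:*} ;;                     # strip ":port"
  esac
  case "$host" in
    localhost|127.*|::1) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: prints a warning for the first address only.
for addr in http://127.0.0.1:11434 http://172.28.0.1:11434; do
  if is_loopback_url "$addr"; then
    echo "WARN: $addr is loopback inside a container"
  fi
done
```

A hostname that merely resolves to loopback or to IPv6 is not detected by this check; resolving it from inside the container is still needed for that case.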
26+
27+
Example: start vLLM on the host
28+
29+
```bash
30+
# Make vLLM listen on all interfaces
31+
python -m vllm.entrypoints.openai.api_server \
32+
  --host 0.0.0.0 --port 11434 \
33+
  --served-model-name phi4
34+
```
35+
36+
Router config example (Docker Compose)
37+
38+
```yaml
39+
# config/config.yaml (snippet)
40+
llm_backends:
41+
  - name: phi4
42+
    # Use a reachable IPv4; replace with your host’s IP
43+
    address: http://172.28.0.1:11434
44+
```
45+
46+
Kubernetes recommended pattern: use a Service
47+
48+
```yaml
49+
apiVersion: v1
50+
kind: Service
51+
metadata:
52+
  name: my-vllm
53+
spec:
54+
  selector:
55+
    app: my-vllm
56+
  ports:
57+
    - name: http
58+
      port: 8000
59+
      targetPort: 8000
60+
```
61+
62+
Router config then uses: http://my-vllm.default.svc.cluster.local:8000
63+
64+
**Tip**: discover the host gateway from inside a container (mostly Linux)
65+
66+
```bash
67+
# Inside the container/pod
68+
ip route | awk '/default/ {print $3}'
69+
```
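
Minimal images often ship without the `ip` tool. As a fallback, the default gateway can be read from `/proc/net/route`, where addresses are printed in hexadecimal. The sketch below assumes a little-endian host (x86-64 and most ARM systems), where the bytes appear in reverse order. The function names are illustrative.

```bash
# Convert an 8-digit hex address, as printed in /proc/net/route on a
# little-endian host (bytes reversed), to dotted-quad IPv4.
hex_to_ipv4() {
  local hex b1 b2 b3 b4
  hex=$1
  b1=$(printf '%s' "$hex" | cut -c7-8)
  b2=$(printf '%s' "$hex" | cut -c5-6)
  b3=$(printf '%s' "$hex" | cut -c3-4)
  b4=$(printf '%s' "$hex" | cut -c1-2)
  printf '%d.%d.%d.%d\n' "0x$b1" "0x$b2" "0x$b3" "0x$b4"
}

# Read route-table text on stdin and print the default gateway:
# the row whose Destination is 00000000 and whose Gateway is non-zero.
default_gateway_from_proc() {
  local iface dest gw rest
  while read -r iface dest gw rest; do
    if [ "$dest" = "00000000" ] && [ "$gw" != "00000000" ]; then
      hex_to_ipv4 "$gw"
      return 0
    fi
  done
  return 1
}

# Usage inside the container/pod:
#   default_gateway_from_proc < /proc/net/route
```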
70+
71+
## 2. Host firewall blocking container/pod traffic
72+
73+
Symptoms
74+
75+
- Host can curl the backend, but containers/pods time out until the firewall is opened.
76+
77+
Fixes
78+
79+
- macOS: System Settings → Network → Firewall. Allow incoming connections for the backend process (e.g., Python/uvicorn) or temporarily disable the firewall to test.
80+
- Linux examples:
81+
82+
```bash
83+
# UFW (Ubuntu/Debian)
84+
sudo ufw allow 11434/tcp
85+
sudo ufw allow 11435/tcp
86+
87+
# firewalld (RHEL/CentOS/Fedora)
88+
sudo firewall-cmd --add-port=11434/tcp --permanent
89+
sudo firewall-cmd --add-port=11435/tcp --permanent
90+
sudo firewall-cmd --reload
91+
```
92+
93+
- Cloud hosts: also open security group/ACL rules.
94+
95+
Validate from the container/pod:
96+
97+
```bash
98+
docker compose exec semantic-router curl -sS http://<IPv4>:11434/v1/models
99+
```
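
When this check fails, curl's exit status helps narrow down the cause. The sketch below maps three documented curl exit codes (6, 7, 28) to the issues covered in this guide. The function name and the hint wording are illustrative, and a timeout is only a likely sign of a firewall drop, not proof.

```bash
# Illustrative helper: translate a curl exit status into a likely cause.
explain_curl_exit() {
  case "$1" in
    0)  echo "ok: backend reachable" ;;
    6)  echo "dns: could not resolve host; check the hostname or use an IPv4 address" ;;
    7)  echo "connect: failed to connect (often refused); check the backend binds to 0.0.0.0 and the port is right" ;;
    28) echo "timeout: packets likely dropped; check host firewall or security group rules" ;;
    *)  echo "other: curl exit $1" ;;
  esac
}

# Usage (replace <IPv4> as in the command above):
#   curl -sS --connect-timeout 5 -o /dev/null "http://<IPv4>:11434/v1/models"
#   explain_curl_exit "$?"
```

Adding `--connect-timeout` keeps a silently dropped connection from hanging for the default timeout.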
100+
101+
## 3. Docker Compose: publish the router’s ports (not just expose)
102+
103+
Symptoms
104+
105+
- Can’t access `/metrics` or the API from the host. `docker ps` shows no published ports.
106+
107+
Root cause
108+
109+
- Using only `expose` keeps ports internal to the Compose network; it does not publish them to the host.
110+
111+
Fix
112+
113+
- Map the needed ports with `ports:`.
114+
115+
Example docker-compose.yml snippet
116+
117+
```yaml
118+
services:
119+
  semantic-router:
120+
    # ...
121+
    ports:
122+
      - "9190:9190" # Prometheus /metrics
123+
      - "50051:50051" # gRPC/HTTP API (use your actual service port)
124+
```
125+
126+
Validate from the host:
127+
128+
```bash
129+
curl -sS http://localhost:9190/metrics | head -n 5
130+
```
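
The difference between exposed and published ports is visible in the PORTS column of `docker ps`: a published port appears as a host-to-container mapping such as `0.0.0.0:9190->9190/tcp`, while an exposed-only port appears as `9190/tcp`. A small sketch (the function name is hypothetical) that classifies such a string for a given container port:

```bash
# Succeed when the PORTS string maps some host port to container port $2.
is_port_published() {
  local ports port
  ports=$1
  port=$2
  case "$ports" in
    *"->${port}/"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage on the host (the container name filter is an assumption):
#   ports=$(docker ps --filter name=semantic-router --format '{{.Ports}}')
#   is_port_published "$ports" 9190 || echo "9190 is not published"
```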
131+
132+
## 4. Grafana dashboard shows “No data”
133+
134+
Common causes and fixes
135+
136+
- Metrics not emitted yet
137+
  - Some panels are empty until code paths are hit. Examples:
138+
    - Cost: `llm_model_cost_total{currency="USD"}` grows only when cost is recorded.
139+
    - Refusals: `llm_request_errors_total{reason=~"pii_policy_denied|jailbreak_block"}` grows only when policies block requests.
140+
  - Generate relevant traffic or enable filters/policies to see data.
141+
142+
- Panel query nuances
143+
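  - The classification bar gauge often needs an instant query rather than a range query.
144+
  - Quantile panels require histogram buckets (the `_bucket` series); without them `histogram_quantile` returns nothing.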
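
To tell "not emitted yet" apart from a scrape problem, check the router's raw `/metrics` output for the series a panel needs. The sketch below reads Prometheus text exposition on stdin and succeeds if at least one sample line for the metric exists. The helper name is illustrative.

```bash
# Succeed when stdin contains a sample line for metric $1.
# Sample lines start with the metric name followed by "{" or a space;
# "# HELP" and "# TYPE" comment lines do not count.
metric_present() {
  grep -Eq "^$1(\{| )"
}

# Usage on the host, assuming port 9190 is published:
#   curl -s http://localhost:9190/metrics | metric_present llm_model_cost_total \
#     || echo "llm_model_cost_total has no samples yet"
```

If the series exists here but the panel is still empty, the problem is more likely in the scrape config or the panel query than in the router.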
145+
146+
Useful PromQL examples (for Explore)
147+
148+
```promql
149+
# Category classification (instant)
150+
sum by (category) (llm_category_classifications_count)
151+
152+
# Cost rate (USD/sec)
153+
sum by (model) (rate(llm_model_cost_total{currency="USD"}[5m]))
154+
155+
# Refusals per model
156+
sum by (model) (rate(llm_request_errors_total{reason=~"pii_policy_denied|jailbreak_block"}[5m]))
157+
158+
# Refusal rate percentage
159+
100 * sum by (model) (rate(llm_request_errors_total{reason=~"pii_policy_denied|jailbreak_block"}[5m]))
160+
/ sum by (model) (rate(llm_model_requests_total[5m]))
161+
162+
# Latency p95
163+
histogram_quantile(0.95, sum by (le) (rate(llm_model_completion_latency_seconds_bucket[5m])))
164+
```
165+
166+
Prometheus scrape config (verify targets are UP)
167+
168+
```yaml
169+
scrape_configs:
170+
  - job_name: semantic-router
171+
    static_configs:
172+
      - targets: ["semantic-router:9190"]
173+
174+
  - job_name: envoy
175+
    metrics_path: /stats/prometheus
176+
    static_configs:
177+
      - targets: ["envoy-proxy:19000"]
178+
```
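
Besides the Targets page in the Prometheus UI, target health can be read from the Prometheus HTTP API at `/api/v1/targets`, where each active target carries a `health` field (`up`, `down`, or `unknown`). The rough sketch below counts targets by health using only `grep`. It assumes the compact JSON that Prometheus returns (no spaces after colons), and a JSON tool such as `jq` would be more robust. The function name and the `localhost:9090` address are assumptions.

```bash
# Count targets with the given health value in /api/v1/targets JSON on stdin.
count_targets_with_health() {
  grep -o "\"health\":\"$1\"" | wc -l | tr -d ' '
}

# Usage, assuming Prometheus is reachable on localhost:9090:
#   curl -sS http://localhost:9090/api/v1/targets | count_targets_with_health down
```

A non-zero count for `down` points back to the connectivity sections above: the scrape target name must resolve and the port must be reachable from the Prometheus container.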
179+
180+
Time range & refresh
181+
182+
- Select a window that includes your recent traffic (Last 5–15 minutes) and refresh the dashboard after sending test requests.
183+
184+
## Quick checklist
185+
186+
- Backends listen on 0.0.0.0; router uses a reachable IPv4 address (or k8s Service DNS that resolves to IPv4).
187+
- Host firewall allows the backend ports; cloud SG/ACL opened if applicable.
188+
- In Docker Compose, router ports are published (e.g., 9190 for /metrics, service port for API).
189+
- Prometheus targets for `semantic-router:9190` and `envoy-proxy:19000` are UP.
190+
- Send traffic that triggers the metrics you expect (cost/refusals) and adjust panel query mode (instant vs. range) where needed.

website/sidebars.ts

Lines changed: 2 additions & 0 deletions
@@ -113,6 +113,8 @@ const sidebars: SidebarsConfig = {
113113
label: 'Troubleshooting',
114114
items: [
115115
'troubleshooting/network-tips',
116+
'troubleshooting/container-connectivity',
117+
'troubleshooting/vsr-headers',
116118
],
117119
},
118120
],
