
Commit 0a08e67

docs: network tips (#208)
1 parent 53f52b1 commit 0a08e67


2 files changed: +195 -0 lines changed

Lines changed: 188 additions & 0 deletions
@@ -0,0 +1,188 @@
---
title: Network Tips
sidebar_label: Network Tips
---

This guide shows how to build and run in restricted or slow network environments without modifying repo files. You’ll use small local override files and a Compose override so the codebase stays clean.

What you’ll solve:

- Hugging Face model downloads that are blocked or slow
- Go module fetches that are blocked during the Docker build
- Blocked or slow PyPI access for the mock-vLLM test image

## TL;DR: Choose your path

- Fastest and most reliable: use local models in `./models` and skip Hugging Face network access entirely.
- Otherwise: mount an HF cache and set mirror env vars via a compose override.
- For building: use an override Dockerfile to set Go mirrors (examples provided).
- For mock-vllm: use an override Dockerfile to set a pip mirror (examples provided).

You can mix these based on your situation.

## 1. Hugging Face models

The router will download embedding models on first run unless you provide them locally. Prefer Option A if possible.

### Option A: Use local models (no external network)

1) Download the required model(s) with any reachable method (VPN/offline) into the repo’s `./models` folder. Example layout:

- `models/all-MiniLM-L12-v2/`
- `models/category_classifier_modernbert-base_model/`

2) In `config/config.yaml`, point to the local path. Example:

```yaml
bert_model:
  # point to a local folder under /app/models (already mounted by compose)
  model_id: /app/models/all-MiniLM-L12-v2
```

3) No extra environment variables are required. `docker-compose.yml` already mounts `./models:/app/models:ro`.

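Before starting the stack, it can help to confirm that every model folder you reference actually exists on the host. The following is a minimal sketch, assuming the example layout above; `check_local_models` is a hypothetical helper and not part of the repo.

```bash
# Hypothetical helper: report which expected model folders are present.
# Usage: check_local_models <models_dir> <model_name>...
# Returns 0 if every folder exists, 1 if any is missing.
check_local_models() {
  local models_dir="$1"
  shift
  local missing=0
  local name
  for name in "$@"; do
    if [ -d "$models_dir/$name" ]; then
      echo "ok: $models_dir/$name"
    else
      echo "missing: $models_dir/$name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For the layout shown earlier, a call might look like `check_local_models ./models all-MiniLM-L12-v2 category_classifier_modernbert-base_model`.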
### Option B: Use HF cache + mirror

Create a compose override to persist the cache and use a regional mirror (the example below uses a China mirror). Save it as `docker-compose.override.yml` in the repo root:

```yaml
services:
  semantic-router:
    volumes:
      - ~/.cache/huggingface:/root/.cache/huggingface
    environment:
      - HUGGINGFACE_HUB_CACHE=/root/.cache/huggingface
      - HF_HUB_ENABLE_HF_TRANSFER=1
      - HF_ENDPOINT=https://hf-mirror.com  # example mirror endpoint (China)
```

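If you switch between mirrors often, the override can be generated from a variable rather than edited by hand. This is only a sketch: `write_hf_override` is a hypothetical helper, and the endpoint you pass is whichever mirror is reachable from your network.

```bash
# Hypothetical helper: write a compose override for a given HF mirror endpoint.
# Usage: write_hf_override <endpoint-url> [output-file]
# Note: this overwrites the output file, including any other services in it.
write_hf_override() {
  local endpoint="$1"
  local out="${2:-docker-compose.override.yml}"
  # Unquoted heredoc: $endpoint is expanded, "~" is written literally.
  cat > "$out" <<EOF
services:
  semantic-router:
    volumes:
      - ~/.cache/huggingface:/root/.cache/huggingface
    environment:
      - HUGGINGFACE_HUB_CACHE=/root/.cache/huggingface
      - HF_HUB_ENABLE_HF_TRANSFER=1
      - HF_ENDPOINT=$endpoint
EOF
}
```

After writing the file, `docker compose config` prints the merged configuration, which is a quick way to confirm the environment entries landed on the `semantic-router` service.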
Optional: pre-warm the cache on the host. This needs Python and the `huggingface_hub` package, which the first command installs. Replace `repo_id` with the model your config references:

```bash
python -m pip install -U huggingface_hub
python - <<'PY'
import os
from huggingface_hub import snapshot_download
# "~" is not expanded by Python; cache_dir must match the folder mounted into the container.
snapshot_download(
    repo_id="sentence-transformers/all-MiniLM-L6-v2",
    cache_dir=os.path.expanduser("~/.cache/huggingface"),
)
PY
```

## 2. Build with Go mirrors (Dockerfile override)

When building `Dockerfile.extproc`, the Go stage may hang on `proxy.golang.org`. Create an override Dockerfile that enables mirrors without touching the original.

1) Create `Dockerfile.extproc.cn` at the repo root with this content:

```Dockerfile
# syntax=docker/dockerfile:1

FROM rust:1.85 AS rust-builder
RUN apt-get update && apt-get install -y make build-essential pkg-config && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY tools/make/ tools/make/
COPY Makefile ./
COPY candle-binding/Cargo.toml candle-binding/
COPY candle-binding/src/ candle-binding/src/
RUN make rust

FROM golang:1.24 AS go-builder
WORKDIR /app

# Go module mirrors (example: goproxy.cn)
ENV GOPROXY=https://goproxy.cn,direct
ENV GOSUMDB=sum.golang.google.cn

RUN mkdir -p src/semantic-router
COPY src/semantic-router/go.mod src/semantic-router/go.sum src/semantic-router/
COPY candle-binding/go.mod candle-binding/semantic-router.go candle-binding/

# Pre-download modules to fail fast if mirrors are unreachable
RUN cd src/semantic-router && go mod download && \
    cd /app/candle-binding && go mod download

COPY src/semantic-router/ src/semantic-router/
COPY --from=rust-builder /app/candle-binding/target/release/libcandle_semantic_router.so /app/candle-binding/target/release/

ENV CGO_ENABLED=1
ENV LD_LIBRARY_PATH=/app/candle-binding/target/release
RUN mkdir -p bin && cd src/semantic-router && go build -o ../../bin/router cmd/main.go

FROM quay.io/centos/centos:stream9
WORKDIR /app
COPY --from=go-builder /app/bin/router /app/extproc-server
COPY --from=go-builder /app/candle-binding/target/release/libcandle_semantic_router.so /app/lib/
COPY config/config.yaml /app/config/
ENV LD_LIBRARY_PATH=/app/lib
EXPOSE 50051
COPY scripts/entrypoint.sh /app/entrypoint.sh
RUN chmod +x /app/entrypoint.sh
ENTRYPOINT ["/app/entrypoint.sh"]
```

2) Point compose to the override Dockerfile by extending `docker-compose.override.yml`:

```yaml
services:
  semantic-router:
    build:
      dockerfile: Dockerfile.extproc.cn
```

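Because the override Dockerfile is a hand-maintained copy, it is easy to lose the mirror lines when re-syncing it with the original. A small check like the one below can catch that before a long build starts. It is a sketch under the assumption that the mirrors are set with `ENV` lines as in the example; `has_go_mirrors` is a hypothetical name.

```bash
# Hypothetical helper: succeed only if the Dockerfile sets both Go mirror variables
# via ENV instructions (the form used in the example Dockerfile above).
# Usage: has_go_mirrors <dockerfile>
has_go_mirrors() {
  local dockerfile="$1"
  grep -Eq '^[[:space:]]*ENV[[:space:]]+GOPROXY=' "$dockerfile" &&
    grep -Eq '^[[:space:]]*ENV[[:space:]]+GOSUMDB=' "$dockerfile"
}
```

A possible use is `has_go_mirrors Dockerfile.extproc.cn || echo "Go mirror settings missing"`.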
## 3. Mock vLLM (PyPI mirror via Dockerfile override)

For the optional testing profile, create an override Dockerfile to configure pip mirrors.

1) Create `tools/mock-vllm/Dockerfile.cn`:

```Dockerfile
FROM python:3.11-slim
WORKDIR /app
RUN apt-get update && apt-get install -y --no-install-recommends curl && rm -rf /var/lib/apt/lists/*

# Pip mirror (example: TUNA mirror in China)
RUN python -m pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple && \
    python -m pip config set global.trusted-host pypi.tuna.tsinghua.edu.cn

COPY requirements.txt /app/requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

COPY app.py /app/app.py
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```

2) Extend `docker-compose.override.yml` to use the override Dockerfile for `mock-vllm`:

```yaml
services:
  mock-vllm:
    build:
      dockerfile: Dockerfile.cn
```

## 4. Build and run

With the overrides in place, build and run as usual. Compose merges `docker-compose.override.yml` automatically when no `-f` flag is given; the commands below list both files explicitly so the merge order is unambiguous:

```bash
# Build all images with overrides
docker compose -f docker-compose.yml -f docker-compose.override.yml build

# Run router + envoy
docker compose -f docker-compose.yml -f docker-compose.override.yml up -d

# If you need the testing profile (mock-vllm)
docker compose -f docker-compose.yml -f docker-compose.override.yml --profile testing up -d
```

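Since the explicit `-f` form is long and a missing override file silently defeats the mirror setup, a thin wrapper can guard against both. The function below is a sketch, not something shipped with the repo; `compose_with_override` is a hypothetical name and it assumes you run it from the repo root.

```bash
# Hypothetical wrapper: run docker compose with the base file plus the override,
# and refuse to continue if the override file is missing.
# Usage: compose_with_override <compose args...>
compose_with_override() {
  local override="docker-compose.override.yml"
  if [ ! -f "$override" ]; then
    echo "override file not found: $override" >&2
    return 1
  fi
  docker compose -f docker-compose.yml -f "$override" "$@"
}
```

With this in your shell, `compose_with_override build` and `compose_with_override --profile testing up -d` correspond to the first and third commands above.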
## 5. Troubleshooting

- Go modules still time out:
  - Verify `GOPROXY` and `GOSUMDB` are present in the go-builder stage logs.
  - Try a clean build: `docker compose build --no-cache`.

- HF models still download slowly:
  - Prefer Option A (local models).
  - Ensure the cache volume is mounted and `HF_ENDPOINT`/`HF_HUB_ENABLE_HF_TRANSFER` are set.

- PyPI slow for mock-vllm:
  - Confirm that `tools/mock-vllm/Dockerfile.cn` is the Dockerfile used for the `mock-vllm` service; `docker compose config` prints the merged build settings.
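
When it is unclear which of the three mirrors is the problem, probing each endpoint from the host narrows it down quickly. The sketch below treats any HTTP response as "reachable" and only reports a failure when `curl` cannot connect or times out; `probe_mirrors` is a hypothetical helper, and the 10-second timeout is an arbitrary assumption.

```bash
# Hypothetical helper: print whether each URL answers within 10 seconds.
# Any HTTP response counts as reachable; connection errors and timeouts do not.
# Usage: probe_mirrors <url>...
probe_mirrors() {
  local url
  for url in "$@"; do
    if curl -s --max-time 10 -o /dev/null "$url" 2>/dev/null; then
      echo "reachable: $url"
    else
      echo "unreachable: $url"
    fi
  done
}
```

For the example mirrors in this guide, that could be `probe_mirrors https://hf-mirror.com https://goproxy.cn https://pypi.tuna.tsinghua.edu.cn/simple`. A result from the host does not guarantee the same result inside a build container, so treat it as a first check only.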

website/sidebars.js

Lines changed: 7 additions & 0 deletions
@@ -69,6 +69,13 @@ const sidebars = {
         'api/classification',
       ],
     },
+    {
+      type: 'category',
+      label: 'Troubleshooting',
+      items: [
+        'troubleshooting/network-tips',
+      ],
+    },
   ],
 }
