Commit f3bd449

add latency predictor build readme and update test (#1601)
* add latency predictor build readme
* update test dual server
1 parent db08577 commit f3bd449

2 files changed, +677 -1022 lines changed

latencypredictor-v1/README.md

Lines changed: 220 additions & 0 deletions
@@ -0,0 +1,220 @@
# Latency Predictor v1 - Build Guide

This directory contains the Latency Predictor v1 component, which uses a dual-server architecture (a training server and a prediction server). Use the provided `build-deploy.sh` script to build the container images and deploy them to Google Cloud Platform.

## Prerequisites

- Docker (latest version)
- Google Cloud SDK (`gcloud`) configured and authenticated
- Required files in this directory:
  - `training_server.py`
  - `prediction_server.py`
  - `requirements.txt`
  - `Dockerfile-training`
  - `Dockerfile-prediction`
  - `dual-server-deployment.yaml`
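The script's `check` subcommand verifies that these files exist, but its implementation is not shown in this guide. A minimal sketch of an equivalent pre-flight check, assuming it is given the directory to inspect (the names `REQUIRED_FILES` and `check_required_files` are hypothetical):

```bash
# Hypothetical pre-flight check; build-deploy.sh may implement this differently.
REQUIRED_FILES="training_server.py prediction_server.py requirements.txt
Dockerfile-training Dockerfile-prediction dual-server-deployment.yaml"

# Prints each missing file to stderr; returns 1 if any file is absent, else 0.
# $1 is the directory to inspect (defaults to the current directory).
check_required_files() {
  dir="${1:-.}"
  missing=0
  for f in $REQUIRED_FILES; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_required_files && echo "all required files present"` from this directory gives a quick answer before starting a build.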

**Optional (for deployment and testing):**
- `kubectl` configured for GKE cluster access

## Configuration

Before running the script, update the configuration variables in `build-deploy.sh`:

```bash
# Edit these values in the script
PROJECT_ID="your-gcp-project-id"
REGION="your-gcp-region"
REPOSITORY="your-artifact-registry-repo"
TRAINING_IMAGE="latencypredictor-v2-training-server"
PREDICTION_IMAGE="latencypredictor-v2-prediction-server"
TAG="latest"
```

## Usage

### Build and Push Images Only

```bash
# Make the script executable
chmod +x build-deploy.sh

# Build the images, then push them to the registry
./build-deploy.sh build
./build-deploy.sh push
```

### Complete Build and Deploy (Optional)

```bash
# Run the complete process (build, push, deploy, test)
# Note: This requires GKE cluster access
./build-deploy.sh all
```

### Individual Commands

```bash
# Check if all required files exist
./build-deploy.sh check

# Build Docker images only
./build-deploy.sh build

# Push images to Google Artifact Registry
./build-deploy.sh push

# Optional: Deploy to GKE cluster (requires cluster access)
./build-deploy.sh deploy

# Optional: Get service information and IPs
./build-deploy.sh info

# Optional: Test the deployed services
./build-deploy.sh test
```
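How the script maps these subcommands to phases is not shown in this README. A minimal sketch of one common structure, a `case` dispatcher, with placeholder functions standing in for the real work (every function name here is hypothetical):

```bash
# Placeholder phases; in a real script each would call docker, gcloud, or kubectl.
check_files()   { echo "check"; }
build_images()  { echo "build"; }
push_images()   { echo "push"; }
deploy_to_gke() { echo "deploy"; }
show_info()     { echo "info"; }
test_services() { echo "test"; }

# Runs the phase named by the first argument. `all` chains build, push,
# deploy, and test, stopping at the first phase that fails.
main() {
  case "${1:-}" in
    check)  check_files ;;
    build)  build_images ;;
    push)   push_images ;;
    deploy) deploy_to_gke ;;
    info)   show_info ;;
    test)   test_services ;;
    all)    build_images && push_images && deploy_to_gke && test_services ;;
    *)      echo "usage: {check|build|push|deploy|info|test|all}" >&2
            return 1 ;;
  esac
}
```

Chaining the phases with `&&` means a failed build never reaches the push or deploy step.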

## What the Script Does

### Build Phase (`./build-deploy.sh build`)
- Builds training server image from `Dockerfile-training`
- Builds prediction server image from `Dockerfile-prediction`
- Tags images for Google Artifact Registry
- Images created:
  - `latencypredictor-v2-training-server:latest`
  - `latencypredictor-v2-prediction-server:latest`

### Push Phase (`./build-deploy.sh push`)
- Configures Docker for Artifact Registry authentication
- Pushes both images to:
  - `us-docker.pkg.dev/PROJECT_ID/REPOSITORY/latencypredictor-v2-training-server:latest`
  - `us-docker.pkg.dev/PROJECT_ID/REPOSITORY/latencypredictor-v2-prediction-server:latest`
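The pushed paths follow the pattern `HOST/PROJECT_ID/REPOSITORY/IMAGE:TAG`. A minimal sketch of composing them from the configuration variables follows. `REGISTRY_HOST` and `image_uri` are hypothetical names, and the host is fixed to the `us-docker.pkg.dev` value shown above rather than derived from `REGION`, because this guide does not say how the script uses `REGION`.

```bash
# Placeholder values; replace with the settings from build-deploy.sh.
PROJECT_ID="your-gcp-project-id"
REPOSITORY="your-artifact-registry-repo"
TAG="latest"
REGISTRY_HOST="us-docker.pkg.dev"   # assumption: host copied from the paths above

# Prints the full registry path for the image name given as $1.
image_uri() {
  echo "${REGISTRY_HOST}/${PROJECT_ID}/${REPOSITORY}/${1}:${TAG}"
}

TRAINING_URI="$(image_uri latencypredictor-v2-training-server)"
PREDICTION_URI="$(image_uri latencypredictor-v2-prediction-server)"
echo "$TRAINING_URI"
echo "$PREDICTION_URI"
```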

### Deploy Phase (`./build-deploy.sh deploy`) - Optional
- Applies Kubernetes manifests from `dual-server-deployment.yaml`
- Waits for deployments to be ready (5-minute timeout)
- Creates services:
  - `training-service-external` (LoadBalancer)
  - `prediction-service` (LoadBalancer)
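The apply-and-wait behavior described above can be reproduced with two standard `kubectl` commands. A minimal sketch, assuming the deployments are named `training-server-deployment` and `prediction-server-deployment` (illustrative names only; check `dual-server-deployment.yaml` for the actual ones):

```bash
MANIFEST="dual-server-deployment.yaml"
# Assumption: these deployment names are illustrative, not read from the manifest.
DEPLOYMENTS="training-server-deployment prediction-server-deployment"

# Applies the manifest, then waits up to 5 minutes for each rollout to finish.
deploy() {
  kubectl apply -f "$MANIFEST" || return 1
  for d in $DEPLOYMENTS; do
    kubectl rollout status "deployment/$d" --timeout=300s || return 1
  done
}
```

`kubectl rollout status` exits with a non-zero status if the timeout expires, so `deploy` fails when either deployment does not become ready in time.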

### Test Phase (`./build-deploy.sh test`) - Optional
- Tests health endpoint: `/healthz`
- Tests prediction endpoint: `/predict` with sample data
- Sample prediction request:
  ```json
  {
    "kv_cache_percentage": 0.3,
    "input_token_length": 100,
    "num_request_waiting": 2,
    "num_request_running": 1,
    "num_tokens_generated": 50
  }
  ```
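The same two checks can be issued by hand with `curl`. A minimal sketch, assuming the prediction service is reachable over plain HTTP at an address you pass in; the function names are hypothetical, and the response format is not documented in this guide:

```bash
# Prints the sample request body shown above.
predict_payload() {
  cat <<'EOF'
{
  "kv_cache_percentage": 0.3,
  "input_token_length": 100,
  "num_request_waiting": 2,
  "num_request_running": 1,
  "num_tokens_generated": 50
}
EOF
}

# $1 is the service address, for example the LoadBalancer IP (add :PORT if needed).
check_health() {
  curl --fail --silent --show-error "http://$1/healthz"
}

# Sends the sample body as a POST; --data makes curl use POST, and @- reads
# the body from standard input.
send_prediction() {
  predict_payload | curl --fail --silent --show-error \
    -H "Content-Type: application/json" \
    --data @- "http://$1/predict"
}
```

With `--fail`, `curl` returns a non-zero exit status on HTTP error responses, which makes both helpers usable in scripts.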

## Setup Instructions

1. **Configure GCP Authentication**:
   ```bash
   gcloud auth login
   gcloud config set project YOUR_PROJECT_ID
   ```

2. **Configure kubectl for GKE (Optional - only needed for deployment)**:
   ```bash
   gcloud container clusters get-credentials CLUSTER_NAME --zone ZONE
   ```

3. **Update Script Configuration**:
   ```bash
   # Edit build-deploy.sh with your project details
   nano build-deploy.sh
   ```

4. **Build and Push Images**:
   ```bash
   ./build-deploy.sh build
   ./build-deploy.sh push
   ```

5. **Optional: Deploy and Test**:
   ```bash
   ./build-deploy.sh deploy
   ./build-deploy.sh test
   # Or run everything at once
   ./build-deploy.sh all
   ```

## Troubleshooting

### Permission Issues
```bash
chmod +x build-deploy.sh
```

### GCP Authentication
```bash
gcloud auth configure-docker us-docker.pkg.dev
```

### Check Cluster Access
```bash
kubectl cluster-info
kubectl get nodes
```

### View Service Status
```bash
./build-deploy.sh info
kubectl get services
kubectl get pods
```

### Check Logs
```bash
# Training server logs
kubectl logs -l app=training-server

# Prediction server logs
kubectl logs -l app=prediction-server
```

## Development Workflow

1. **Make code changes** to `training_server.py` or `prediction_server.py`
2. **Test locally** (optional):
   ```bash
   # If the servers run in the foreground, start each in its own terminal
   python training_server.py
   python prediction_server.py
   ```
3. **Build and push images**:
   ```bash
   ./build-deploy.sh build
   ./build-deploy.sh push
   ```

4. **Optional: Deploy and test**:
   ```bash
   ./build-deploy.sh deploy
   ./build-deploy.sh test
   ```

## Service Endpoints

After successful deployment:

- **Training Service**: External LoadBalancer IP (check with `./build-deploy.sh info`)
- **Prediction Service**: External LoadBalancer IP (check with `./build-deploy.sh info`)
- **Health Check**: `http://PREDICTION_IP/healthz`
- **Prediction API**: `http://PREDICTION_IP/predict` (POST)
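To query an address directly instead of reading it from the `info` output, `kubectl` can print a LoadBalancer's external IP with a JSONPath expression. A minimal sketch using the service names listed under the Deploy Phase (the helper name `service_ip` is hypothetical, and the output is empty until the load balancer has been assigned an address):

```bash
# Prints the external IP of the LoadBalancer service named by $1.
service_ip() {
  kubectl get service "$1" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}
```

For example, `PREDICTION_IP="$(service_ip prediction-service)"` stores the address for use in the URLs above.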

## Manual Build (Alternative)

If you need to build manually:

```bash
# Build training server
docker build -f Dockerfile-training -t training-server .

# Build prediction server
docker build -f Dockerfile-prediction -t prediction-server .
```
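These commands produce local images only. To publish a manually built image, tag it with its full registry path and push it. A minimal sketch, assuming the path layout shown in the Push Phase section and that Docker is already authenticated for the registry (`tag_and_push` is a hypothetical helper):

```bash
# $1: local image name, $2: full registry path including the tag.
tag_and_push() {
  docker tag "$1" "$2" || return 1
  docker push "$2"
}
```

For example: `tag_and_push training-server us-docker.pkg.dev/PROJECT_ID/REPOSITORY/latencypredictor-v2-training-server:latest`, with `PROJECT_ID` and `REPOSITORY` replaced by your own values.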
