| 1 | +# Latency Predictor v1 - Build Guide |
| 2 | + |
| 3 | +This directory contains the Latency Predictor v1 component, which uses a dual-server architecture (a training server and a prediction server). Use the provided `build-deploy.sh` script to build the container images, push them to Google Artifact Registry, and optionally deploy them to a GKE cluster. |
| 4 | + |
| 5 | +## Prerequisites |
| 6 | + |
| 7 | +- Docker (latest version) |
| 8 | +- Google Cloud SDK (`gcloud`) configured and authenticated |
| 9 | +- Required files in directory: |
| 10 | + - `training_server.py` |
| 11 | + - `prediction_server.py` |
| 12 | + - `requirements.txt` |
| 13 | + - `Dockerfile-training` |
| 14 | + - `Dockerfile-prediction` |
| 15 | + - `dual-server-deployment.yaml` |
| 16 | + |
| 17 | +**Optional (for deployment and testing):** |
| 18 | +- kubectl configured for GKE cluster access |
| 19 | + |
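The file list above can be verified before running any build step. The following is a minimal sketch, not the logic of `build-deploy.sh check` itself; the `missing_files` helper is a hypothetical name introduced only for illustration.

```bash
# Hypothetical helper (not part of build-deploy.sh): print every file name
# from the argument list that does not exist in the given directory.
missing_files() {
  dir="$1"
  shift
  for f in "$@"; do
    [ -f "${dir}/${f}" ] || printf '%s\n' "$f"
  done
}

# Check the current directory for the files named under Prerequisites.
missing="$(missing_files . training_server.py prediction_server.py requirements.txt \
  Dockerfile-training Dockerfile-prediction dual-server-deployment.yaml)"

if [ -n "$missing" ]; then
  printf 'Missing required files:\n%s\n' "$missing"
else
  echo "All required files are present."
fi
```

Run it from the directory that holds the Dockerfiles; any name it prints needs to be restored before building.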
| 20 | +## Configuration |
| 21 | + |
| 22 | +Before running the script, update the configuration variables in `build-deploy.sh`: |
| 23 | + |
| 24 | +```bash |
| 25 | +# Edit these values in the script |
| 26 | +PROJECT_ID="your-gcp-project-id" |
| 27 | +REGION="your-gcp-region" |
| 28 | +REPOSITORY="your-artifact-registry-repo" |
| 29 | +TRAINING_IMAGE="latencypredictor-v2-training-server" |
| 30 | +PREDICTION_IMAGE="latencypredictor-v2-prediction-server" |
| 31 | +TAG="latest" |
| 32 | +``` |
| 33 | + |
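As a rough illustration of how these variables might combine into a full image reference, the sketch below assembles one path per image. It assumes the registry host is `us-docker.pkg.dev`, as shown in the Push Phase section; `REGISTRY_HOST` and `image_uri` are hypothetical names, and the actual script may derive the host differently (for example from `REGION`).

```bash
# Placeholder values mirroring the configuration block above.
PROJECT_ID="your-gcp-project-id"
REPOSITORY="your-artifact-registry-repo"
TAG="latest"

# Assumption: the registry host matches the one listed under Push Phase.
REGISTRY_HOST="us-docker.pkg.dev"

# Hypothetical helper: print the full Artifact Registry reference for an image name.
image_uri() {
  printf '%s/%s/%s/%s:%s\n' "$REGISTRY_HOST" "$PROJECT_ID" "$REPOSITORY" "$1" "$TAG"
}

image_uri "latencypredictor-v2-training-server"
image_uri "latencypredictor-v2-prediction-server"
```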
| 34 | +## Usage |
| 35 | + |
| 36 | +### Build and Push Images Only |
| 37 | + |
| 38 | +```bash |
| 39 | +# Make script executable |
| 40 | +chmod +x build-deploy.sh |
| 41 | + |
| 42 | +# Build and push images to registry |
| 43 | +./build-deploy.sh build |
| 44 | +./build-deploy.sh push |
| 45 | +``` |
| 46 | + |
| 47 | +### Complete Build and Deploy (Optional) |
| 48 | + |
| 49 | +```bash |
| 50 | +# Run complete process (build, push, deploy, test) |
| 51 | +# Note: This requires GKE cluster access |
| 52 | +./build-deploy.sh all |
| 53 | +``` |
| 54 | + |
| 55 | +### Individual Commands |
| 56 | + |
| 57 | +```bash |
| 58 | +# Check if all required files exist |
| 59 | +./build-deploy.sh check |
| 60 | + |
| 61 | +# Build Docker images only |
| 62 | +./build-deploy.sh build |
| 63 | + |
| 64 | +# Push images to Google Artifact Registry |
| 65 | +./build-deploy.sh push |
| 66 | + |
| 67 | +# Optional: Deploy to GKE cluster (requires cluster access) |
| 68 | +./build-deploy.sh deploy |
| 69 | + |
| 70 | +# Optional: Get service information and IPs |
| 71 | +./build-deploy.sh info |
| 72 | + |
| 73 | +# Optional: Test the deployed services |
| 74 | +./build-deploy.sh test |
| 75 | +``` |
| 76 | + |
| 77 | +## What the Script Does |
| 78 | + |
| 79 | +### Build Phase (`./build-deploy.sh build`) |
| 80 | +- Builds training server image from `Dockerfile-training` |
| 81 | +- Builds prediction server image from `Dockerfile-prediction` |
| 82 | +- Tags images for Google Artifact Registry |
| 83 | +- Images created: |
| 84 | + - `latencypredictor-v2-training-server:latest` |
| 85 | + - `latencypredictor-v2-prediction-server:latest` |
| 86 | + |
| 87 | +### Push Phase (`./build-deploy.sh push`) |
| 88 | +- Configures Docker for Artifact Registry authentication |
| 89 | +- Pushes both images to Artifact Registry, where `PROJECT_ID` and `REPOSITORY` are the values set in the script: |
| 90 | + - `us-docker.pkg.dev/PROJECT_ID/REPOSITORY/latencypredictor-v2-training-server:latest` |
| 91 | + - `us-docker.pkg.dev/PROJECT_ID/REPOSITORY/latencypredictor-v2-prediction-server:latest` |
| 92 | + |
| 93 | +### Deploy Phase (`./build-deploy.sh deploy`) - Optional |
| 94 | +- Applies Kubernetes manifests from `dual-server-deployment.yaml` |
| 95 | +- Waits for deployments to be ready (5-minute timeout) |
| 96 | +- Creates services: |
| 97 | + - `training-service-external` (LoadBalancer) |
| 98 | + - `prediction-service` (LoadBalancer) |
| 99 | + |
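One common way to wait for deployments with a 5-minute limit is `kubectl rollout status` with `--timeout`. The sketch below only prints the commands so they can be reviewed first. The deployment names are hypothetical (check `dual-server-deployment.yaml` for the real ones), and the script may use a different wait mechanism.

```bash
# Hypothetical helper: print a rollout-wait command for one deployment.
# The timeout defaults to 300 seconds (5 minutes).
rollout_cmd() {
  printf 'kubectl rollout status deployment/%s --timeout=%ss\n' "$1" "${2:-300}"
}

# Assumed deployment names, for illustration only.
rollout_cmd training-server-deployment
rollout_cmd prediction-server-deployment
```

Copy a printed command into a terminal with cluster access to run it; it exits non-zero if the rollout does not finish within the timeout.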
| 100 | +### Test Phase (`./build-deploy.sh test`) - Optional |
| 101 | +- Tests health endpoint: `/healthz` |
| 102 | +- Tests prediction endpoint: `/predict` with sample data |
| 103 | +- Sample prediction request: |
| 104 | + ```json |
| 105 | + { |
| 106 | + "kv_cache_percentage": 0.3, |
| 107 | + "input_token_length": 100, |
| 108 | + "num_request_waiting": 2, |
| 109 | + "num_request_running": 1, |
| 110 | + "num_tokens_generated": 50 |
| 111 | + } |
| 112 | + ``` |
| 113 | + |
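The two checks above can also be reproduced by hand with `curl`. This is a minimal sketch, not the script's own test code: it assumes the prediction service accepts JSON over plain HTTP on the default port, and `sample_payload`, `smoke_test`, and the `PREDICTION_IP` environment variable are names chosen for illustration.

```bash
# Print the sample request body shown above.
sample_payload() {
  cat <<'EOF'
{
  "kv_cache_percentage": 0.3,
  "input_token_length": 100,
  "num_request_waiting": 2,
  "num_request_running": 1,
  "num_tokens_generated": 50
}
EOF
}

# Hit the health endpoint first, then POST the sample payload to /predict.
smoke_test() {
  base_url="$1"
  curl -fsS "${base_url}/healthz" || return 1
  curl -fsS -X POST -H 'Content-Type: application/json' \
    -d "$(sample_payload)" "${base_url}/predict"
}

# Only contact the service when an address has been supplied.
if [ -n "${PREDICTION_IP:-}" ]; then
  smoke_test "http://${PREDICTION_IP}"
else
  echo "Set PREDICTION_IP to run the smoke test."
fi
```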
| 114 | +## Setup Instructions |
| 115 | + |
| 116 | +1. **Configure GCP Authentication**: |
| 117 | + ```bash |
| 118 | + gcloud auth login |
| 119 | + gcloud config set project YOUR_PROJECT_ID |
| 120 | + ``` |
| 121 | + |
| 122 | +2. **Configure kubectl for GKE (Optional - only needed for deployment)**: |
| 123 | + ```bash |
| 124 | + gcloud container clusters get-credentials CLUSTER_NAME --zone ZONE |
| 125 | + ``` |
| 126 | + |
| 127 | +3. **Update Script Configuration**: |
| 128 | + ```bash |
| 129 | + # Edit build-deploy.sh with your project details |
| 130 | + nano build-deploy.sh |
| 131 | + ``` |
| 132 | + |
| 133 | +4. **Build Images**: |
| 134 | + ```bash |
| 135 | + ./build-deploy.sh build |
| 136 | + ./build-deploy.sh push |
| 137 | + ``` |
| 138 | + |
| 139 | +5. **Optional: Deploy and Test**: |
| 140 | + ```bash |
| 141 | + ./build-deploy.sh deploy |
| 142 | + ./build-deploy.sh test |
| 143 | + # Or run everything at once |
| 144 | + ./build-deploy.sh all |
| 145 | + ``` |
| 146 | + |
| 147 | +## Troubleshooting |
| 148 | + |
| 149 | +### Permission Issues |
| 150 | +```bash |
| 151 | +chmod +x build-deploy.sh |
| 152 | +``` |
| 153 | + |
| 154 | +### GCP Authentication |
| 155 | +```bash |
| 156 | +gcloud auth configure-docker us-docker.pkg.dev |
| 157 | +``` |
| 158 | + |
| 159 | +### Check Cluster Access |
| 160 | +```bash |
| 161 | +kubectl cluster-info |
| 162 | +kubectl get nodes |
| 163 | +``` |
| 164 | + |
| 165 | +### View Service Status |
| 166 | +```bash |
| 167 | +./build-deploy.sh info |
| 168 | +kubectl get services |
| 169 | +kubectl get pods |
| 170 | +``` |
| 171 | + |
| 172 | +### Check Logs |
| 173 | +```bash |
| 174 | +# Training server logs |
| 175 | +kubectl logs -l app=training-server |
| 176 | + |
| 177 | +# Prediction server logs |
| 178 | +kubectl logs -l app=prediction-server |
| 179 | +``` |
| 180 | + |
| 181 | +## Development Workflow |
| 182 | + |
| 183 | +1. **Make code changes** to `training_server.py` or `prediction_server.py` |
| 184 | +2. **Test locally** (optional): |
| 185 | + ```bash |
| 186 | + python training_server.py |
| 187 | + python prediction_server.py |
| 188 | + ``` |
| 189 | +3. **Build and push images**: |
| 190 | + ```bash |
| 191 | + ./build-deploy.sh build |
| 192 | + ./build-deploy.sh push |
| 193 | + ``` |
| 194 | + |
| 195 | +4. **Optional: Deploy and test**: |
| 196 | + ```bash |
| 197 | + ./build-deploy.sh deploy |
| 198 | + ./build-deploy.sh test |
| 199 | + ``` |
| 200 | + |
| 201 | +## Service Endpoints |
| 202 | + |
| 203 | +After successful deployment: |
| 204 | + |
| 205 | +- **Training Service**: External LoadBalancer IP (check with `./build-deploy.sh info`) |
| 206 | +- **Prediction Service**: External LoadBalancer IP (check with `./build-deploy.sh info`) |
| 207 | +- **Health Check**: `http://PREDICTION_IP/healthz` |
| 208 | +- **Prediction API**: `http://PREDICTION_IP/predict` (POST) |
| 209 | + |
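If `./build-deploy.sh info` is not available, the external IP can usually be read straight from the Service object. The sketch below uses the service names listed under Deploy Phase; `external_ip` and `endpoint_url` are hypothetical helper names, and the address in the example is a documentation-only placeholder.

```bash
# Read the LoadBalancer IP of a service (empty until the IP has been assigned).
external_ip() {
  kubectl get service "$1" -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Build an endpoint URL from an IP address and a path.
endpoint_url() {
  printf 'http://%s%s\n' "$1" "$2"
}

# Example with a placeholder address from the reserved 192.0.2.0/24 range.
endpoint_url 192.0.2.10 /healthz
```

With cluster access, `PREDICTION_IP="$(external_ip prediction-service)"` captures the real address, and `external_ip training-service-external` does the same for the training service.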
| 210 | +## Manual Build (Alternative) |
| 211 | + |
| 212 | +To build the images without the script (the resulting tags are local only and are not pushed): |
| 213 | + |
| 214 | +```bash |
| 215 | +# Build training server |
| 216 | +docker build -f Dockerfile-training -t training-server . |
| 217 | + |
| 218 | +# Build prediction server |
| 219 | +docker build -f Dockerfile-prediction -t prediction-server . |
| 220 | +``` |
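Manually built images carry only the local tags `training-server` and `prediction-server`. To publish them they would need to be retagged with a registry path. The sketch below prints the `docker tag` and `docker push` commands for review rather than running them. `REGISTRY_PATH` and `publish_cmds` are hypothetical names, and `PROJECT_ID` and `REPOSITORY` remain placeholders to fill in.

```bash
# Assumption: same registry path as listed under Push Phase; replace the placeholders.
REGISTRY_PATH="us-docker.pkg.dev/PROJECT_ID/REPOSITORY"

# Hypothetical helper: print the tag and push commands for one local image.
publish_cmds() {
  local_tag="$1"
  remote_name="$2"
  printf 'docker tag %s %s/%s:latest\n' "$local_tag" "$REGISTRY_PATH" "$remote_name"
  printf 'docker push %s/%s:latest\n' "$REGISTRY_PATH" "$remote_name"
}

publish_cmds training-server latencypredictor-v2-training-server
publish_cmds prediction-server latencypredictor-v2-prediction-server
```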