Commit 858dd50
feat: add mock vLLM infrastructure for lightweight e2e testing (#228)
* feat: add mock vLLM infrastructure for lightweight e2e testing
This commit introduces a mock vLLM server infrastructure to enable e2e
testing without requiring GPU resources. The mock infrastructure simulates
intelligent routing behavior while maintaining compatibility with the
existing semantic router.
Key changes:
- Add mock-vllm-server.py: Simulates vLLM OpenAI-compatible API with
intelligent content-based routing (math queries → TinyLlama, general → Qwen)
- Add start-mock-servers.sh: Launch mock servers in foreground mode
- Update config.yaml: Add minimal vLLM endpoint configuration for
Qwen (port 8000) and TinyLlama (port 8001) with smart routing preference
- Update 00-client-request-test.py: Fix import path and use configured model
- Update e2e-tests/README.md: Document mock infrastructure usage
- Update build-run-test.mk: Add mock server management targets
The mock infrastructure enables:
- Fast e2e testing without GPU dependencies
- Content-aware model selection simulation
- vLLM API compatibility testing
- Smart routing behavior validation
Signed-off-by: Yossi Ovadia <[email protected]>
* feat: replace mock vLLM infrastructure with LLM Katan package
Replace the mock vLLM server with a real FastAPI-based implementation using HuggingFace transformers and tiny models. The new LLM Katan package provides actual inference while maintaining lightweight testing benefits.
Key changes:
- Add complete LLM Katan PyPI package (v0.1.4) under e2e-tests/
- FastAPI server with OpenAI-compatible endpoints (/v1/chat/completions, /v1/models, /health, /metrics)
- Real Qwen/Qwen3-0.6B model with name aliasing for multi-model testing
- Enhanced logging and Prometheus metrics endpoint
- CLI tool with comprehensive configuration options
- Replace start-mock-servers.sh with start-llm-katan.sh
- Update e2e-tests README with new LLM Katan usage instructions
- Remove obsolete mock-vllm-server.py and start-mock-servers.sh
Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
* docs: add HuggingFace token setup instructions to LLM Katan README
Add comprehensive setup section covering HuggingFace token requirements with three authentication methods:
- Environment variable (HUGGINGFACE_HUB_TOKEN)
- CLI login (huggingface-cli login)
- Token file in home directory
Explains why token is needed (private models, rate limits, reliable downloads) and provides direct link to HuggingFace token settings.
Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
* fix: add Python build artifacts to .gitignore
- Add dist/, build/, *.egg-info/, *.whl to ignore Python build outputs
- Prevents accidentally committing generated files
Signed-off-by: Yossi Ovadia <[email protected]>
* refactor: separate e2e and production configs
- Create config.e2e.yaml with LLM Katan endpoints for e2e tests
- Restore config.yaml to original production endpoints (matches origin/main)
- Add run-router-e2e target to use e2e config (config/config.e2e.yaml)
- Add start-llm-katan and test-e2e-vllm targets for LLM Katan testing
- Update Makefile help with new e2e test targets
- Remove egg-info directory from git tracking (now in .gitignore)
- Keep pyproject.toml at stable version 0.1.4, always install latest via pip
This separation allows:
- Production config stays clean with real vLLM endpoints
- E2E tests use lightweight LLM Katan servers
- Clear distinction between test and production environments
- Always use latest LLM Katan features via unpinned pip installation
Signed-off-by: Yossi Ovadia <[email protected]>
* fix: update e2e test to use model from config.e2e.yaml
- Change test model from 'gemma3:27b' to 'Qwen/Qwen2-0.5B-Instruct'
- Ensures Envoy health check uses model available in e2e config
- Fixes 503 errors when checking if Envoy proxy is running
Signed-off-by: Yossi Ovadia <[email protected]>
* Update llm-katan package metadata
- Bump version to 0.1.6 for PyPI publishing
- Change license from MIT to Apache-2.0
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
* Fix Apache license classifier in pyproject.toml
- Update license classifier from MIT to Apache Software License
- Bump version to 0.1.7 for corrected license display on PyPI
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
* fix: resolve pre-commit hook failures
- Fix markdown linting issues (MD032, MD031, MD047) in README files
- Remove binary distribution files from git tracking
- Add Python build artifacts to .gitignore
- Auto-format Python files with black and isort
- Add CLAUDE.md exclusion to prevent future commits
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
* fix: update llm-katan project URLs to vllm-project repository
Update repository URLs in pyproject.toml to point to the correct vllm-project
organization instead of personal fork.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
* fix: revert config.yaml to original main branch version
Revert production config.yaml to original state from main branch.
The config modifications were not intended for this PR and should
remain unchanged to preserve production configuration.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
* fix: restore config.yaml to match upstream main exactly
Copy config.yaml from upstream main to ensure it matches exactly
and includes the health_check_path and other missing fields.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
---------
Signed-off-by: Yossi Ovadia <[email protected]>
Co-authored-by: Claude <[email protected]>1 parent f1ec72b commit 858dd50
File tree
17 files changed
+1847
-24
lines changed- config
- e2e-tests
- llm-katan
- llm_katan
- tools/make
17 files changed
+1847
-24
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
13 | 13 | | |
14 | 14 | | |
15 | 15 | | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
16 | 23 | | |
17 | 24 | | |
18 | 25 | | |
| |||
117 | 124 | | |
118 | 125 | | |
119 | 126 | | |
120 | | - | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
30 | 30 | | |
31 | 31 | | |
32 | 32 | | |
33 | | - | |
| 33 | + | |
34 | 34 | | |
35 | 35 | | |
36 | 36 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
| 200 | + | |
| 201 | + | |
| 202 | + | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
| 210 | + | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
| 216 | + | |
| 217 | + | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
| 234 | + | |
| 235 | + | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
| 242 | + | |
| 243 | + | |
| 244 | + | |
| 245 | + | |
| 246 | + | |
| 247 | + | |
| 248 | + | |
| 249 | + | |
| 250 | + | |
| 251 | + | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
| 264 | + | |
| 265 | + | |
| 266 | + | |
| 267 | + | |
| 268 | + | |
| 269 | + | |
| 270 | + | |
| 271 | + | |
| 272 | + | |
| 273 | + | |
| 274 | + | |
| 275 | + | |
| 276 | + | |
| 277 | + | |
| 278 | + | |
| 279 | + | |
| 280 | + | |
| 281 | + | |
| 282 | + | |
| 283 | + | |
| 284 | + | |
| 285 | + | |
| 286 | + | |
| 287 | + | |
| 288 | + | |
| 289 | + | |
| 290 | + | |
| 291 | + | |
| 292 | + | |
| 293 | + | |
| 294 | + | |
| 295 | + | |
| 296 | + | |
| 297 | + | |
| 298 | + | |
| 299 | + | |
| 300 | + | |
| 301 | + | |
| 302 | + | |
| 303 | + | |
| 304 | + | |
| 305 | + | |
| 306 | + | |
| 307 | + | |
| 308 | + | |
| 309 | + | |
| 310 | + | |
| 311 | + | |
| 312 | + | |
| 313 | + | |
| 314 | + | |
| 315 | + | |
| 316 | + | |
| 317 | + | |
| 318 | + | |
| 319 | + | |
| 320 | + | |
| 321 | + | |
| 322 | + | |
| 323 | + | |
| 324 | + | |
| 325 | + | |
| 326 | + | |
| 327 | + | |
| 328 | + | |
| 329 | + | |
| 330 | + | |
| 331 | + | |
| 332 | + | |
| 333 | + | |
| 334 | + | |
| 335 | + | |
| 336 | + | |
| 337 | + | |
0 commit comments