feat: enable E2E testing with LLM Katan - 00-client-request-test #290
Conversation
feat: enable E2E testing with LLM Katan and fix configuration

- Remove Ollama dependencies from E2E config as requested
- Update config.e2e.yaml to use only LLM Katan models (Qwen/Qwen2-0.5B-Instruct, TinyLlama/TinyLlama-1.1B-Chat-v1.0)
- Fix bash 3.2 compatibility in start-llm-katan.sh (replace associative arrays)
- Add required use_reasoning fields to all model entries for validation
- Fix zero scores in model configurations (0.0 → 0.1)

Testing Status:

- ✅ Router: Successfully starts with E2E config (ExtProc on :50051, API on :8080)
- ✅ LLM Katan: Running on ports 8000/8001 with correct model mapping
- ✅ Envoy: Running on port 8801
- ✅ Test: 00-client-request-test.py passes with 200 OK responses
- ✅ Pipeline: Full end-to-end flow working (Client → Envoy → ExtProc → LLM Katan)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
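As a quick smoke test of that pipeline, a request through Envoy on port 8801 should come back 200 OK. A minimal sketch, assuming LLM Katan exposes the standard OpenAI-compatible /v1/chat/completions endpoint (the prompt is illustrative):

```bash
# End-to-end smoke test: Client -> Envoy (:8801) -> ExtProc -> LLM Katan.
# Prints only the HTTP status code; expect 200.
curl -s -o /dev/null -w "%{http_code}\n" \
  http://localhost:8801/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen/Qwen2-0.5B-Instruct",
        "messages": [{"role": "user", "content": "Hello"}]
      }'
```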
✅ Deploy Preview for vllm-semantic-router ready!
👥 vLLM Semantic Team Notification: the following members have been identified for the changed files in this PR and have been automatically assigned.
@yossiovadia can you run pre-commit to fix the lint error?
e2e-tests/start-llm-katan.sh (Outdated)

```bash
# Format: "port:real_model::served_model_name"
LLM_KATAN_MODELS=(
  "8000:Qwen/Qwen3-0.6B::Qwen/Qwen2-0.5B-Instruct"
  "8001:Qwen/Qwen3-0.6B::TinyLlama/TinyLlama-1.1B-Chat-v1.0"
)
```
Qwen2-0.5B is an odd name :D
Maybe we can just use Model-A, Model-B?
agree.. fixin`
fix: apply pre-commit formatting fixes

Apply black and isort formatting to LLM Katan Python files as required by pre-commit hooks.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
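For reference, this fix is the standard pre-commit invocation; the exact hook set (black, isort, and the markdown lint applied in the next commit) comes from the repo's .pre-commit-config.yaml:

```bash
# Run every configured pre-commit hook against the whole repository;
# formatting hooks such as black and isort rewrite files in place.
pre-commit run --all-files
```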
refactor: simplify model names to Model-A and Model-B for E2E testing

- Update LLM Katan configuration to use simplified model names
- Simplify 00-client-request-test.py to use Model-A as default
- Update documentation to reflect math → Model-B, creative → Model-A routing
- Improve test readability and maintainability

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
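The new routing split (math → Model-B, creative → Model-A) can be spot-checked by sending two contrasting prompts and reading back the `model` field of each response. A hedged sketch, assuming the router echoes the routed model in the OpenAI-style response; the prompts and grep extraction are illustrative:

```bash
# Send a math prompt and a creative prompt; print which model served each.
for prompt in "What is 7 * 8?" "Write a short poem about the sea."; do
  body=$(curl -s http://localhost:8801/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"Model-A\", \"messages\": [{\"role\": \"user\", \"content\": \"$prompt\"}]}")
  # Pull the "model" field out of the JSON response (assumes no embedded quotes).
  model=$(printf '%s' "$body" | grep -o '"model"[^,}]*' | head -n 1)
  echo "$prompt -> $model"
done
```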
fix: apply pre-commit formatting fixes

- Fix markdown linting issues in CLAUDE.md files
- Apply black formatting to Python files

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
This is the first basic validation test built on the new llm-katan infrastructure.

Testing Status: covered by the ✅ checklist in the PR description above.
Release Notes: No