feat: Add production-stack profile for E2E testing framework #767

liavweiss · 2025-12-03T10:47:51Z

Description

This PR implements the production-stack profile for the E2E testing framework, enabling comprehensive testing of Semantic Router in production-grade vLLM stack environments with high availability, load balancing, and observability features.
Closes #657

Background

The E2E testing framework introduced in #655 provides an extensible profile-based architecture. This PR adds a production-stack profile to test Semantic Router deployment and functionality in production-grade vLLM stack environments, including:

- Multi-replica deployments for high availability
- Load balancing across replicas
- Failover testing during active traffic
- Performance and throughput validation
- Resource utilization monitoring with Prometheus

Implementation

New Files

e2e/profiles/production-stack/profile.go (482 lines)
- Implements the Profile interface for production-stack testing
- 7-step setup process:
  1. Deploy Semantic Router (initial 1 replica)
  2. Deploy Envoy Gateway
  3. Deploy Envoy AI Gateway (CRDs + Controller)
  4. Deploy Demo LLM and Gateway API Resources
  5. Scale deployments for HA/LB (2 replicas each)
  6. Deploy Prometheus for monitoring
  7. Verify all components are ready
- Comprehensive teardown with proper resource cleanup
- Service configuration for Envoy Gateway integration
e2e/profiles/production-stack/values.yaml (169 lines)
- Minimal Semantic Router configuration optimized for HA/LB/Monitoring tests
- Includes all required classifiers (domain, PII, jailbreak)
- Semantic cache configuration
- Metrics and observability settings enabled
- Base model with LoRA adapters configuration
e2e/profiles/production-stack/prometheus-config.yaml (55 lines)
- Prometheus scrape configuration for:
  - Semantic Router metrics endpoints
  - Kubernetes pods and nodes
  - Service discovery for multiple namespaces

Key Features

High Availability Setup

- Deploys Semantic Router with 1 replica initially, then scales it to 2 replicas
- Scales demo LLM (vllm-llama3-8b-instruct) to 2 replicas
- Verifies all replicas are healthy before proceeding

Load Balancing

- Configures Envoy Gateway for request distribution
- Uses Gateway API resources for routing
- Service discovery with proper label selectors
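One way to check that requests are actually distributed across replicas is to count how many requests each replica served and compare each share against an even split. The sketch below assumes the per-replica counts have already been collected (for example from pod logs or metrics). `isBalanced`, the replica names, and the 20% tolerance are illustrative assumptions, not values taken from this PR.

```go
package main

import "fmt"

// isBalanced reports whether every expected replica served a share of the
// requests within `tolerance` (a fraction, e.g. 0.2 for 20%) of an even split.
// counts maps a replica identifier to the number of requests it served.
func isBalanced(counts map[string]int, replicas int, tolerance float64) bool {
	// A replica that served nothing is missing from the map, so a size
	// mismatch already means traffic was not spread across all replicas.
	if replicas <= 0 || len(counts) != replicas {
		return false
	}
	total := 0
	for _, c := range counts {
		total += c
	}
	if total == 0 {
		return false
	}
	expected := float64(total) / float64(replicas)
	for _, c := range counts {
		dev := (float64(c) - expected) / expected
		if dev < 0 {
			dev = -dev
		}
		if dev > tolerance {
			return false
		}
	}
	return true
}

func main() {
	even := map[string]int{"router-a": 52, "router-b": 48}
	skewed := map[string]int{"router-a": 90, "router-b": 10}

	fmt.Println("even split balanced:", isBalanced(even, 2, 0.2))
	fmt.Println("skewed split balanced:", isBalanced(skewed, 2, 0.2))
}
```

A tolerance is needed because short test runs rarely produce a perfectly even split. The right value depends on request volume and the load-balancing policy in use.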

Observability

- Deploys Prometheus with custom configuration
- Scrapes metrics from Semantic Router endpoints
- Monitors Kubernetes pods and nodes
- Configures RBAC for Prometheus service account
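To confirm that Prometheus is scraping the Semantic Router endpoints, a test could run an instant query for the built-in `up` metric through the Prometheus HTTP API (`/api/v1/query`) and count the targets reporting `1`. The sketch below is an assumption-laden illustration: the job name `semantic-router` is hypothetical (the real name would come from prometheus-config.yaml), and a local fake server stands in for Prometheus so the example runs on its own.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// queryResponse mirrors the subset of the Prometheus instant-query response
// needed here. Each sample value is a [timestamp, "value-as-string"] pair.
type queryResponse struct {
	Status string `json:"status"`
	Data   struct {
		Result []struct {
			Metric map[string]string `json:"metric"`
			Value  [2]interface{}    `json:"value"`
		} `json:"result"`
	} `json:"data"`
}

// countUpTargets queries up{job="<job>"} and counts series whose value is "1".
func countUpTargets(baseURL, job string) (int, error) {
	q := url.Values{"query": {fmt.Sprintf(`up{job=%q}`, job)}}
	resp, err := http.Get(baseURL + "/api/v1/query?" + q.Encode())
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("prometheus returned %s", resp.Status)
	}
	var qr queryResponse
	if err := json.NewDecoder(resp.Body).Decode(&qr); err != nil {
		return 0, err
	}
	if qr.Status != "success" {
		return 0, fmt.Errorf("query status %q", qr.Status)
	}
	up := 0
	for _, r := range qr.Data.Result {
		if v, ok := r.Value[1].(string); ok && v == "1" {
			up++
		}
	}
	return up, nil
}

// fakePrometheus serves a canned response so the sketch runs without a cluster.
func fakePrometheus(body string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, body)
	}))
}

func main() {
	srv := fakePrometheus(`{"status":"success","data":{"resultType":"vector","result":[
		{"metric":{"job":"semantic-router","instance":"10.0.0.1:9190"},"value":[1733222871,"1"]},
		{"metric":{"job":"semantic-router","instance":"10.0.0.2:9190"},"value":[1733222871,"1"]}]}}`)
	defer srv.Close()

	up, err := countUpTargets(srv.URL, "semantic-router") // hypothetical job name
	fmt.Println("targets up:", up, "err:", err)
}
```

In a real run, `baseURL` would point at the Prometheus service (for example through a port-forward), and the expected count would match the number of Semantic Router replicas.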

Test Coverage

The profile includes both standard functional tests and production-specific tests:

Standard Tests:

- chat-completions-request
- chat-completions-stress-request
- domain-classify
- semantic-cache
- pii-detection
- jailbreak-detection
- chat-completions-progressive-stress

Production Stack Specific Tests:

- `multi-replica-health`: Verify all replicas are healthy
- `load-balancing-verification`: Test request distribution across replicas
- `failover-during-traffic`: Verify graceful failover when a replica fails
- `performance-throughput`: Measure throughput and latency under load
- `resource-utilization-monitoring`: Check CPU, memory, and GPU utilization

Testing

Expected Behavior

- All 7 setup steps complete successfully
- All deployments reach ready state (2 replicas each)
- Prometheus starts scraping metrics
- All test cases pass (when implemented)
- Teardown cleans up all resources

Acceptance Criteria

✅ Production-stack profile directory structure created
✅ Profile interface implemented with Setup/Teardown
✅ Multi-replica deployment configuration
✅ Prometheus monitoring integration
✅ Service configuration for Envoy Gateway
✅ Comprehensive error handling and logging
✅ Resource cleanup in teardown
✅ Profile registration in main.go
✅ Production-specific test cases implementation
✅ CI integration
✅ Documentation updates

netlify · 2025-12-03T10:47:59Z

✅ Deploy Preview for vllm-semantic-router ready!

Name	Link
🔨 Latest commit	`584fd79`
🔍 Latest deploy log	https://app.netlify.com/projects/vllm-semantic-router/deploys/69301dac5e49c40008564f75
😎 Deploy Preview	https://deploy-preview-767--vllm-semantic-router.netlify.app

To edit notification comments on pull requests, go to your Netlify project configuration.

github-actions · 2025-12-03T10:56:17Z

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 `Root Directory`

Owners: @rootfs, @Xunzhuo
Files changed:

.github/workflows/integration-test-k8s.yml

📁 `e2e`

Owners: @Xunzhuo
Files changed:

e2e/README.md
e2e/cmd/e2e/main.go
e2e/profiles/production-stack/profile.go
e2e/profiles/production-stack/prometheus-config.yaml
e2e/profiles/production-stack/values.yaml
e2e/testcases/failover_during_traffic.go
e2e/testcases/load_balancing_verification.go
e2e/testcases/multi_replica_health.go
e2e/testcases/performance_throughput.go
e2e/testcases/resource_utilization_monitoring.go

📁 `tools`

Owners: @yuluo-yx, @rootfs, @Xunzhuo
Files changed:

tools/make/e2e.mk

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

Signed-off-by: Liav Weiss <[email protected]>

liavweiss requested review from Xunzhuo and rootfs as code owners December 3, 2025 10:47

liavweiss force-pushed the feature/production-stack-profile branch from 50a8798 to 7c22480 Compare December 3, 2025 10:50

liavweiss changed the title ~~[E2E] Add production-stack profile for E2E testing framework~~ feat: Add production-stack profile for E2E testing framework Dec 3, 2025

github-actions bot assigned rootfs and Xunzhuo Dec 3, 2025

liavweiss marked this pull request as draft December 3, 2025 11:22

[e2e] Add production-stack profile

584fd79

Signed-off-by: Liav Weiss <[email protected]>

liavweiss force-pushed the feature/production-stack-profile branch from 7c22480 to 584fd79 Compare December 3, 2025 11:23

liavweiss marked this pull request as ready for review December 3, 2025 13:44
