Commit def0459

feat: Add istio profile for E2E testing framework
This PR implements the Istio profile for the E2E testing framework as requested in issue vllm-project#656.

## What Changed

### Core Implementation (Issue vllm-project#656 Requirements)

- **New Istio Profile** (`e2e/profiles/istio/`)
  - Implements 5-stage deployment: Istio control plane, namespace configuration, Semantic Router with sidecar injection, Istio Gateway/VirtualService/DestinationRule, and environment verification
  - Robust error handling with panic recovery and defer-based cleanup
  - Configurable Istio version via `ISTIO_VERSION` environment variable (default: 1.28.0)
  - Comprehensive service health verification with 60-second stabilization period
- **4 Istio-Specific Test Cases** (100% passing)
  1. `istio-sidecar-health-check` - Verify Envoy sidecar injection and health
  2. `istio-traffic-routing` - Test routing through Istio ingress gateway
  3. `istio-mtls-verification` - Verify mutual TLS configuration and certificates
  4. `istio-tracing-observability` - Validate distributed tracing and metrics
- **Integration & Documentation**
  - Registered profile in `e2e/cmd/e2e/main.go`
  - Added comprehensive documentation to `e2e/README.md`
  - Integrated into CI matrix (`.github/workflows/integration-test-k8s.yml`)
  - Updated Make targets help text (`tools/make/e2e.mk`)

### Out-of-Scope Changes (Justified)

#### Helper Function Enhancement (`e2e/pkg/helpers/kubernetes.go`)

- **Change**: Added `GetServiceByLabelInNamespace()` with backward-compatible wrapper
- **Why**: Original `GetEnvoyServiceName()` was hardcoded to `"envoy-gateway-system"`, preventing reuse for Istio which uses `"istio-system"` namespace
- **Approach**: Backward-compatible wrapper pattern
  - Old function `GetEnvoyServiceName(ctx, client, labelSelector, verbose)` still works
  - Calls new generic function `GetServiceByLabelInNamespace()` with hardcoded namespace
  - Istio uses the new flexible function directly
- **Impact**:
  - **ZERO changes** to existing profiles (ai-gateway, aibrix, dynamic-config)
  - Makes helper function generic and reusable for future profiles
  - Marked old function as deprecated for future cleanup
- **Safety**: Existing profiles continue to work unchanged, Istio gets flexibility it needs

## Test Coverage

This PR implements the **4 Istio-specific tests** required by issue vllm-project#656:

- ✅ Basic health check with Istio sidecar
- ✅ Traffic routing through Istio gateway
- ✅ mTLS verification
- ✅ Request tracing and observability

**Test Results**: 4/4 passing (100%)

**Note on Signal-Decision Engine Tests**: Signal-decision engine tests (introduced in PR vllm-project#695) are intentionally excluded to maintain focus on Istio integration validation. These tests validate semantic router logic (already covered by ai-gateway, aibrix, and dynamic-config profiles) rather than Istio mesh functionality. Following the established pattern where PR vllm-project#695 retroactively added signal-decision tests to existing profiles, these can be added to the Istio profile in a follow-up PR.

**Note on Kubernetes Version**: CI uses Kind v0.22.0 which defaults to Kubernetes 1.29.2, meeting Istio 1.28+ compatibility requirements without requiring explicit version pinning.

## Implementation Highlights

- 5-stage deployment with comprehensive error handling
- Panic recovery with defer-based cleanup
- Service health verification before tests (prevents 503 errors)
- 60-second stabilization period for mesh readiness
- Complete Istio resource cleanup in teardown
- All Kubernetes resources properly namespaced
- Configurable Istio version for testing flexibility

## Testing Done

- ✅ All 4 Istio test cases pass successfully
- ✅ Verified cluster lifecycle management (create/cleanup)
- ✅ Confirmed Istio sidecar injection and health
- ✅ Validated traffic routing through Istio gateway
- ✅ Verified mTLS configuration and certificates
- ✅ Confirmed distributed tracing headers and metrics
- ✅ Tested with Istio 1.28.0 and CI's default Kubernetes version

## Makefile Targets

```bash
make e2e-test E2E_PROFILE=istio                        # Run all Istio tests
make e2e-test E2E_PROFILE=istio E2E_VERBOSE=1          # Run with verbose output
ISTIO_VERSION=1.28.0 make e2e-test E2E_PROFILE=istio   # Custom Istio version
```

Resolves vllm-project#656

Signed-off-by: Asaad Balum <[email protected]>
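The `ISTIO_VERSION` override mentioned in the commit message can be sketched roughly as follows. This is a minimal illustration, not the profile's actual code: the helper name `istioVersion` and its `lookup` parameter are hypothetical, and only the variable name and the 1.28.0 default come from the message above.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// defaultIstioVersion is the default stated in the commit message.
const defaultIstioVersion = "1.28.0"

// istioVersion is a hypothetical helper. It returns the value of
// ISTIO_VERSION as reported by lookup and falls back to the default
// when the variable is unset or blank. Taking the lookup function as
// a parameter keeps the helper testable without touching the real
// process environment.
func istioVersion(lookup func(string) (string, bool)) string {
	if v, ok := lookup("ISTIO_VERSION"); ok {
		if v = strings.TrimSpace(v); v != "" {
			return v
		}
	}
	return defaultIstioVersion
}

func main() {
	// os.LookupEnv distinguishes "unset" from "set to empty".
	fmt.Println("Istio version:", istioVersion(os.LookupEnv))
}
```

Treating a blank value the same as an unset one is a design assumption here; it avoids passing an empty version string on to the installer.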
1 parent c3ce62e commit def0459

File tree

11 files changed

+1494
-10
lines changed

.github/workflows/integration-test-k8s.yml

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ jobs:
     strategy:
       fail-fast: false # Continue testing other profiles even if one fails
       matrix:
-        profile: [ai-gateway, aibrix]
+        profile: [ai-gateway, aibrix, istio]
 
     steps:
       - name: Check out the repo

e2e/README.md

Lines changed: 134 additions & 1 deletion
@@ -14,7 +14,7 @@ The framework follows a **separation of concerns** design:
 
 - **ai-gateway**: Tests Semantic Router with Envoy AI Gateway integration
 - **aibrix**: Tests Semantic Router with vLLM AIBrix integration
-- **istio**: Tests Semantic Router with Istio Gateway (future)
+- **istio**: Tests Semantic Router with Istio service mesh integration
 - **production-stack**: Tests vLLM Production Stack configurations (future)
 - **llm-d**: Tests with LLM-D (future)
 - **dynamo**: Tests with Nvidia Dynamo (future)
@@ -517,3 +517,136 @@ func (p *Profile) GetServiceConfig() framework.ServiceConfig {
 ```
 
 See `profiles/ai-gateway/` for a complete example.
+
+## Profile Details
+
+### Istio Profile
+
+The Istio profile tests Semantic Router deployment and functionality in an Istio service mesh environment.
+
+**What it Tests:**
+
+- Istio sidecar injection and health
+- Traffic routing through Istio ingress gateway
+- Mutual TLS (mTLS) between services
+- Distributed tracing and observability
+
+**Prerequisites:**
+
+- `istioctl` must be installed on the system
+- Docker and Kind (managed by E2E framework)
+
+**Components Deployed:**
+
+1. **Istio Control Plane** (`istio-system` namespace):
+   - `istiod` - Istio control plane
+   - `istio-ingressgateway` - Ingress gateway for external traffic
+
+2. **Semantic Router** (`semantic-router` namespace):
+   - Deployed via Helm with Istio sidecar injection enabled
+   - Namespace labeled with `istio-injection=enabled`
+
+3. **Istio Resources**:
+   - `Gateway` - Configures ingress gateway on port 80
+   - `VirtualService` - Routes traffic to Semantic Router service
+   - `DestinationRule` - Enables mTLS with `ISTIO_MUTUAL` mode
+
+**Test Cases:**
+
+| Test Case | Description | What it Validates |
+|-----------|-------------|-------------------|
+| `istio-sidecar-health-check` | Verify Envoy sidecar injection | - Istio-proxy container exists<br>- Sidecar is healthy and ready<br>- Namespace has `istio-injection=enabled` label |
+| `istio-traffic-routing` | Test routing through Istio gateway | - Gateway and VirtualService exist<br>- Requests route correctly to Semantic Router<br>- Istio/Envoy headers present in responses |
+| `istio-mtls-verification` | Verify mutual TLS configuration | - DestinationRule has `ISTIO_MUTUAL` mode<br>- mTLS certificates present in istio-proxy<br>- PeerAuthentication policy (if configured) |
+| `istio-tracing-observability` | Check distributed tracing and metrics | - Trace headers propagated<br>- Envoy metrics exposed<br>- Telemetry configuration<br>- Access logs enabled |
+
+**Usage:**
+
+```bash
+# Run all Istio tests
+make e2e-test E2E_PROFILE=istio
+
+# Run specific Istio tests
+make e2e-test-specific E2E_PROFILE=istio E2E_TESTS="istio-sidecar-health-check,istio-mtls-verification"
+
+# Run with verbose output
+./bin/e2e -profile istio -verbose
+
+# Keep cluster for debugging
+make e2e-test E2E_PROFILE=istio E2E_KEEP_CLUSTER=true
+```
+
+**Architecture:**
+
+```
+┌─────────────────────────────────────────┐
+│  Istio Ingress Gateway                  │
+│  (istio-system namespace)               │
+│  Port 80 → semantic-router service      │
+└────────────┬────────────────────────────┘
+
+
+┌─────────────────────────────────────────┐
+│  Semantic Router Pod                    │
+│  (semantic-router namespace)            │
+│  ┌─────────────┐  ┌──────────────────┐  │
+│  │ Main        │  │ Istio-Proxy      │  │
+│  │ Container   │◄─┤ (Envoy Sidecar)  │  │
+│  │             │  │                  │  │
+│  │ :8801       │  │ mTLS, Tracing    │  │
+│  └─────────────┘  └──────────────────┘  │
+└─────────────────────────────────────────┘
+
+
+┌─────────────────────────────────────────┐
+│  Istiod (Control Plane)                 │
+│  - Config distribution                  │
+│  - Certificate management (mTLS)        │
+│  - Sidecar injection                    │
+└─────────────────────────────────────────┘
+```
+
+**Key Features Tested:**
+
+- **Automatic Sidecar Injection**: Istio automatically injects Envoy proxy sidecars into pods
+- **Traffic Management**: Requests route through Istio Gateway → VirtualService → Semantic Router
+- **Security (mTLS)**: Automatic mutual TLS encryption and authentication between services
+- **Observability**: Distributed tracing, metrics collection, and access logs
+- **Service Mesh Integration**: Semantic Router operates correctly within Istio mesh
+
+**Setup Steps (Automated by Profile):**
+
+1. Install Istio control plane using `istioctl install`
+2. Create namespace with `istio-injection=enabled` label
+3. Deploy Semantic Router via Helm (sidecar auto-injected)
+4. Create Istio Gateway and VirtualService for traffic routing
+5. Create DestinationRule for mTLS configuration
+6. Verify all components are ready
+
+**Troubleshooting:**
+
+If tests fail, check:
+
+```bash
+# Check Istio installation
+kubectl get pods -n istio-system
+
+# Check sidecar injection
+kubectl get pods -n semantic-router -o jsonpath='{.items[*].spec.containers[*].name}'
+
+# Check Istio resources
+kubectl get gateway,virtualservice,destinationrule -n semantic-router
+
+# Check mTLS configuration
+kubectl get destinationrule semantic-router -n semantic-router -o yaml
+
+# View Istio proxy logs
+kubectl logs -n semantic-router <pod-name> -c istio-proxy
+```
+
+**Related Resources:**
+
+- [Istio Documentation](https://istio.io/latest/docs/)
+- [Istio Traffic Management](https://istio.io/latest/docs/concepts/traffic-management/)
+- [Istio Security (mTLS)](https://istio.io/latest/docs/concepts/security/)
+- [Istio Observability](https://istio.io/latest/docs/concepts/observability/)
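
The "Check sidecar injection" command in the README's troubleshooting block prints container names separated by spaces. A minimal Go sketch of the check one could run on that output, assuming the sidecar container is named `istio-proxy` as the README states. The function name `hasIstioSidecar` and the `semantic-router` container name in the example are hypothetical and are not taken from the profile's code.

```go
package main

import (
	"fmt"
	"strings"
)

// hasIstioSidecar reports whether the space-separated container names
// (the shape printed by the kubectl jsonpath query) include the
// "istio-proxy" sidecar. It is an illustrative helper, not part of
// the E2E framework.
func hasIstioSidecar(containerNames string) bool {
	for _, name := range strings.Fields(containerNames) {
		if name == "istio-proxy" {
			return true
		}
	}
	return false
}

func main() {
	// "semantic-router" is an assumed name for the main container.
	fmt.Println(hasIstioSidecar("semantic-router istio-proxy")) // true
	fmt.Println(hasIstioSidecar("semantic-router"))             // false
}
```

Because the jsonpath query flattens the containers of every pod into one list, this only shows that at least one pod has a sidecar; a per-pod check would need the names grouped by pod.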

e2e/cmd/e2e/main.go

Lines changed: 4 additions & 3 deletions
@@ -12,10 +12,12 @@ import (
 	aigateway "github.com/vllm-project/semantic-router/e2e/profiles/ai-gateway"
 	aibrix "github.com/vllm-project/semantic-router/e2e/profiles/aibrix"
 	dynamicconfig "github.com/vllm-project/semantic-router/e2e/profiles/dynamic-config"
+	istio "github.com/vllm-project/semantic-router/e2e/profiles/istio"
 
 	// Import profiles to register test cases
 	_ "github.com/vllm-project/semantic-router/e2e/profiles/ai-gateway"
 	_ "github.com/vllm-project/semantic-router/e2e/profiles/aibrix"
+	_ "github.com/vllm-project/semantic-router/e2e/profiles/istio"
 )
 
 const version = "v1.0.0"
@@ -103,9 +105,8 @@ func getProfile(name string) (framework.Profile, error) {
 		return dynamicconfig.NewProfile(), nil
 	case "aibrix":
 		return aibrix.NewProfile(), nil
-	// Add more profiles here as they are implemented
-	// case "istio":
-	// return istio.NewProfile(), nil
+	case "istio":
+		return istio.NewProfile(), nil
 	default:
 		return nil, fmt.Errorf("unknown profile: %s", name)
 	}
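
The dispatch this hunk extends can be exercised in isolation with stand-in types. The sketch below is an assumption-laden illustration: `Profile` and `stubProfile` are hypothetical replacements for `framework.Profile` and the real profile constructors, whose definitions are not part of this diff, and only the two case labels visible in the hunk are reproduced.

```go
package main

import "fmt"

// Profile is a hypothetical stand-in for framework.Profile; the real
// interface is not shown in this commit.
type Profile interface {
	Name() string
}

// stubProfile replaces the real aibrix.NewProfile() and
// istio.NewProfile() constructors for illustration only.
type stubProfile struct{ name string }

func (p stubProfile) Name() string { return p.name }

// getProfile mirrors the shape of the switch in e2e/cmd/e2e/main.go:
// known names return a profile, anything else returns the same
// "unknown profile" error as the diff.
func getProfile(name string) (Profile, error) {
	switch name {
	case "aibrix":
		return stubProfile{name: "aibrix"}, nil
	case "istio":
		return stubProfile{name: "istio"}, nil
	default:
		return nil, fmt.Errorf("unknown profile: %s", name)
	}
}

func main() {
	for _, name := range []string{"istio", "linkerd"} {
		p, err := getProfile(name)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("selected profile:", p.Name())
	}
}
```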

e2e/pkg/helpers/kubernetes.go

Lines changed: 9 additions & 3 deletions
@@ -38,22 +38,28 @@ func CheckDeployment(ctx context.Context, client *kubernetes.Clientset, namespac
 
 // GetEnvoyServiceName finds the Envoy service name in the envoy-gateway-system namespace
 // using label selectors to match the Gateway-owned service
+// Deprecated: Use GetServiceByLabelInNamespace for more flexibility
 func GetEnvoyServiceName(ctx context.Context, client *kubernetes.Clientset, labelSelector string, verbose bool) (string, error) {
-	services, err := client.CoreV1().Services("envoy-gateway-system").List(ctx, metav1.ListOptions{
+	return GetServiceByLabelInNamespace(ctx, client, "envoy-gateway-system", labelSelector, verbose)
+}
+
+// GetServiceByLabelInNamespace finds a service by label selector in a specific namespace
+func GetServiceByLabelInNamespace(ctx context.Context, client *kubernetes.Clientset, namespace string, labelSelector string, verbose bool) (string, error) {
+	services, err := client.CoreV1().Services(namespace).List(ctx, metav1.ListOptions{
 		LabelSelector: labelSelector,
 	})
 	if err != nil {
 		return "", fmt.Errorf("failed to list services with selector %s: %w", labelSelector, err)
 	}
 
 	if len(services.Items) == 0 {
-		return "", fmt.Errorf("no service found with selector %s in envoy-gateway-system namespace", labelSelector)
+		return "", fmt.Errorf("no service found with selector %s in %s namespace", labelSelector, namespace)
 	}
 
 	// Return the first matching service (should only be one)
 	serviceName := services.Items[0].Name
 	if verbose {
-		fmt.Printf("[Helper] Found Envoy service: %s (matched by labels: %s)\n", serviceName, labelSelector)
+		fmt.Printf("[Helper] Found service: %s (matched by labels: %s in namespace: %s)\n", serviceName, labelSelector, namespace)
 	}
 
 	return serviceName, nil
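
The backward-compatible wrapper in this hunk can be tried out without a cluster. The sketch below keeps the two function names and error messages from the diff but simplifies the signatures: the `serviceLister` type and `fakeLister` are hypothetical stand-ins for the client-go `Services(namespace).List(...)` call, and the service names and label selectors in the fake data are assumptions for illustration.

```go
package main

import "fmt"

// serviceLister is a hypothetical stand-in for the client-go list call;
// it returns the names of services matching a selector in a namespace.
type serviceLister func(namespace, labelSelector string) ([]string, error)

// GetServiceByLabelInNamespace is the generic lookup: the namespace is
// a parameter instead of a hardcoded string.
func GetServiceByLabelInNamespace(list serviceLister, namespace, labelSelector string) (string, error) {
	names, err := list(namespace, labelSelector)
	if err != nil {
		return "", fmt.Errorf("failed to list services with selector %s: %w", labelSelector, err)
	}
	if len(names) == 0 {
		return "", fmt.Errorf("no service found with selector %s in %s namespace", labelSelector, namespace)
	}
	// Return the first match, as the original helper does.
	return names[0], nil
}

// GetEnvoyServiceName keeps its old call shape and delegates with the
// namespace it previously hardcoded, so existing callers are unaffected.
//
// Deprecated: use GetServiceByLabelInNamespace.
func GetEnvoyServiceName(list serviceLister, labelSelector string) (string, error) {
	return GetServiceByLabelInNamespace(list, "envoy-gateway-system", labelSelector)
}

// fakeLister simulates a cluster with one service per namespace and
// ignores the selector. The service names are made up for the example.
func fakeLister(namespace, _ string) ([]string, error) {
	services := map[string][]string{
		"envoy-gateway-system": {"envoy-gw"},
		"istio-system":         {"istio-ingressgateway"},
	}
	return services[namespace], nil
}

func main() {
	old, _ := GetEnvoyServiceName(fakeLister, "app=envoy")
	fmt.Println("existing caller:", old)

	gw, _ := GetServiceByLabelInNamespace(fakeLister, "istio-system", "istio=ingressgateway")
	fmt.Println("istio caller:", gw)
}
```

Keeping the old function as a one-line delegate means the namespace default lives in exactly one place, which is what lets the existing profiles stay untouched.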
