Skip to content

Conversation

@noalimoy
Copy link
Contributor

@noalimoy noalimoy commented Dec 8, 2025

Summary

This PR optimizes the Docker Compose integration test workflow by replacing the heavy full-stack quickstart.sh deployment with a lightweight CI-specific configuration.

Fixes #777

Problem

The current workflow .github/workflows/integration-test-docker.yml relies on scripts/quickstart.sh, which triggers a full docker compose launch with 11+ services. This results in:

  • Pulling many heavy container images from multiple remote registries
  • No persistent Docker cache on GitHub-hosted runners (all images fetched every run)
  • Frequent stalls or timeouts during docker pull
  • Unstable and slow CI runs

Solution

Introduce a minimal docker-compose.ci.yml that strips nonessential services and replaces UI-based validation with simple curl health checks.

Changes

File Change
deploy/docker-compose/docker-compose.ci.yml New - Minimal CI compose with only 3 essential services
tools/make/docker.mk Added CI-specific make targets
.github/workflows/integration-test-docker.yml Updated to use CI compose instead of quickstart.sh

Before vs After

Before (Full Stack - 11+ services)

semantic-router, envoy, mock-vllm, jaeger, prometheus,
grafana, openwebui, chat-ui, pipelines, mongo,
llm-katan, dashboard (built from source)
  • Timeout: 30 minutes
  • Images: Multiple registries (Docker Hub + GHCR)
  • Validation: UI-based tests

After (CI Minimal - 3 services)

semantic-router, envoy, llm-katan
  • Timeout: 20 minutes
  • Images: Primarily GHCR (2/3 services)
  • Validation: Simple curl health checks

Removed Services (Not Needed for CI)

Service Reason for Removal
grafana UI monitoring - not needed for CI testing
prometheus Metrics collection - not needed for CI
jaeger Tracing - not needed for CI
openwebui Chat UI - not needed for CI
chat-ui Chat UI - not needed for CI
pipelines UI pipelines - not needed for CI
mongo Database for chat-ui - not needed for CI
dashboard Admin UI - not needed for CI

New Make Targets

make docker-compose-up-ci      # Start minimal CI services
make docker-compose-down-ci    # Stop CI services
make docker-compose-logs-ci    # View logs
make docker-compose-ps-ci      # Check status

Test Results

CI Workflow Passed: GitHub Actions Run #5

Test Status
semantic-router health (localhost:8080/health) ✅ Pass
envoy proxy ready (localhost:19000/ready) ✅ Pass
llm-katan health (localhost:8002/health) ✅ Pass
Chat completions routing (localhost:8801/v1/chat/completions) ✅ Pass
Total Duration ~3 minutes

Testing

  • Local testing with make docker-compose-up-ci
  • All curl health checks pass
  • Chat completions routing works end-to-end
  • CI workflow passes on GitHub Actions

Replace heavy quickstart.sh full-stack deployment with lightweight
CI-specific docker-compose configuration.

Changes:
- Add docker-compose.ci.yml with only 3 essential services
  (semantic-router, envoy, llm-katan) instead of 11+ services
- Remove UI services (grafana, openwebui, chat-ui, prometheus,
  jaeger, dashboard, mongo, pipelines) - not needed for CI testing
- Replace UI-based validation with simple curl health checks
- Add make targets: docker-compose-{up,down,logs,ps}-ci
- Reduce CI timeout from 30 to 20 minutes

This fixes frequent CI timeouts caused by pulling many heavy
container images from multiple registries on GitHub-hosted runners
which have no persistent Docker cache.

Fixes: vllm-project#777
Signed-off-by: Noa Limoy <[email protected]>
@netlify
Copy link

netlify bot commented Dec 8, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit a444ace
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/6936e7a563bc3d0008b8fea3
😎 Deploy Preview https://deploy-preview-786--vllm-semantic-router.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@github-actions
Copy link

github-actions bot commented Dec 8, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 Root Directory

Owners: @rootfs, @Xunzhuo
Files changed:

  • .github/workflows/integration-test-docker.yml

📁 deploy

Owners: @rootfs, @Xunzhuo
Files changed:

  • deploy/docker-compose/docker-compose.ci.yml

📁 tools

Owners: @yuluo-yx, @rootfs, @Xunzhuo
Files changed:

  • tools/make/docker.mk

vLLM

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@rootfs rootfs merged commit 6bb6ebc into vllm-project:main Dec 8, 2025
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Workflow: Excessive docker pull operations in Actions cause timeouts and unstable CI runs

4 participants