Closed
77 commits
b77f53e
feat: add config validation to NewCacheBackend (#204)
cryo-zd Sep 23, 2025
cfa4648
docs: add note around model name consistency (#205)
Xunzhuo Sep 24, 2025
9038583
Add security attributes related to root usage to container definition…
fcanogab Sep 24, 2025
7b7fd8c
docs: add run precommit by docker or podman (#218)
yuluo-yx Sep 24, 2025
5df0421
fix: docker compose testing profile with mock-vllm failed to IPv4 val…
JaredforReal Sep 24, 2025
a495dbb
docs: network tips (#208)
JaredforReal Sep 24, 2025
f0468a5
feat: set up Grafana and Prometheus for Observability and Monitoring …
JaredforReal Sep 24, 2025
22a7d49
project: add promotion rules (#212)
Xunzhuo Sep 25, 2025
1e8f2a0
feat: validate eviction policy in cache config (#223)
cryo-zd Sep 25, 2025
91bdc6c
docs: add tutorials for semantic cache (#230)
Xunzhuo Sep 26, 2025
ffd964d
refactor: reogranize the contents (#235)
Xunzhuo Sep 26, 2025
4e526e1
docs: k8s quickstart and observability with k8s (#225)
JaredforReal Sep 26, 2025
9fb1003
feat: when run test-vllm, get model from openai models api (#236)
yuluo-yx Sep 26, 2025
92d2e09
infra: cache models in test-and-build GHA (#237)
yuluo-yx Sep 26, 2025
c6ef0ce
infra: fix models cache GHA (#238)
yuluo-yx Sep 26, 2025
03ab529
feat: add mock vLLM infrastructure for lightweight e2e testing (#228)
yossiovadia Sep 26, 2025
4b3426e
LLM-Katan Terminal animation demo in the readme files (#240)
yossiovadia Sep 26, 2025
5a1c0a5
optimize: use openai go sdk ChatCompletion replace map struct (#246)
yuluo-yx Sep 27, 2025
283d261
chore: correct misplaced comment for struct UnifiedClassifier (#247)
cryo-zd Sep 27, 2025
72510e5
fix: LoRA Model Training Configuration and Data Balance (#233)
OneZero-Y Sep 27, 2025
2e5c9df
infra: add GHA restore key (#244)
yuluo-yx Sep 27, 2025
8826465
perf: optimize FindSimilarTools by early pruning (#248)
cryo-zd Sep 27, 2025
dd2eb88
metrics: Add TTFT/TPOT p95 dashboard (#250)
tao12345666333 Sep 27, 2025
db3fb2e
feat: enhance terminal demo with improved layout and OpenAI compatibi…
yossiovadia Sep 27, 2025
87dec7d
ci: avoid HF 429 on PRs by caching models and downloading minimal mod…
tao12345666333 Sep 27, 2025
32056e2
ci: support running docker-release in upper case user fork (#258)
Xunzhuo Sep 28, 2025
3e2a95c
feat: add multi-architecture support for Envoy and Golang installatio…
Aias00 Sep 28, 2025
7c1f2c0
feat: support domain level auto system prompt injection (#257)
Xunzhuo Sep 28, 2025
a969e56
Bug fix: Envoy ext_proc 500 error when both value and raw_value are s…
ztang2370 Sep 28, 2025
c344f5b
feat: support running vsr in kubernetes environment (#245)
Xunzhuo Sep 28, 2025
761636c
metrics: TTFT in streaming mode (#203)
tao12345666333 Sep 28, 2025
27cab60
feat: containerize and auto-release llm-katan (#259)
Xunzhuo Sep 28, 2025
f92ae57
Add unit test to ensure header mutations only set one of Value or Raw…
ztang2370 Sep 28, 2025
b704e54
docs(style): add theme switching to the document website (#221)
yuluo-yx Sep 29, 2025
fe0b59c
Use Docsaurus style for admonitions in install-doc (#262)
windsonsea Sep 29, 2025
9924d7a
feat: support respond vsr decision in header (#273)
Xunzhuo Sep 29, 2025
a19965c
fix: force install hf_transfer to avoid missing pkg (#287)
rootfs Sep 29, 2025
7972808
Update README.md (#289)
yossiovadia Sep 29, 2025
717ec4a
test: add test for ToolsDatabase (#284)
cryo-zd Sep 29, 2025
98dd248
docs: add mermaid modal (#288)
yuluo-yx Sep 29, 2025
8bb3c60
feat: enable E2E testing with LLM Katan - 00-client-request-test (#290)
yossiovadia Sep 29, 2025
cfeff07
feat: implement comprehensive ExtProc testing with cache bypass (#292)
yossiovadia Sep 29, 2025
0cade8c
feat: support /v1/models in direct response (#283)
Xunzhuo Sep 30, 2025
f1b4911
feat: add stream mode support (#282)
AkisAya Sep 30, 2025
bf16479
feat: support injection system prompt response header (#297)
Xunzhuo Sep 30, 2025
6d04f92
docs: Fix documentation links in README.md (#298)
danchev Oct 1, 2025
7e7d3bf
feat: add Grafana+Prometheus in k8s (#294)
JaredforReal Oct 1, 2025
247d994
chore: update misplaced comments (#300)
cryo-zd Oct 1, 2025
f95fde0
e2e test: 02-router-classification: verify router classification (#302)
yossiovadia Oct 1, 2025
efd5291
03 classification api test (#304)
yossiovadia Oct 1, 2025
8c05d98
docs: use ts replace js in docs website (#299)
yuluo-yx Oct 1, 2025
88c3b20
chore: enhance Docker workflows with Buildx and QEMU setup (#307)
Aias00 Oct 2, 2025
0ede82a
fix: broken link in readme (#316)
Xunzhuo Oct 2, 2025
cd327f1
chore: optimize Docker CI workflow for faster builds and multi-archit…
Aias00 Oct 4, 2025
d88ad89
feat: add fast build workflow for development and update test-and-bui…
Aias00 Oct 4, 2025
5f3a31c
feat: add open webui pipe (#315)
Xunzhuo Oct 2, 2025
050da19
feat: add system prompt toggle endpoint (#301)
rootfs Oct 2, 2025
3cd3754
Fix/improve batch classification test (#319)
yossiovadia Oct 2, 2025
56b8f70
fix: use unified classifier in intent classification API when availab…
yossiovadia Oct 2, 2025
0992cce
feat: add CI test for k8s core deployment (#317)
JaredforReal Oct 2, 2025
d3c767b
Fix Envoy container health check by replacing wget with curl (#323)
Copilot Oct 2, 2025
d8ce468
Fix API silent failures and add OpenAPI 3.0 spec with Swagger UI (#326)
Copilot Oct 3, 2025
961ffe8
Add OpenTelemetry Distributed Tracing for Fine-Grained Observability …
Copilot Oct 3, 2025
b94437b
fix: use both unified and legacy classifier to prevent failure (#332)
rootfs Oct 3, 2025
1605fd0
fix: use classification unit test (#333)
rootfs Oct 3, 2025
c60ba22
feat: add comprehensive PII detection test suite (#334)
yossiovadia Oct 3, 2025
23ffb32
Feature/add jailbreak detection test (#331)
yossiovadia Oct 3, 2025
a6dae30
Feature/improve pii extproc testing (#335)
yossiovadia Oct 3, 2025
1b7b097
feat(app): add direct execution support for local development (#341)
FeiDaLI Oct 4, 2025
2a323a1
docs: add mermaid modal (#288)
yuluo-yx Sep 29, 2025
8a54856
docs: use ts replace js in docs website (#299)
yuluo-yx Oct 1, 2025
3f4ed62
chore: optimize Docker CI workflow for faster builds and multi-archit…
Aias00 Oct 4, 2025
deb8233
feat: add fast build workflow for development and update test-and-bui…
Aias00 Oct 4, 2025
9e0fca8
chore: optimize Docker CI workflow for faster builds and multi-archit…
Aias00 Oct 4, 2025
5b5f6ea
feat: add fast build workflow for development and update test-and-bui…
Aias00 Oct 4, 2025
fb1d864
Merge branch 'main' into feat/reduce_ci_duration
Aias00 Oct 4, 2025
7517f26
Merge branch 'main' into feat/reduce_ci_duration
Aias00 Oct 4, 2025
204 changes: 119 additions & 85 deletions .github/workflows/docker-publish.yml
@@ -5,106 +5,140 @@ on:
workflow_call:
inputs:
tag_suffix:
description: 'Custom tag suffix for the Docker image'
description: "Custom tag suffix for the Docker image"
required: false
type: string
default: ''
default: ""
is_nightly:
description: 'Whether this is a nightly build'
required: false
type: boolean
default: false
skip_multiarch:
description: "Skip multi-architecture build for faster CI"
required: false
type: boolean
default: false
push:
branches: [ "main" ]
branches: ["main"]

jobs:
build_and_push_extproc:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
strategy:
matrix:
image: [extproc, llm-katan]
fail-fast: false # Continue building other images if one fails

steps:
- name: Check out the repo
uses: actions/checkout@v4

- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3

- name: Set up QEMU
uses: docker/setup-qemu-action@v3

- name: Log in to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}

- name: Generate date tag for nightly builds
id: date
if: inputs.is_nightly == true
run: echo "date_tag=$(date +'%Y%m%d')" >> $GITHUB_OUTPUT

- name: Set lowercase repository owner
run: echo "REPOSITORY_OWNER_LOWER=$(echo $GITHUB_REPOSITORY_OWNER | tr '[:upper:]' '[:lower:]')" >> $GITHUB_ENV

- name: Build and push extproc Docker image
uses: docker/build-push-action@v5
with:
context: .
file: ./Dockerfile.extproc
platforms: linux/amd64,linux/arm64
push: ${{ github.event_name != 'pull_request' }} # Only push on merge to main, not on PRs
tags: |
${{ inputs.is_nightly == true && format('ghcr.io/{0}/semantic-router/extproc:nightly-{1}', env.REPOSITORY_OWNER_LOWER, steps.date.outputs.date_tag) || format('ghcr.io/{0}/semantic-router/extproc:{1}', env.REPOSITORY_OWNER_LOWER, github.sha) }}
${{ inputs.is_nightly != true && format('ghcr.io/{0}/semantic-router/extproc:latest', env.REPOSITORY_OWNER_LOWER) || '' }}

build_and_push_llm_katan:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write

steps:
- name: Check out the repo
uses: actions/checkout@v4

- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3

- name: Set up QEMU
uses: docker/setup-qemu-action@v3

- name: Log in to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}

- name: Generate date tag for nightly builds
id: date
if: inputs.is_nightly == true
run: echo "date_tag=$(date +'%Y%m%d')" >> $GITHUB_OUTPUT

- name: Set lowercase repository owner
run: echo "REPOSITORY_OWNER_LOWER=$(echo $GITHUB_REPOSITORY_OWNER | tr '[:upper:]' '[:lower:]')" >> $GITHUB_ENV

- name: Extract version from pyproject.toml
id: version
run: |
VERSION=$(grep '^version = ' e2e-tests/llm-katan/pyproject.toml | sed 's/version = "\(.*\)"/\1/')
echo "version=$VERSION" >> $GITHUB_OUTPUT
- name: Check out the repo
uses: actions/checkout@v4

- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3

- name: Set up QEMU (only for multi-arch builds)
if: inputs.skip_multiarch != true
uses: docker/setup-qemu-action@v3
with:
platforms: arm64

- name: Log in to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}

- name: Generate date tag for nightly builds
id: date
if: inputs.is_nightly == true
run: echo "date_tag=$(date +'%Y%m%d')" >> $GITHUB_OUTPUT

- name: Set lowercase repository owner
run: echo "REPOSITORY_OWNER_LOWER=$(echo $GITHUB_REPOSITORY_OWNER | tr '[:upper:]' '[:lower:]')" >> $GITHUB_ENV

# Rust build cache for extproc - only use GitHub Actions cache for non-PR builds
- name: Cache Rust dependencies (extproc only)
if: matrix.image == 'extproc' && github.event_name != 'pull_request'
uses: actions/cache@v4
with:
path: |
~/.cargo/bin/
~/.cargo/registry/index/
~/.cargo/registry/cache/
~/.cargo/git/db/
candle-binding/target/
key: ${{ runner.os }}-cargo-extproc-${{ hashFiles('**/Cargo.lock', '**/Cargo.toml') }}
restore-keys: |
${{ runner.os }}-cargo-extproc-

# Set build context and dockerfile based on matrix
- name: Set build parameters
id: build-params
run: |
if [ "${{ matrix.image }}" = "extproc" ]; then
echo "context=." >> $GITHUB_OUTPUT
echo "dockerfile=./Dockerfile.extproc" >> $GITHUB_OUTPUT
echo "platforms=${{ inputs.skip_multiarch == true && 'linux/amd64' || 'linux/amd64,linux/arm64' }}" >> $GITHUB_OUTPUT
elif [ "${{ matrix.image }}" = "llm-katan" ]; then
echo "context=./e2e-tests/llm-katan" >> $GITHUB_OUTPUT
echo "dockerfile=./e2e-tests/llm-katan/Dockerfile" >> $GITHUB_OUTPUT
echo "platforms=${{ inputs.skip_multiarch == true && 'linux/amd64' || 'linux/amd64,linux/arm64' }}" >> $GITHUB_OUTPUT
fi

# Extract version for llm-katan
- name: Extract version from pyproject.toml
id: version
if: matrix.image == 'llm-katan'
run: |
VERSION=$(grep '^version = ' e2e-tests/llm-katan/pyproject.toml | sed 's/version = "\(.*\)"/\1/')
echo "version=$VERSION" >> $GITHUB_OUTPUT

# Generate tags for extproc
- name: Generate extproc tags
id: extproc-tags
if: matrix.image == 'extproc'
run: |
REPO_LOWER=$(echo $GITHUB_REPOSITORY_OWNER | tr '[:upper:]' '[:lower:]')
if [ "${{ inputs.is_nightly }}" = "true" ]; then
echo "tags=ghcr.io/${REPO_LOWER}/semantic-router/extproc:nightly-${{ steps.date.outputs.date_tag }}" >> $GITHUB_OUTPUT
else
if [ "${{ github.event_name }}" != "pull_request" ]; then
echo "tags=ghcr.io/${REPO_LOWER}/semantic-router/extproc:${{ github.sha }},ghcr.io/${REPO_LOWER}/semantic-router/extproc:latest" >> $GITHUB_OUTPUT
else
echo "tags=ghcr.io/${REPO_LOWER}/semantic-router/extproc:${{ github.sha }}" >> $GITHUB_OUTPUT
fi
fi

# Generate tags for llm-katan
- name: Generate llm-katan tags
id: llm-katan-tags
if: matrix.image == 'llm-katan'
run: |
REPO_LOWER=$(echo $GITHUB_REPOSITORY_OWNER | tr '[:upper:]' '[:lower:]')
if [ "${{ inputs.is_nightly }}" = "true" ]; then
echo "tags=ghcr.io/${REPO_LOWER}/semantic-router/llm-katan:nightly-${{ steps.date.outputs.date_tag }}" >> $GITHUB_OUTPUT
else
if [ "${{ github.event_name }}" != "pull_request" ]; then
echo "tags=ghcr.io/${REPO_LOWER}/semantic-router/llm-katan:${{ github.sha }},ghcr.io/${REPO_LOWER}/semantic-router/llm-katan:latest,ghcr.io/${REPO_LOWER}/semantic-router/llm-katan:v${{ steps.version.outputs.version }}" >> $GITHUB_OUTPUT
else
echo "tags=ghcr.io/${REPO_LOWER}/semantic-router/llm-katan:${{ github.sha }}" >> $GITHUB_OUTPUT
fi
fi

- name: Build and push ${{ matrix.image }} Docker image
uses: docker/build-push-action@v5
with:
context: ${{ steps.build-params.outputs.context }}
file: ${{ steps.build-params.outputs.dockerfile }}
platforms: ${{ steps.build-params.outputs.platforms }}
push: ${{ github.event_name != 'pull_request' }}
load: ${{ github.event_name == 'pull_request' }}
tags: ${{ matrix.image == 'extproc' && steps.extproc-tags.outputs.tags || steps.llm-katan-tags.outputs.tags }}
build-args: |
BUILDKIT_INLINE_CACHE=1

- name: Build and push llm-katan Docker image
uses: docker/build-push-action@v5
with:
context: ./e2e-tests/llm-katan
file: ./e2e-tests/llm-katan/Dockerfile
platforms: linux/amd64,linux/arm64
push: ${{ github.event_name != 'pull_request' }} # Only push on merge to main, not on PRs
tags: |
${{ inputs.is_nightly == true && format('ghcr.io/{0}/semantic-router/llm-katan:nightly-{1}', env.REPOSITORY_OWNER_LOWER, steps.date.outputs.date_tag) || format('ghcr.io/{0}/semantic-router/llm-katan:{1}', env.REPOSITORY_OWNER_LOWER, github.sha) }}
${{ inputs.is_nightly != true && format('ghcr.io/{0}/semantic-router/llm-katan:latest', env.REPOSITORY_OWNER_LOWER) || '' }}
${{ inputs.is_nightly != true && format('ghcr.io/{0}/semantic-router/llm-katan:v{1}', env.REPOSITORY_OWNER_LOWER, steps.version.outputs.version) || '' }}
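The version extraction and tag selection in the new "Extract version from pyproject.toml" and "Generate ... tags" steps can be tried locally without GitHub Actions. The following is a minimal sketch under stated assumptions: `extract_version` and `gen_tags` are hypothetical helper names that do not exist in the workflow, and the positional arguments stand in for `GITHUB_REPOSITORY_OWNER`, `matrix.image`, `inputs.is_nightly`, `github.event_name`, `github.sha`, the date tag, and the extracted version.

```shell
#!/usr/bin/env bash
# Sketch only: extract_version and gen_tags are hypothetical helper names,
# not part of the workflow. They mirror its grep/sed and if/else logic.

# Print the version from a pyproject.toml-style file (same grep/sed as the workflow).
extract_version() {
  grep '^version = ' "$1" | sed 's/version = "\(.*\)"/\1/'
}

# Print the comma-separated tag list for one image.
# Args: owner image is_nightly event_name sha date_tag [version]
gen_tags() {
  # Registry paths must be lowercase, so normalize the owner first.
  owner=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  base="ghcr.io/${owner}/semantic-router/$2"
  if [ "$3" = "true" ]; then
    echo "${base}:nightly-$6"
  elif [ "$4" != "pull_request" ]; then
    tags="${base}:$5,${base}:latest"
    # Only llm-katan passes a version; extproc omits the seventh argument.
    if [ -n "${7:-}" ]; then
      tags="${tags},${base}:v$7"
    fi
    echo "$tags"
  else
    echo "${base}:$5"
  fi
}
```

For example, `gen_tags Example-Org extproc false pull_request abc123 20250101` prints only the SHA tag, which matches the pull-request branch of the workflow step. The owner `Example-Org` and the SHA are placeholders.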
105 changes: 105 additions & 0 deletions .github/workflows/fast-build.yml
@@ -0,0 +1,105 @@
name: Fast Build (Development)

on:
workflow_call: # Allow being called by other workflows
workflow_dispatch:
inputs:
image_type:
description: "Which image to build"
required: true
type: choice
options:
- extproc
- llm-katan
- both
default: "extproc"

jobs:
fast-build:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
strategy:
matrix:
image: [extproc] # Default to extproc for fast builds
fail-fast: false

steps:
- name: Check out the repo
uses: actions/checkout@v4

- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
with:
driver-opts: network=host

- name: Log in to GitHub Container Registry
if: github.event_name == 'workflow_dispatch'
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}

# Cache Rust dependencies for extproc builds
- name: Cache Rust dependencies
if: matrix.image == 'extproc'
uses: actions/cache@v4
with:
path: |
~/.cargo/bin/
~/.cargo/registry/index/
~/.cargo/registry/cache/
~/.cargo/git/db/
candle-binding/target/
key: ${{ runner.os }}-fast-cargo-${{ hashFiles('**/Cargo.lock') }}
restore-keys: |
${{ runner.os }}-fast-cargo-

- name: Set build parameters
id: params
run: |
if [ "${{ matrix.image }}" = "extproc" ]; then
echo "context=." >> $GITHUB_OUTPUT
echo "dockerfile=./Dockerfile.extproc" >> $GITHUB_OUTPUT
else
echo "context=./e2e-tests/llm-katan" >> $GITHUB_OUTPUT
echo "dockerfile=./e2e-tests/llm-katan/Dockerfile" >> $GITHUB_OUTPUT
fi
echo "repo_lower=$(echo $GITHUB_REPOSITORY_OWNER | tr '[:upper:]' '[:lower:]')" >> $GITHUB_OUTPUT

- name: Build ${{ matrix.image }} (AMD64 only)
uses: docker/build-push-action@v5
with:
context: ${{ steps.params.outputs.context }}
file: ${{ steps.params.outputs.dockerfile }}
platforms: linux/amd64
push: false # Don't push for fast builds
load: true # Load to local Docker for testing
tags: |
semantic-router/${{ matrix.image }}:dev
ghcr.io/${{ steps.params.outputs.repo_lower }}/semantic-router/${{ matrix.image }}:dev-${{ github.sha }}

- name: Test image
run: |
echo "Testing ${{ matrix.image }} image..."
if [ "${{ matrix.image }}" = "extproc" ]; then
# Basic smoke test for extproc
docker run --rm semantic-router/extproc:dev /app/extproc-server --help || echo "Help command test passed"
else
# Basic smoke test for llm-katan
docker run --rm semantic-router/llm-katan:dev python --version
fi

- name: Push development image (on manual trigger)
if: github.event_name == 'workflow_dispatch' && github.event.inputs.image_type != null
uses: docker/build-push-action@v5
with:
context: ${{ steps.params.outputs.context }}
file: ${{ steps.params.outputs.dockerfile }}
platforms: linux/amd64
push: true
tags: |
ghcr.io/${{ steps.params.outputs.repo_lower }}/semantic-router/${{ matrix.image }}:dev-${{ github.sha }}
ghcr.io/${{ steps.params.outputs.repo_lower }}/semantic-router/${{ matrix.image }}:dev-latest
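In the "Test image" step, the extproc smoke test ends with `|| echo "Help command test passed"`, so the step reports success even when the container command fails. If a failing smoke test should fail the job, one option is a small wrapper that propagates the exit status. This is a sketch, not the PR's implementation: `smoke_test` is a hypothetical helper name, and whether `/app/extproc-server --help` actually exits with status 0 is an assumption that would need checking.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a smoke-test command and propagate its exit status
# instead of masking a failure with "|| echo".
# Args: label command [args...]
smoke_test() {
  name=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "${name}: smoke test passed"
  else
    status=$?   # exit status of the command tested by the if
    echo "${name}: smoke test failed (exit ${status})" >&2
    return "$status"
  fi
}
```

In the workflow this could be called as `smoke_test extproc docker run --rm semantic-router/extproc:dev /app/extproc-server --help`, using the image tag and binary path already present in the step.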
16 changes: 11 additions & 5 deletions .github/workflows/test-and-build.yml
@@ -74,7 +74,6 @@ jobs:
run: |
pip install -U "huggingface_hub[cli]" hf_transfer


- name: Download models (minimal on PRs)
env:
CI_MINIMAL_MODELS: ${{ github.event_name == 'pull_request' }}
@@ -103,12 +102,19 @@ jobs:
run: |
echo "::error::Test and build failed. Check the workflow run for details."

# Trigger Docker publishing on successful nightly runs
# Trigger fast build for PRs, full publish for other events
fast-build-pr:
needs: test-and-build
if: success() && github.event_name == 'pull_request'
uses: ./.github/workflows/fast-build.yml

# Trigger Docker publishing on successful non-PR runs
publish-docker:
needs: test-and-build
if: success() && github.event_name == 'schedule'
if: success() && github.event_name != 'pull_request'
uses: ./.github/workflows/docker-publish.yml
with:
tag_suffix: nightly-$(date +'%Y%m%d')
is_nightly: true
tag_suffix: ${{ github.event_name == 'schedule' && format('nightly-{0}', github.run_id) || '' }}
is_nightly: ${{ github.event_name == 'schedule' }}
skip_multiarch: false
secrets: inherit
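The routing introduced by the two `if:` conditions above can be summarized as a decision table: pull requests go to the fast build, and every other event goes to the full publish workflow, with `is_nightly` set only for scheduled runs. The sketch below restates that table in shell. `route_event` is a hypothetical name used for illustration only, and the sketch assumes the `success()` part of each condition already holds.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print which reusable workflow test-and-build.yml would
# call for a given event name, and the is_nightly value it would pass.
# Assumes the test-and-build job succeeded.
route_event() {
  case "$1" in
    pull_request) echo "fast-build.yml" ;;
    schedule)     echo "docker-publish.yml is_nightly=true" ;;
    *)            echo "docker-publish.yml is_nightly=false" ;;
  esac
}
```

Note that the new `tag_suffix` for scheduled runs is built from `github.run_id`, not from a date, so it no longer matches the `nightly-YYYYMMDD` form produced by the date step inside docker-publish.yml.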