Commit c4f26a8

refactor: move envoy gateway configmap to values.yaml files (#1410)
**Description**

Envoy AI Gateway currently requires the Envoy Gateway ConfigMap to be overridden. A better approach is to pass that configuration as Helm values when installing Envoy Gateway with Helm. This PR updates the documentation and installation files to do so.

**Related Issues/PRs (if applicable)**

Fixes: #1191

---------

Signed-off-by: sailesh duddupudi <[email protected]>
1 parent 8ee7ae1 commit c4f26a8

16 files changed: +311 -197 lines changed


examples/inference-pool/README.md

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+# InferencePool Example
+
+This example demonstrates how to use AI Gateway with the InferencePool feature, which enables intelligent request routing across multiple inference endpoints with load balancing and health checking capabilities.
+
+## Files in This Directory
+
+- **`envoy-gateway-values-addon.yaml`**: Envoy Gateway values addon for InferencePool support. Combine with `../../manifests/envoy-gateway-values.yaml`.
+- **`base.yaml`**: Complete example that includes Gateway, AIServiceBackend, InferencePool CRDs, and a sample application deployment.
+- **`aigwroute.yaml`**: Example AIGatewayRoute that uses InferencePool as a backend.
+- **`httproute.yaml`**: Example HTTPRoute for traditional HTTP routing to InferencePool endpoints.
+- **`with-annotations.yaml`**: Advanced example showing InferencePool with Kubernetes annotations for fine-grained control.
+
+## Quick Start
+
+1. Install Envoy Gateway with InferencePool support:
+
+   ```bash
+   helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
+     --version v0.0.0-latest \
+     --namespace envoy-gateway-system \
+     --create-namespace \
+     -f ../../manifests/envoy-gateway-values.yaml \
+     -f envoy-gateway-values-addon.yaml
+   ```
+
+2. Deploy the example:
+
+   ```bash
+   kubectl apply -f base.yaml
+   ```
+
+3. Test the setup:
+
+   ```bash
+   GATEWAY_HOST=$(kubectl get gateway/ai-gateway -o jsonpath='{.status.addresses[0].value}')
+   curl -X POST "http://${GATEWAY_HOST}/v1/chat/completions" \
+     -H "Content-Type: application/json" \
+     -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello!"}]}'
+   ```
+
+### Combining with Other Features
+
+You can easily combine InferencePool with other features using multiple `-f` flags:
+
+```bash
+# InferencePool + rate limiting
+helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
+  --version v0.0.0-latest \
+  --namespace envoy-gateway-system \
+  --create-namespace \
+  -f ../basic/envoy-gateway-values.yaml \
+  -f ../token_ratelimit/envoy-gateway-values-addon.yaml \
+  -f envoy-gateway-values-addon.yaml
+```
+
+For detailed documentation, see the [AI Gateway documentation](https://gateway.envoyproxy.io/ai-gateway/).
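
The layering of a base values file with one or more addons, as described in this README, can be scripted. The following is a minimal sketch rather than part of the commit: the helper name `eg_values_args` is hypothetical, and the script only prints the Helm command instead of running it. It relies on Helm's documented behavior that the right-most `-f` file takes priority when the same key appears in several files, so the base file goes first and addons follow. File paths containing spaces are not handled.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): turn a list of values files
# into repeated -f flags, preserving the order given on the command line.
eg_values_args() {
  out=""
  for f in "$@"; do
    out="${out:+$out }-f $f"
  done
  printf '%s\n' "$out"
}

# Dry run: print the command instead of executing it.
echo "helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm" \
  "--version v0.0.0-latest" \
  "--namespace envoy-gateway-system --create-namespace" \
  "$(eg_values_args ../../manifests/envoy-gateway-values.yaml envoy-gateway-values-addon.yaml)"
```

Printing first makes it easy to review the file order before anything is applied to a cluster.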

examples/inference-pool/config.yaml

Lines changed: 0 additions & 73 deletions
This file was deleted.
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+# Copyright Envoy AI Gateway Authors
+# SPDX-License-Identifier: Apache-2.0
+# The full text of the Apache license is available in the LICENSE file at
+# the root of the repo.
+
+# This addon file adds InferencePool support to Envoy Gateway.
+# Use this in combination with the base envoy-gateway-values.yaml:
+#
+#   helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
+#     --version v0.0.0-latest \
+#     --namespace envoy-gateway-system \
+#     --create-namespace \
+#     -f ../../manifests/envoy-gateway-values.yaml \
+#     -f envoy-gateway-values-addon.yaml
+#
+# You can also combine with rate limiting:
+#   -f ../../manifests/envoy-gateway-values.yaml \
+#   -f ../token_ratelimit/envoy-gateway-values-addon.yaml \
+#   -f envoy-gateway-values-addon.yaml
+
+config:
+  envoyGateway:
+    extensionManager:
+      # Enable InferencePool custom resource support
+      backendResources:
+        - group: inference.networking.k8s.io
+          kind: InferencePool
+          version: v1
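
The addon registers `InferencePool` under group `inference.networking.k8s.io` at version `v1`, so it may be worth confirming that the cluster's CRD defines that version. The sketch below is an illustration under assumptions: the function name `has_version` is hypothetical, the version list is made up for the offline demonstration, and the CRD name in the trailing comment is inferred from the group and kind rather than stated in the commit.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the wanted version ($2) appears in a
# space-separated list of version names ($1).
has_version() {
  for v in $1; do
    [ "$v" = "$2" ] && return 0
  done
  return 1
}

# Offline demonstration with a made-up version list.
if has_version "v1alpha2 v1" "v1"; then
  echo "v1 is defined"
fi

# Against a real cluster, the list could come from (assumed CRD name):
#   kubectl get crd inferencepools.inference.networking.k8s.io \
#     -o jsonpath='{.spec.versions[*].name}'
```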

examples/token_ratelimit/README.md

Lines changed: 49 additions & 0 deletions
@@ -1,4 +1,53 @@
+# Token based ratelimiting
+
 This example demonstrates how to use the token rate limit feature of the AI Gateway.
 This utilizes the Global Rate Limit API of Envoy Gateway combined with the
 AI Gateway's `llmRequestCosts` configuration to capture the consumed tokens
 of each request.
+
+## Files in This Directory
+
+- **`envoy-gateway-values-addon.yaml`**: Envoy Gateway values addon for rate limiting. Combine with `../../manifests/envoy-gateway-values.yaml`.
+- **`redis.yaml`**: Redis deployment required for rate limiting. Deploy this before enabling rate limiting in Envoy Gateway.
+- **`token_ratelimit.yaml`**: Example AIGatewayRoute configuration that demonstrates token-based rate limiting.
+
+## Quick Start
+
+1. Install Envoy Gateway with base configuration + rate limiting addon:
+
+   ```bash
+   helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
+     --version v0.0.0-latest \
+     --namespace envoy-gateway-system \
+     --create-namespace \
+     -f ../../manifests/envoy-gateway-values.yaml \
+     -f envoy-gateway-values-addon.yaml
+   ```
+
+2. Deploy Redis:
+
+   ```bash
+   kubectl apply -f redis.yaml
+   ```
+
+3. Apply the token rate limit example:
+   ```bash
+   kubectl apply -f token_ratelimit.yaml
+   ```
+
+### Combining with Other Features
+
+You can easily combine rate limiting with other features using multiple `-f` flags:
+
+```bash
+# Rate limiting + InferencePool support
+helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
+  --version v0.0.0-latest \
+  --namespace envoy-gateway-system \
+  --create-namespace \
+  -f ../basic/envoy-gateway-values.yaml \
+  -f envoy-gateway-values-addon.yaml \
+  -f ../inference-pool/envoy-gateway-values-addon.yaml
+```
+
+For detailed documentation, see the [usage-based rate limiting guide](https://gateway.envoyproxy.io/ai-gateway/docs/capabilities/traffic/usage-based-ratelimiting).
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+# Copyright Envoy AI Gateway Authors
+# SPDX-License-Identifier: Apache-2.0
+# The full text of the Apache license is available in the LICENSE file at
+# the root of the repo.
+
+# This addon file adds rate limiting configuration to Envoy Gateway.
+# Use this in combination with the base envoy-gateway-values.yaml:
+#
+#   helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
+#     --version v0.0.0-latest \
+#     --namespace envoy-gateway-system \
+#     --create-namespace \
+#     -f ../../manifests/envoy-gateway-values.yaml \
+#     -f envoy-gateway-values-addon.yaml
+#
+# Prerequisites:
+# - Redis must be deployed (see redis.yaml in this directory)
+
+config:
+  envoyGateway:
+    provider:
+      kubernetes:
+        rateLimitDeployment:
+          patch:
+            type: StrategicMerge
+            value:
+              spec:
+                template:
+                  spec:
+                    containers:
+                      - imagePullPolicy: IfNotPresent
+                        name: envoy-ratelimit
+                        image: docker.io/envoyproxy/ratelimit:60d8e81b
+    rateLimit:
+      backend:
+        type: Redis
+        redis:
+          # Update this URL to match your Redis service location
+          # This assumes Redis is deployed using the redis.yaml in this directory
+          url: redis.redis-system.svc.cluster.local:6379
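
Because the Redis URL has to be updated to match the actual service location, it can help to split it into its parts before checking the cluster. This is a minimal sketch using POSIX parameter expansion. The variable names are illustrative, and it assumes the URL follows the `<service>.<namespace>.svc.cluster.local:<port>` shape used in the addon.

```shell
#!/bin/sh
# Illustrative only: split the rate limit backend URL into its parts.
redis_url="redis.redis-system.svc.cluster.local:6379"

redis_host="${redis_url%:*}"    # everything before the last colon
redis_port="${redis_url##*:}"   # everything after the last colon
redis_svc="${redis_host%%.*}"   # first DNS label: the Service name
redis_ns="${redis_host#*.}"     # drop the Service name...
redis_ns="${redis_ns%%.*}"      # ...and keep the next label: the namespace

echo "service=${redis_svc} namespace=${redis_ns} port=${redis_port}"

# A matching check against a cluster might then be:
#   kubectl get service "$redis_svc" -n "$redis_ns"
```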

examples/token_ratelimit/redis.yaml

Lines changed: 5 additions & 1 deletion
@@ -4,7 +4,11 @@
 # the root of the repo.
 
 # This is a simple example of a Redis deployment that is used
-# by the default Envoy Gateway setting in config.yaml. TODO: modify this comment when https://github.com/envoyproxy/ai-gateway/issues/1191 is fixed.
+# by the Envoy Gateway rate limiting feature.
+#
+# This is only necessary if you want to use the rate limit feature.
+# When enabling rate limiting, you need to configure Envoy Gateway to point to this Redis instance.
+# See the envoy-gateway-values-addon.yaml file in this directory for the complete configuration example.
 ---
 kind: Namespace
 apiVersion: v1

manifests/envoy-gateway-config/config.yaml

Lines changed: 0 additions & 69 deletions
This file was deleted.
Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
+# Copyright Envoy AI Gateway Authors
+# SPDX-License-Identifier: Apache-2.0
+# The full text of the Apache license is available in the LICENSE file at
+# the root of the repo.
+
+# This file contains the base Envoy Gateway helm values needed for AI Gateway integration.
+# This is the minimal configuration that all AI Gateway deployments need.
+#
+# Use this file when installing Envoy Gateway with:
+#   helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
+#     --version v0.0.0-latest \
+#     --namespace envoy-gateway-system \
+#     --create-namespace \
+#     -f envoy-gateway-values.yaml
+#
+# For additional features, combine with addon values files:
+#   -f envoy-gateway-values.yaml -f examples/token_ratelimit/envoy-gateway-values-addon.yaml
+#   -f envoy-gateway-values.yaml -f examples/inference-pool/envoy-gateway-values-addon.yaml
+
+config:
+  envoyGateway:
+    gateway:
+      controllerName: gateway.envoyproxy.io/gatewayclass-controller
+    logging:
+      level:
+        default: info
+    provider:
+      type: Kubernetes
+    extensionApis:
+      # Not strictly required, but recommended for backward/future compatibility.
+      enableEnvoyPatchPolicy: true
+      # Required: Enable Backend API for AI service backends.
+      enableBackend: true
+    # Required: AI Gateway needs to fine-tune xDS resources generated by Envoy Gateway.
+    extensionManager:
+      hooks:
+        xdsTranslator:
+          translation:
+            listener:
+              includeAll: true
+            route:
+              includeAll: true
+            cluster:
+              includeAll: true
+            secret:
+              includeAll: true
+          post:
+            - Translation
+            - Cluster
+            - Route
+      service:
+        fqdn:
+          # IMPORTANT: Update this to match your AI Gateway controller service
+          # Format: <service-name>.<namespace>.svc.cluster.local
+          # Default if you followed the installation steps above:
+          hostname: ai-gateway-controller.envoy-ai-gateway-system.svc.cluster.local
+          port: 1063
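
The `hostname` value must follow the `<service-name>.<namespace>.svc.cluster.local` format noted in the comments, so a quick shape check before installing can catch a mistyped value. The function below is a hypothetical sketch, not something shipped with the chart. It assumes the default `cluster.local` cluster domain used in the file and only checks the shape of the name, not whether the Service exists.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 has the shape
# <service-name>.<namespace>.svc.cluster.local
# On success, $svc and $ns hold the extracted labels.
is_service_fqdn() {
  rest="${1%.svc.cluster.local}"
  [ "$rest" != "$1" ] || return 1      # required suffix is missing
  svc="${rest%%.*}"
  ns="${rest#*.}"
  [ "$ns" != "$rest" ] || return 1     # no dot: the namespace is missing
  [ -n "$svc" ] && [ -n "$ns" ] || return 1
  case "$ns" in
    *.*) return 1 ;;                   # more than two labels before .svc
  esac
  return 0
}

fqdn="ai-gateway-controller.envoy-ai-gateway-system.svc.cluster.local"
if is_service_fqdn "$fqdn"; then
  echo "ok: service=${svc} namespace=${ns}"
else
  echo "unexpected hostname format: ${fqdn}" >&2
fi
```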
