This directory contains deployment resources for Envoy Gateway (as a modern replacement for nginx ingress) and AI Gateway extension to enable AI capabilities for OpenObserve SRE Agent. Envoy Gateway provides HTTP routing, load balancing, and traffic management, while AI Gateway adds LLM provider integration and AI-specific features.
This repository supports AWS EKS and Azure AKS deployments. Cloud-specific load balancer configurations are included as commented annotations in the base files - simply uncomment the annotations for your cloud provider before deploying.
.
├── README.md
├── base/ # Multi-cloud base configurations
│ ├── kustomization.yaml
│ ├── gatewayclass.yaml
│ ├── gateway.yaml
│ ├── gateway-config.yaml # Includes commented AWS Bedrock IRSA config
│ ├── envoyproxy.yaml # Includes commented AWS & Azure annotations
│ ├── traffic-policies.yaml
│ ├── letsencrypt-issuer.yaml
│ ├── aiProviders/ # LLM provider configurations
│ │ ├── providers/ # Individual provider configs
│ │ ├── cloud-providers.yaml
│ │ ├── major-providers.yaml
│ │ └── alternative-providers.yaml
│ └── policies/ # Traffic and security policies
├── examples/
│ ├── aws-bedrock-irsa.yaml
│ └── gcp-vertex-ai.yaml
├── monitoring/ # Monitoring configs
└── docs/ # Additional documentation
- Kubernetes cluster with kubectl access
- AWS EKS cluster for AWS deployment
- Azure AKS cluster for Azure deployment
- Envoy Gateway operator installed (see Installation section)
- Envoy AI Gateway CRDs installed
- Namespace created (e.g.,
envoy-gateway-system) - kubectl installed
# Step 1: Install Envoy Gateway (includes Envoy CRDs)
helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
--version v0.0.0-latest \
--namespace envoy-gateway-system \
--create-namespace \
-f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/manifests/envoy-gateway-values.yaml
# Step 2: Wait for Envoy Gateway to be ready
kubectl wait --timeout=2m -n envoy-gateway-system deployment/envoy-gateway --for=condition=Available
# Step 3: Install AI Gateway CRDs
helm upgrade -i aieg-crd oci://docker.io/envoyproxy/ai-gateway-crds-helm \
--version v0.0.0-latest \
--namespace envoy-ai-gateway-system \
--create-namespace
# Step 4: Install AI Gateway Resources
helm upgrade -i aieg oci://docker.io/envoyproxy/ai-gateway-helm \
--version v0.0.0-latest \
--namespace envoy-ai-gateway-system \
--create-namespace
# Step 5: Wait for AI Gateway controller to be ready
kubectl wait --timeout=2m -n envoy-ai-gateway-system deployment/ai-gateway-controller --for=condition=Available# Install Envoy Gateway with additional features
helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
--version v0.0.0-latest \
--namespace envoy-gateway-system \
--create-namespace \
-f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/manifests/envoy-gateway-values.yaml \
-f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/examples/token_ratelimit/envoy-gateway-values-addon.yaml \
-f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/examples/inference-pool/envoy-gateway-values-addon.yaml
# Then continue with steps 2-5 from Basic Installation# Check Envoy Gateway status
kubectl get deployments -n envoy-gateway-system
# Check AI Gateway status
kubectl get deployments -n envoy-ai-gateway-system
# Check all CRDs are installed
kubectl get crds | grep -E "(envoyproxy|aigateway)"Before deploying, edit base/envoyproxy.yaml to uncomment the annotations for your cloud provider:
- Open base/envoyproxy.yaml
- Find the
# ==================== AWS EKS Configuration ====================sections - Uncomment the AWS annotations:
service.beta.kubernetes.io/aws-load-balancer-internal: "true" service.beta.kubernetes.io/aws-load-balancer-type: "nlb" service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing" # for external
- (Optional) For SSL/TLS termination, uncomment ACM certificate annotations
- (Optional) For AWS Bedrock IRSA, uncomment the configuration in base/gateway-config.yaml
# Set your values
RESOURCE_GROUP="your-resource-group"
VNET_NAME="your-vnet-name"
AKS_CLUSTER="your-aks-cluster-name"
# Get the AKS managed identity
MANAGED_IDENTITY=$(az aks show \
--resource-group $RESOURCE_GROUP \
--name $AKS_CLUSTER \
--query identityProfile.kubeletidentity.clientId -o tsv)
# Get your subscription ID
SUBSCRIPTION_ID=$(az account show --query id -o tsv)
# Grant Network Contributor role on the VNet
az role assignment create \
--assignee $MANAGED_IDENTITY \
--role "Network Contributor" \
--scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Network/virtualNetworks/$VNET_NAME"Then configure the manifests:
- Open base/envoyproxy.yaml
- Find the
# ==================== Azure AKS Configuration ====================sections - Uncomment the Azure annotations:
service.beta.kubernetes.io/azure-load-balancer-internal: "true" # for internal
- (Optional) For static IPs or DNS labels, uncomment those annotations
- (Note) Azure Load Balancer does NOT support SSL/TLS termination - terminate SSL in Envoy or use Application Gateway
# Deploy all resources
kubectl apply -k base/
# Verify deployment
kubectl get gateway,envoyproxy,aiservicebackend -n envoy-gateway-system# Step 1: Core Gateway components
kubectl apply -f base/gatewayclass.yaml
kubectl apply -f base/envoyproxy.yaml -n envoy-gateway-system
kubectl apply -f base/gateway-config.yaml -n envoy-gateway-system
kubectl apply -f base/gateway.yaml -n envoy-gateway-system
# Step 2: Deploy LLM provider backends
kubectl apply -f base/aiProviders/major-providers.yaml -n envoy-gateway-system
# Step 3: Traffic policies (optional but recommended)
kubectl apply -f base/traffic-policies.yaml -n envoy-gateway-system
# Step 4: TLS/SSL certificate (optional, for HTTPS)
kubectl apply -f base/letsencrypt-issuer.yaml -n envoy-gateway-system# Check Gateway status
kubectl get gateway -n envoy-gateway-system
# Check EnvoyProxy resources
kubectl get envoyproxy -n envoy-gateway-system
# Check if Envoy pods are created
kubectl get pods -n envoy-gateway-system
# Check backends
kubectl get aiservicebackend -n envoy-gateway-system
# Check services and load balancers
kubectl get svc -n envoy-gateway-systemFor AWS EKS:
# Get internal load balancer endpoint
kubectl get svc -n envoy-gateway-system envoy-eg-internal-proxy \
-o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
# Get external load balancer endpoint
kubectl get svc -n envoy-gateway-system envoy-eg-external-proxy \
-o jsonpath='{.status.loadBalancer.ingress[0].hostname}'For Azure AKS:
# Get internal load balancer IP
kubectl get svc -n envoy-gateway-system envoy-eg-internal-proxy \
-o jsonpath='{.status.loadBalancer.ingress[0].ip}'
# Get external load balancer IP
kubectl get svc -n envoy-gateway-system envoy-eg-external-proxy \
-o jsonpath='{.status.loadBalancer.ingress[0].ip}'Update your SRE agent deployment with:
env:
- name: O2_AI_GATEWAY_ENABLED
value: "true"
- name: O2_AI_GATEWAY_URL
value: "http://ai-gateway.envoy-gateway-system.svc.cluster.local:80"
- name: O2_AI_MODEL
value: "gpt-4o" # or any supported model
- name: O2_AI_API_KEY
valueFrom:
secretKeyRef:
name: openobserve-ai-key
key: apiKey| Provider | Model Example | Authentication |
|---|---|---|
| OpenAI | gpt-4o |
API Key |
| Anthropic | claude-sonnet-4-5-20250929 |
API Key |
| AWS Bedrock | anthropic.claude-3-sonnet-* |
IRSA (IAM Role for Service Accounts) |
| DeepSeek | deepseek-chat |
API Key |
| Grok (xAI) | grok-4-1-fast-reasoning |
API Key |
| Azure OpenAI | azure-gpt-4 |
API Key or Azure AD Service Principal |
| Gemini (text) | gemini-3-flash-preview |
API Key |
| Google Vertex AI | Any Gemini model | GCP Service Account JSON |
| Feature | AWS EKS | Azure AKS |
|---|---|---|
| Load Balancer Type | Network Load Balancer (NLB) | Azure Load Balancer (Standard SKU) |
| Internal LB Annotation | aws-load-balancer-internal: "true" |
azure-load-balancer-internal: "true" |
| LB Address Type | DNS hostname | IP address |
| SSL/TLS Termination | ACM certificates | Azure Key Vault or cert-manager |
| Cloud AI Service | AWS Bedrock with IRSA | Azure OpenAI with Managed Identity |
| Identity Management | IAM Roles for Service Accounts (IRSA) | Workload Identity |
| Cross-zone/region | Cross-zone load balancing | Zone redundancy (automatic) |
Update the namespace in the kustomization file:
# Edit base/kustomization.yaml
sed -i 's/namespace: envoy-gateway-system/namespace: your-namespace/g' base/kustomization.yamlAll cloud-specific load balancer annotations are included as comments in base/envoyproxy.yaml. Simply uncomment the annotations you need for your cloud provider:
- AWS EKS: NLB annotations, ACM certificates, cross-zone load balancing
- Azure AKS: Azure Load Balancer annotations, static IPs, DNS labels
Check EnvoyProxy status:
kubectl describe envoyproxy openobserve-ai-gateway-proxy -n envoy-gateway-systemIncrease buffer limits in traffic-policies.yaml
kubectl logs -n envoy-gateway-system -l gateway.envoyproxy.io/owning-gateway-name=ai-gateway -c ai-gateway-extproc --tail=100If you encounter certificate-related issues with Gateway API, ensure cert-manager has Gateway API support enabled:
# Check if cert-manager has Gateway API support
kubectl get deployment cert-manager -n cert-manager -o yaml | grep -i gateway
# Enable Gateway API support in cert-manager
kubectl patch deployment cert-manager -n cert-manager --type='json' \
-p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--enable-gateway-api"}]'
# Wait for cert-manager to restart
kubectl rollout status deployment cert-manager -n cert-manager --timeout=60s
# Verify the flag was added
kubectl get deployment cert-manager -n cert-manager -o yaml | grep "enable-gateway-api"Note: If the patch fails with "already exists" or similar error, Gateway API support may already be enabled or cert-manager may use a different configuration method.
Using Kustomize:
kubectl delete -k base/Manual Removal:
# Remove AI Gateway resources
kubectl delete gateway --all -n envoy-gateway-system
kubectl delete envoyproxy --all -n envoy-gateway-system
kubectl delete aiservicebackend --all -n envoy-gateway-system
kubectl delete gatewayclass eg
# Uninstall AI Gateway Helm releases
helm uninstall aieg -n envoy-ai-gateway-system
helm uninstall aieg-crd -n envoy-ai-gateway-system
# Uninstall Envoy Gateway
helm uninstall eg -n envoy-gateway-system
# Delete namespaces (optional)
kubectl delete namespace envoy-ai-gateway-system
kubectl delete namespace envoy-gateway-system- Architecture - Detailed multi-cloud architecture and design (if exists)
- SSL/TLS Certificates Comparison - AWS ACM vs Azure certificate options (if exists)
For issues, consult:
- Envoy Gateway docs: https://gateway.envoyproxy.io/
- Envoy AI Gateway docs: https://aigateway.envoyproxy.io/
- OpenObserve docs: https://openobserve.ai/docs/