Skip to content

openobserve/o2-envoy-gateway

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Envoy Gateway - Multi-Cloud Deployment Guide

This directory contains deployment resources for Envoy Gateway (as a modern replacement for nginx ingress) and AI Gateway extension to enable AI capabilities for OpenObserve SRE Agent. Envoy Gateway provides HTTP routing, load balancing, and traffic management, while AI Gateway adds LLM provider integration and AI-specific features.

🌐 Multi-Cloud Support

This repository supports AWS EKS and Azure AKS deployments. Cloud-specific load balancer configurations are included as commented annotations in the base files - simply uncomment the annotations for your cloud provider before deploying.

Directory Structure

.
├── README.md
├── base/                    # Multi-cloud base configurations
│   ├── kustomization.yaml
│   ├── gatewayclass.yaml
│   ├── gateway.yaml
│   ├── gateway-config.yaml   # Includes commented AWS Bedrock IRSA config
│   ├── envoyproxy.yaml       # Includes commented AWS & Azure annotations
│   ├── traffic-policies.yaml
│   ├── letsencrypt-issuer.yaml
│   ├── aiProviders/          # LLM provider configurations
│   │   ├── providers/        # Individual provider configs
│   │   ├── cloud-providers.yaml
│   │   ├── major-providers.yaml
│   │   └── alternative-providers.yaml
│   └── policies/             # Traffic and security policies
├── examples/
│   ├── aws-bedrock-irsa.yaml
│   └── gcp-vertex-ai.yaml
├── monitoring/               # Monitoring configs
└── docs/                     # Additional documentation

Prerequisites

  1. Kubernetes cluster with kubectl access
    • AWS EKS cluster for AWS deployment
    • Azure AKS cluster for Azure deployment
  2. Envoy Gateway operator installed (see Installation section)
  3. Envoy AI Gateway CRDs installed
  4. Namespace created (e.g., envoy-gateway-system)
  5. kubectl installed

Installation Steps

1. Deploy Envoy Gateway and AI Gateway

Basic Installation (Recommended for Getting Started)

# Step 1: Install Envoy Gateway (includes Envoy CRDs)
helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
  --version v0.0.0-latest \
  --namespace envoy-gateway-system \
  --create-namespace \
  -f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/manifests/envoy-gateway-values.yaml

# Step 2: Wait for Envoy Gateway to be ready
kubectl wait --timeout=2m -n envoy-gateway-system deployment/envoy-gateway --for=condition=Available

# Step 3: Install AI Gateway CRDs
helm upgrade -i aieg-crd oci://docker.io/envoyproxy/ai-gateway-crds-helm \
  --version v0.0.0-latest \
  --namespace envoy-ai-gateway-system \
  --create-namespace

# Step 4: Install AI Gateway Resources
helm upgrade -i aieg oci://docker.io/envoyproxy/ai-gateway-helm \
  --version v0.0.0-latest \
  --namespace envoy-ai-gateway-system \
  --create-namespace

# Step 5: Wait for AI Gateway controller to be ready
kubectl wait --timeout=2m -n envoy-ai-gateway-system deployment/ai-gateway-controller --for=condition=Available

Advanced Installation (With Rate Limiting and InferencePool)

# Install Envoy Gateway with additional features
helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
  --version v0.0.0-latest \
  --namespace envoy-gateway-system \
  --create-namespace \
  -f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/manifests/envoy-gateway-values.yaml \
  -f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/examples/token_ratelimit/envoy-gateway-values-addon.yaml \
  -f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/examples/inference-pool/envoy-gateway-values-addon.yaml

# Then continue with steps 2-5 from Basic Installation

Verify Installation

# Check Envoy Gateway status
kubectl get deployments -n envoy-gateway-system

# Check AI Gateway status
kubectl get deployments -n envoy-ai-gateway-system

# Check all CRDs are installed
kubectl get crds | grep -E "(envoyproxy|aigateway)"

2. Configure for Your Cloud Provider

Before deploying, edit base/envoyproxy.yaml to uncomment the annotations for your cloud provider:

🔷 AWS EKS Configuration

  1. Open base/envoyproxy.yaml
  2. Find the # ==================== AWS EKS Configuration ==================== sections
  3. Uncomment the AWS annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"  # for external
  4. (Optional) For SSL/TLS termination, uncomment ACM certificate annotations
  5. (Optional) For AWS Bedrock IRSA, uncomment the configuration in base/gateway-config.yaml

🔵 Azure AKS Configuration

⚠️ IMPORTANT: Before deploying internal load balancers on Azure, grant the AKS managed identity Network Contributor permissions on your VNet:

# Set your values
RESOURCE_GROUP="your-resource-group"
VNET_NAME="your-vnet-name"
AKS_CLUSTER="your-aks-cluster-name"

# Get the AKS managed identity
MANAGED_IDENTITY=$(az aks show \
    --resource-group $RESOURCE_GROUP \
    --name $AKS_CLUSTER \
    --query identityProfile.kubeletidentity.clientId -o tsv)

# Get your subscription ID
SUBSCRIPTION_ID=$(az account show --query id -o tsv)

# Grant Network Contributor role on the VNet
az role assignment create \
    --assignee $MANAGED_IDENTITY \
    --role "Network Contributor" \
    --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Network/virtualNetworks/$VNET_NAME"

Then configure the manifests:

  1. Open base/envoyproxy.yaml
  2. Find the # ==================== Azure AKS Configuration ==================== sections
  3. Uncomment the Azure annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"  # for internal
  4. (Optional) For static IPs or DNS labels, uncomment those annotations
  5. (Note) Azure Load Balancer does NOT support SSL/TLS termination - terminate SSL in Envoy or use Application Gateway

3. Deploy AI Gateway Resources

Option 1: Deploy with Kustomize (Recommended)

# Deploy all resources
kubectl apply -k base/

# Verify deployment
kubectl get gateway,envoyproxy,aiservicebackend -n envoy-gateway-system

Option 2: Deploy Manually

# Step 1: Core Gateway components
kubectl apply -f base/gatewayclass.yaml
kubectl apply -f base/envoyproxy.yaml -n envoy-gateway-system
kubectl apply -f base/gateway-config.yaml -n envoy-gateway-system
kubectl apply -f base/gateway.yaml -n envoy-gateway-system

# Step 2: Deploy LLM provider backends
kubectl apply -f base/aiProviders/major-providers.yaml -n envoy-gateway-system

# Step 3: Traffic policies (optional but recommended)
kubectl apply -f base/traffic-policies.yaml -n envoy-gateway-system

# Step 4: TLS/SSL certificate (optional, for HTTPS)
kubectl apply -f base/letsencrypt-issuer.yaml -n envoy-gateway-system

4. Verify Deployment

# Check Gateway status
kubectl get gateway -n envoy-gateway-system

# Check EnvoyProxy resources
kubectl get envoyproxy -n envoy-gateway-system

# Check if Envoy pods are created
kubectl get pods -n envoy-gateway-system

# Check backends
kubectl get aiservicebackend -n envoy-gateway-system

# Check services and load balancers
kubectl get svc -n envoy-gateway-system

5. Get Load Balancer Endpoints

For AWS EKS:

# Get internal load balancer endpoint
kubectl get svc -n envoy-gateway-system envoy-eg-internal-proxy \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'

# Get external load balancer endpoint
kubectl get svc -n envoy-gateway-system envoy-eg-external-proxy \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'

For Azure AKS:

# Get internal load balancer IP
kubectl get svc -n envoy-gateway-system envoy-eg-internal-proxy \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

# Get external load balancer IP
kubectl get svc -n envoy-gateway-system envoy-eg-external-proxy \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

6. Configure SRE Agent

Update your SRE agent deployment with:

env:
  - name: O2_AI_GATEWAY_ENABLED
    value: "true"
  - name: O2_AI_GATEWAY_URL
    value: "http://ai-gateway.envoy-gateway-system.svc.cluster.local:80"
  - name: O2_AI_MODEL
    value: "gpt-4o"  # or any supported model
  - name: O2_AI_API_KEY
    valueFrom:
      secretKeyRef:
        name: openobserve-ai-key
        key: apiKey

Supported LLM Providers

Provider Model Example Authentication
OpenAI gpt-4o API Key
Anthropic claude-sonnet-4-5-20250929 API Key
AWS Bedrock anthropic.claude-3-sonnet-* IRSA (IAM Role for Service Accounts)
DeepSeek deepseek-chat API Key
Grok (xAI) grok-4-1-fast-reasoning API Key
Azure OpenAI azure-gpt-4 API Key or Azure AD Service Principal
Gemini (text) gemini-3-flash-preview API Key
Google Vertex AI Any Gemini model GCP Service Account JSON

Cloud Provider Comparison

Feature AWS EKS Azure AKS
Load Balancer Type Network Load Balancer (NLB) Azure Load Balancer (Standard SKU)
Internal LB Annotation aws-load-balancer-internal: "true" azure-load-balancer-internal: "true"
LB Address Type DNS hostname IP address
SSL/TLS Termination ACM certificates Azure Key Vault or cert-manager
Cloud AI Service AWS Bedrock with IRSA Azure OpenAI with Managed Identity
Identity Management IAM Roles for Service Accounts (IRSA) Workload Identity
Cross-zone/region Cross-zone load balancing Zone redundancy (automatic)

Customization

Change Namespace

Update the namespace in the kustomization file:

# Edit base/kustomization.yaml
sed -i 's/namespace: envoy-gateway-system/namespace: your-namespace/g' base/kustomization.yaml

Cloud Provider Annotations

All cloud-specific load balancer annotations are included as comments in base/envoyproxy.yaml. Simply uncomment the annotations you need for your cloud provider:

  • AWS EKS: NLB annotations, ACM certificates, cross-zone load balancing
  • Azure AKS: Azure Load Balancer annotations, static IPs, DNS labels

Troubleshooting

Gateway not creating pods

Check EnvoyProxy status:

kubectl describe envoyproxy openobserve-ai-gateway-proxy -n envoy-gateway-system

413 Payload Too Large errors

Increase buffer limits in traffic-policies.yaml

Check AI Gateway logs

kubectl logs -n envoy-gateway-system -l gateway.envoyproxy.io/owning-gateway-name=ai-gateway -c ai-gateway-extproc --tail=100

Certificate Issues with Gateway API

If you encounter certificate-related issues with Gateway API, ensure cert-manager has Gateway API support enabled:

# Check if cert-manager has Gateway API support
kubectl get deployment cert-manager -n cert-manager -o yaml | grep -i gateway

# Enable Gateway API support in cert-manager
kubectl patch deployment cert-manager -n cert-manager --type='json' \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--enable-gateway-api"}]'

# Wait for cert-manager to restart
kubectl rollout status deployment cert-manager -n cert-manager --timeout=60s

# Verify the flag was added
kubectl get deployment cert-manager -n cert-manager -o yaml | grep "enable-gateway-api"

Note: If the patch fails with "already exists" or similar error, Gateway API support may already be enabled or cert-manager may use a different configuration method.

Uninstallation

Remove Gateway Resources

Using Kustomize:

kubectl delete -k base/

Manual Removal:

# Remove AI Gateway resources
kubectl delete gateway --all -n envoy-gateway-system
kubectl delete envoyproxy --all -n envoy-gateway-system
kubectl delete aiservicebackend --all -n envoy-gateway-system
kubectl delete gatewayclass eg

# Uninstall AI Gateway Helm releases
helm uninstall aieg -n envoy-ai-gateway-system
helm uninstall aieg-crd -n envoy-ai-gateway-system

# Uninstall Envoy Gateway
helm uninstall eg -n envoy-gateway-system

# Delete namespaces (optional)
kubectl delete namespace envoy-ai-gateway-system
kubectl delete namespace envoy-gateway-system

Documentation

Support

For issues, consult:

About

Envoy Gateway configuration for OpenObserve - AI traffic routing, rate limiting, and observability integration

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors