Envoy Gateway - Multi-Cloud Deployment Guide

This directory contains deployment resources for Envoy Gateway (as a modern replacement for nginx ingress) and AI Gateway extension to enable AI capabilities for OpenObserve SRE Agent. Envoy Gateway provides HTTP routing, load balancing, and traffic management, while AI Gateway adds LLM provider integration and AI-specific features.

🌐 Multi-Cloud Support

This repository supports AWS EKS and Azure AKS deployments. Cloud-specific load balancer configurations are included as commented annotations in the base files - simply uncomment the annotations for your cloud provider before deploying.

Directory Structure

.
├── README.md
├── base/                    # Multi-cloud base configurations
│   ├── kustomization.yaml
│   ├── gatewayclass.yaml
│   ├── gateway.yaml
│   ├── gateway-config.yaml   # Includes commented AWS Bedrock IRSA config
│   ├── envoyproxy.yaml       # Includes commented AWS & Azure annotations
│   ├── traffic-policies.yaml
│   ├── letsencrypt-issuer.yaml
│   ├── aiProviders/          # LLM provider configurations
│   │   ├── providers/        # Individual provider configs
│   │   ├── cloud-providers.yaml
│   │   ├── major-providers.yaml
│   │   └── alternative-providers.yaml
│   └── policies/             # Traffic and security policies
├── examples/
│   ├── aws-bedrock-irsa.yaml
│   └── gcp-vertex-ai.yaml
├── monitoring/               # Monitoring configs
└── docs/                     # Additional documentation

Prerequisites

Kubernetes cluster with kubectl access
- AWS EKS cluster for AWS deployment
- Azure AKS cluster for Azure deployment
Envoy Gateway operator installed (see Installation section)
Envoy AI Gateway CRDs installed
Namespace created (e.g., envoy-gateway-system)
kubectl installed

Installation Steps

1. Deploy Envoy Gateway and AI Gateway

Basic Installation (Recommended for Getting Started)

# Step 1: Install Envoy Gateway (includes Envoy CRDs)
helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
  --version v0.0.0-latest \
  --namespace envoy-gateway-system \
  --create-namespace \
  -f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/manifests/envoy-gateway-values.yaml

# Step 2: Wait for Envoy Gateway to be ready
kubectl wait --timeout=2m -n envoy-gateway-system deployment/envoy-gateway --for=condition=Available

# Step 3: Install AI Gateway CRDs
helm upgrade -i aieg-crd oci://docker.io/envoyproxy/ai-gateway-crds-helm \
  --version v0.0.0-latest \
  --namespace envoy-ai-gateway-system \
  --create-namespace

# Step 4: Install AI Gateway Resources
helm upgrade -i aieg oci://docker.io/envoyproxy/ai-gateway-helm \
  --version v0.0.0-latest \
  --namespace envoy-ai-gateway-system \
  --create-namespace

# Step 5: Wait for AI Gateway controller to be ready
kubectl wait --timeout=2m -n envoy-ai-gateway-system deployment/ai-gateway-controller --for=condition=Available

Advanced Installation (With Rate Limiting and InferencePool)

# Install Envoy Gateway with additional features
helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
  --version v0.0.0-latest \
  --namespace envoy-gateway-system \
  --create-namespace \
  -f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/manifests/envoy-gateway-values.yaml \
  -f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/examples/token_ratelimit/envoy-gateway-values-addon.yaml \
  -f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/examples/inference-pool/envoy-gateway-values-addon.yaml

# Then continue with steps 2-5 from Basic Installation

Verify Installation

# Check Envoy Gateway status
kubectl get deployments -n envoy-gateway-system

# Check AI Gateway status
kubectl get deployments -n envoy-ai-gateway-system

# Check all CRDs are installed
kubectl get crds | grep -E "(envoyproxy|aigateway)"

2. Configure for Your Cloud Provider

Before deploying, edit base/envoyproxy.yaml to uncomment the annotations for your cloud provider:

🔷 AWS EKS Configuration

Open base/envoyproxy.yaml
Find the # ==================== AWS EKS Configuration ==================== sections

Uncomment the AWS annotations:

service.beta.kubernetes.io/aws-load-balancer-internal: "true"
service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"  # for external

(Optional) For SSL/TLS termination, uncomment ACM certificate annotations
(Optional) For AWS Bedrock IRSA, uncomment the configuration in base/gateway-config.yaml

🔵 Azure AKS Configuration

⚠️ IMPORTANT: Before deploying internal load balancers on Azure, grant the AKS managed identity Network Contributor permissions on your VNet:

# Set your values
RESOURCE_GROUP="your-resource-group"
VNET_NAME="your-vnet-name"
AKS_CLUSTER="your-aks-cluster-name"

# Get the AKS managed identity
MANAGED_IDENTITY=$(az aks show \
    --resource-group $RESOURCE_GROUP \
    --name $AKS_CLUSTER \
    --query identityProfile.kubeletidentity.clientId -o tsv)

# Get your subscription ID
SUBSCRIPTION_ID=$(az account show --query id -o tsv)

# Grant Network Contributor role on the VNet
az role assignment create \
    --assignee $MANAGED_IDENTITY \
    --role "Network Contributor" \
    --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Network/virtualNetworks/$VNET_NAME"

Then configure the manifests:

Open base/envoyproxy.yaml
Find the # ==================== Azure AKS Configuration ==================== sections

Uncomment the Azure annotations:

service.beta.kubernetes.io/azure-load-balancer-internal: "true"  # for internal

(Optional) For static IPs or DNS labels, uncomment those annotations
(Note) Azure Load Balancer does NOT support SSL/TLS termination - terminate SSL in Envoy or use Application Gateway

3. Deploy AI Gateway Resources

Option 1: Deploy with Kustomize (Recommended)

# Deploy all resources
kubectl apply -k base/

# Verify deployment
kubectl get gateway,envoyproxy,aiservicebackend -n envoy-gateway-system

Option 2: Deploy Manually

# Step 1: Core Gateway components
kubectl apply -f base/gatewayclass.yaml
kubectl apply -f base/envoyproxy.yaml -n envoy-gateway-system
kubectl apply -f base/gateway-config.yaml -n envoy-gateway-system
kubectl apply -f base/gateway.yaml -n envoy-gateway-system

# Step 2: Deploy LLM provider backends
kubectl apply -f base/aiProviders/major-providers.yaml -n envoy-gateway-system

# Step 3: Traffic policies (optional but recommended)
kubectl apply -f base/traffic-policies.yaml -n envoy-gateway-system

# Step 4: TLS/SSL certificate (optional, for HTTPS)
kubectl apply -f base/letsencrypt-issuer.yaml -n envoy-gateway-system

4. Verify Deployment

# Check Gateway status
kubectl get gateway -n envoy-gateway-system

# Check EnvoyProxy resources
kubectl get envoyproxy -n envoy-gateway-system

# Check if Envoy pods are created
kubectl get pods -n envoy-gateway-system

# Check backends
kubectl get aiservicebackend -n envoy-gateway-system

# Check services and load balancers
kubectl get svc -n envoy-gateway-system

5. Get Load Balancer Endpoints

For AWS EKS:

# Get internal load balancer endpoint
kubectl get svc -n envoy-gateway-system envoy-eg-internal-proxy \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'

# Get external load balancer endpoint
kubectl get svc -n envoy-gateway-system envoy-eg-external-proxy \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'

For Azure AKS:

# Get internal load balancer IP
kubectl get svc -n envoy-gateway-system envoy-eg-internal-proxy \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

# Get external load balancer IP
kubectl get svc -n envoy-gateway-system envoy-eg-external-proxy \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

6. Configure SRE Agent

Update your SRE agent deployment with:

env:
  - name: O2_AI_GATEWAY_ENABLED
    value: "true"
  - name: O2_AI_GATEWAY_URL
    value: "http://ai-gateway.envoy-gateway-system.svc.cluster.local:80"
  - name: O2_AI_MODEL
    value: "gpt-4o"  # or any supported model
  - name: O2_AI_API_KEY
    valueFrom:
      secretKeyRef:
        name: openobserve-ai-key
        key: apiKey

Supported LLM Providers

Provider	Model Example	Authentication
OpenAI	`gpt-4o`	API Key
Anthropic	`claude-sonnet-4-5-20250929`	API Key
AWS Bedrock	`anthropic.claude-3-sonnet-*`	IRSA (IAM Role for Service Accounts)
DeepSeek	`deepseek-chat`	API Key
Grok (xAI)	`grok-4-1-fast-reasoning`	API Key
Azure OpenAI	`azure-gpt-4`	API Key or Azure AD Service Principal
Gemini (text)	`gemini-3-flash-preview`	API Key
Google Vertex AI	Any Gemini model	GCP Service Account JSON

Cloud Provider Comparison

Feature	AWS EKS	Azure AKS
Load Balancer Type	Network Load Balancer (NLB)	Azure Load Balancer (Standard SKU)
Internal LB Annotation	`aws-load-balancer-internal: "true"`	`azure-load-balancer-internal: "true"`
LB Address Type	DNS hostname	IP address
SSL/TLS Termination	ACM certificates	Azure Key Vault or cert-manager
Cloud AI Service	AWS Bedrock with IRSA	Azure OpenAI with Managed Identity
Identity Management	IAM Roles for Service Accounts (IRSA)	Workload Identity
Cross-zone/region	Cross-zone load balancing	Zone redundancy (automatic)

Customization

Change Namespace

Update the namespace in the kustomization file:

# Edit base/kustomization.yaml
sed -i 's/namespace: envoy-gateway-system/namespace: your-namespace/g' base/kustomization.yaml

Cloud Provider Annotations

All cloud-specific load balancer annotations are included as comments in base/envoyproxy.yaml. Simply uncomment the annotations you need for your cloud provider:

AWS EKS: NLB annotations, ACM certificates, cross-zone load balancing
Azure AKS: Azure Load Balancer annotations, static IPs, DNS labels

Troubleshooting

Gateway not creating pods

Check EnvoyProxy status:

kubectl describe envoyproxy openobserve-ai-gateway-proxy -n envoy-gateway-system

413 Payload Too Large errors

Increase buffer limits in traffic-policies.yaml

Check AI Gateway logs

kubectl logs -n envoy-gateway-system -l gateway.envoyproxy.io/owning-gateway-name=ai-gateway -c ai-gateway-extproc --tail=100

Certificate Issues with Gateway API

If you encounter certificate-related issues with Gateway API, ensure cert-manager has Gateway API support enabled:

# Check if cert-manager has Gateway API support
kubectl get deployment cert-manager -n cert-manager -o yaml | grep -i gateway

# Enable Gateway API support in cert-manager
kubectl patch deployment cert-manager -n cert-manager --type='json' \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--enable-gateway-api"}]'

# Wait for cert-manager to restart
kubectl rollout status deployment cert-manager -n cert-manager --timeout=60s

# Verify the flag was added
kubectl get deployment cert-manager -n cert-manager -o yaml | grep "enable-gateway-api"

Note: If the patch fails with "already exists" or similar error, Gateway API support may already be enabled or cert-manager may use a different configuration method.

Uninstallation

Remove Gateway Resources

Using Kustomize:

kubectl delete -k base/

Manual Removal:

# Remove AI Gateway resources
kubectl delete gateway --all -n envoy-gateway-system
kubectl delete envoyproxy --all -n envoy-gateway-system
kubectl delete aiservicebackend --all -n envoy-gateway-system
kubectl delete gatewayclass eg

# Uninstall AI Gateway Helm releases
helm uninstall aieg -n envoy-ai-gateway-system
helm uninstall aieg-crd -n envoy-ai-gateway-system

# Uninstall Envoy Gateway
helm uninstall eg -n envoy-gateway-system

# Delete namespaces (optional)
kubectl delete namespace envoy-ai-gateway-system
kubectl delete namespace envoy-gateway-system

Documentation

Architecture - Detailed multi-cloud architecture and design (if exists)
SSL/TLS Certificates Comparison - AWS ACM vs Azure certificate options (if exists)

Support

For issues, consult:

Envoy Gateway docs: https://gateway.envoyproxy.io/
Envoy AI Gateway docs: https://aigateway.envoyproxy.io/
OpenObserve docs: https://openobserve.ai/docs/

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
base		base
docs		docs
examples		examples
monitoring		monitoring
CHANGELOG.md		CHANGELOG.md
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

Envoy Gateway - Multi-Cloud Deployment Guide

🌐 Multi-Cloud Support

Directory Structure

Prerequisites

Installation Steps

1. Deploy Envoy Gateway and AI Gateway

Basic Installation (Recommended for Getting Started)

Advanced Installation (With Rate Limiting and InferencePool)

Verify Installation

2. Configure for Your Cloud Provider

🔷 AWS EKS Configuration

🔵 Azure AKS Configuration

3. Deploy AI Gateway Resources

Option 1: Deploy with Kustomize (Recommended)

Option 2: Deploy Manually

4. Verify Deployment

5. Get Load Balancer Endpoints

6. Configure SRE Agent

Supported LLM Providers

Cloud Provider Comparison

Customization

Change Namespace

Cloud Provider Annotations

Troubleshooting

Gateway not creating pods

413 Payload Too Large errors

Check AI Gateway logs

Certificate Issues with Gateway API

Uninstallation

Remove Gateway Resources

Documentation

Support

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages