Commit 9fba823

Add Terraform files to deploy Envoy RateLimiter (apache#37285)
* Add Terraform files to deploy Envoy RateLimiter
* fix variables
* Add nat creation command to readme
* add hpa for memory
* fix readability comments
* fix comments
* add license
* Update examples/terraform/envoy-ratelimiter/README.md

Co-authored-by: Danny McCormick <dannymccormick@google.com>
1 parent 82ebcb2 commit 9fba823

9 files changed: +918 -0 lines changed
Lines changed: 176 additions & 0 deletions
@@ -0,0 +1,176 @@
<!--
    Licensed to the Apache Software Foundation (ASF) under one
    or more contributor license agreements. See the NOTICE file
    distributed with this work for additional information
    regarding copyright ownership. The ASF licenses this file
    to you under the Apache License, Version 2.0 (the
    "License"); you may not use this file except in compliance
    with the License. You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing,
    software distributed under the License is distributed on an
    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    KIND, either express or implied. See the License for the
    specific language governing permissions and limitations
    under the License.
-->

# Envoy Rate Limiter on GKE (Terraform)
This directory contains a production-ready Terraform module to deploy a scalable **Envoy Rate Limit Service** on Google Kubernetes Engine (GKE) Autopilot.

## Overview
Apache Beam pipelines often process data at massive scale, which can easily overwhelm external APIs (e.g., databases, LLM inference endpoints, SaaS APIs).

This Terraform module deploys a **centralized Rate Limit Service (RLS)** using Envoy. Beam workers can query this service to coordinate global quotas across thousands of distributed workers, ensuring you stay within safe API limits without hitting `429 Too Many Requests` errors.

Example Beam pipelines that use it:
* [Simple DoFn RateLimiter](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/rate_limiter_simple.py)
* [Vertex AI RateLimiter](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/inference/rate_limiter_vertex_ai.py)

## Architecture:
- **GKE Autopilot**: Fully managed, serverless Kubernetes environment.
- **Private Cluster**: Nodes have internal IPs only.
- **Cloud NAT (Prerequisite)**: Allows private nodes to pull Docker images.
- **Envoy Rate Limit Service**: A stateless Go/gRPC service that handles rate limit logic.
- **Redis**: Stores the rate limit counters.
- **StatsD Exporter**: Sidecar container that converts StatsD metrics to Prometheus format, exposed on port `9102`.
- **Internal Load Balancer**: A Google Cloud TCP Load Balancer exposing the Rate Limit service internally within the VPC.
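
On GKE, an internal load balancer is typically requested through an annotation on a Kubernetes `Service`. The sketch below is an assumption about how such a Service could be declared with the `kubernetes` provider; it is not taken from this module's source. The resource name `ratelimit_example` and the `app = "ratelimit"` selector are hypothetical, while the reserved address `google_compute_address.ratelimit_ip` and port `8081` come from this commit.

```hcl
# Hypothetical sketch: not the module's actual Service definition.
resource "kubernetes_service" "ratelimit_example" {
  metadata {
    name = "ratelimit-example" # hypothetical name
    annotations = {
      # Asks GKE for an internal load balancer instead of an external one.
      "networking.gke.io/load-balancer-type" = "Internal"
    }
  }

  spec {
    type             = "LoadBalancer"
    load_balancer_ip = google_compute_address.ratelimit_ip.address

    selector = {
      app = "ratelimit" # hypothetical pod label
    }

    port {
      name        = "grpc"
      port        = 8081
      target_port = 8081
    }
  }
}
```

Pinning the Service to a reserved internal address keeps the endpoint stable across redeployments, which is why the address is exposed as the `load_balancer_ip` output.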

## Prerequisites:
### The following items need to be set up for Envoy Rate Limiter deployment on GCP:
1. [GCP project](https://cloud.google.com/resource-manager/docs/creating-managing-projects)

2. [Tools Installed](https://cloud.google.com/sdk/docs/install):
   - [Terraform](https://www.terraform.io/downloads.html) >= 1.2 (the module's `postcondition` check on the subnetwork requires Terraform 1.2 or later)
   - [Google Cloud SDK](https://cloud.google.com/sdk/docs/install) (`gcloud`)
   - [kubectl](https://kubernetes.io/docs/tasks/tools/)

3. APIs Enabled:
   ```bash
   gcloud services enable container.googleapis.com compute.googleapis.com
   ```

4. **Network Configuration**:
   - **Cloud NAT**: Must exist in the region to allow private nodes to pull images and reach external APIs. See the [Cloud NAT example for GKE](https://docs.cloud.google.com/nat/docs/gke-example#create-nat) for more details.
     **Helper Command** (if you need to create one):
     ```bash
     gcloud compute routers create nat-router --network <VPC_NAME> --region <REGION>
     gcloud compute routers nats create nat-config \
         --router=nat-router \
         --region=<REGION> \
         --auto-allocated-nat-external-ips \
         --nat-all-subnet-ip-ranges
     ```
   - **Validation via Console**:
     1. Go to **Network Services** > **Cloud NAT** in the Google Cloud Console.
     2. Verify a NAT Gateway exists for your **Region** and **VPC Network**.
     3. Ensure it is configured to apply to **Primary and Secondary ranges** (or at least the ranges GKE will use).
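
The helper commands above can also be expressed in Terraform if you prefer to manage the NAT gateway as code. This is a minimal sketch, not part of the module (which treats Cloud NAT as a prerequisite). The resource labels `nat_router` and `nat_config` are hypothetical, and it assumes the `data.google_compute_network.default` data source and `var.region` defined by this module are in scope.

```hcl
# Hypothetical sketch: Cloud Router plus Cloud NAT, mirroring the gcloud helper commands.
resource "google_compute_router" "nat_router" {
  name    = "nat-router"
  region  = var.region
  network = data.google_compute_network.default.id
}

resource "google_compute_router_nat" "nat_config" {
  name   = "nat-config"
  router = google_compute_router.nat_router.name
  region = google_compute_router.nat_router.region

  # Equivalent of --auto-allocated-nat-external-ips
  nat_ip_allocate_option = "AUTO_ONLY"

  # Equivalent of --nat-all-subnet-ip-ranges (primary and secondary ranges)
  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
}
```

Covering all subnet ranges satisfies the validation step above, since GKE pods and services use secondary ranges.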

## Prepare deployment configuration:
1. Update the `terraform.tfvars` file to define variables specific to your environment:

   * `terraform.tfvars` variables:
     ```
     project_id                   = "my-project-id"                 # GCP Project ID
     region                       = "us-central1"                   # GCP Region for deployment
     cluster_name                 = "ratelimit-cluster"             # Name of the GKE cluster
     deletion_protection          = true                            # Prevent accidental cluster deletion (set "true" for prod)
     control_plane_cidr           = "172.16.0.0/28"                 # CIDR for GKE control plane (must not overlap with subnet)
     ratelimit_replicas           = 1                               # Initial number of Rate Limit pods
     min_replicas                 = 1                               # Minimum HPA replicas
     max_replicas                 = 5                               # Maximum HPA replicas
     hpa_cpu_target_percentage    = 75                              # CPU utilization target for HPA (%)
     hpa_memory_target_percentage = 75                              # Memory utilization target for HPA (%)
     vpc_name                     = "default"                       # Existing VPC name to deploy into
     subnet_name                  = "default"                       # Existing Subnet name (required for Internal LB IP)
     ratelimit_image              = "envoyproxy/ratelimit:e9ce92cc" # Docker image for Rate Limit service
     redis_image                  = "redis:6.2-alpine"              # Docker image for Redis
     ratelimit_resources          = { requests = { cpu = "100m", memory = "128Mi" }, limits = { cpu = "500m", memory = "512Mi" } }
     redis_resources              = { requests = { cpu = "250m", memory = "256Mi" }, limits = { cpu = "500m", memory = "512Mi" } }
     ```

   * Custom Rate Limit Configuration (must be overridden in `terraform.tfvars`):
     ```
     ratelimit_config_yaml = <<EOF
     domain: mongo_cps
     descriptors:
       - key: database
         value: users
         rate_limit:
           unit: second
           requests_per_unit: 500
     EOF
     ```
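
A single domain can carry several descriptors, each with its own limit. The sketch below extends the example above with a second entry to show the shape of a multi-descriptor configuration. The `orders` value and its per-minute limit are hypothetical and only illustrate the format.

```hcl
ratelimit_config_yaml = <<EOF
domain: mongo_cps
descriptors:
  # Limit for the "users" database (from the example above).
  - key: database
    value: users
    rate_limit:
      unit: second
      requests_per_unit: 500
  # Hypothetical second entry: a per-minute limit for an "orders" database.
  - key: database
    value: orders
    rate_limit:
      unit: minute
      requests_per_unit: 6000
EOF
```

Clients select a limit by sending the matching domain and descriptor key/value pair, so the pipeline's descriptors need to agree with the entries configured here.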

## Deploy Envoy Rate Limiter:
1. Initialize Terraform to download providers and modules:
   ```bash
   terraform init
   ```

2. Plan and apply the changes:
   ```bash
   terraform plan -out=tfplan
   terraform apply tfplan
   ```

3. Connect to the service:
   After deployment, get the **Internal** IP address:
   ```bash
   terraform output load_balancer_ip
   ```
   The service is accessible **only from within the VPC** (e.g., via Dataflow workers or GCE instances in the same network) at `<INTERNAL_IP>:8081`.

4. **Test with Dataflow Workflow**:
   Verify connectivity and rate limiting logic by running the example Dataflow pipeline.

   ```bash
   # Get the Internal Load Balancer IP
   export RLS_IP=$(terraform output -raw load_balancer_ip)

   # --rls_address points to the Terraform-provisioned Internal IP.
   # --subnetwork and --no_use_public_ips are REQUIRED so that workers run in the same private subnet.
   # (Comments must stay outside the command: a comment line between the
   # backslash continuations would cut the command short.)
   python sdks/python/apache_beam/examples/rate_limiter_simple.py \
     --runner=DataflowRunner \
     --project=<YOUR_PROJECT_ID> \
     --region=<YOUR_REGION> \
     --temp_location=gs://<YOUR_BUCKET>/temp \
     --staging_location=gs://<YOUR_BUCKET>/staging \
     --job_name=ratelimit-test-$(date +%s) \
     --rls_address=${RLS_IP}:8081 \
     --subnetwork=regions/<YOUR_REGION>/subnetworks/<YOUR_SUBNET_NAME> \
     --no_use_public_ips
   ```

## Clean up resources:
To destroy the cluster and all created resources:
```bash
terraform destroy
```
*Note: If `deletion_protection` was enabled, you must set it to `false` in `terraform.tfvars` before destroying.*

## Variables description:

|Variable                     |Description                                   |Default                       |
|-----------------------------|:---------------------------------------------|:-----------------------------|
|project_id                   |**Required** Google Cloud Project ID          |-                             |
|vpc_name                     |**Required** Existing VPC name to deploy into |-                             |
|subnet_name                  |**Required** Existing Subnet name             |-                             |
|ratelimit_config_yaml        |**Required** Rate Limit configuration content |-                             |
|region                       |GCP Region for deployment                     |us-central1                   |
|control_plane_cidr           |CIDR block for GKE control plane              |172.16.0.0/28                 |
|cluster_name                 |Name of the GKE cluster                       |ratelimit-cluster             |
|deletion_protection          |Prevent accidental cluster deletion           |false                         |
|ratelimit_replicas           |Initial number of Rate Limit pods             |1                             |
|min_replicas                 |Minimum HPA replicas                          |1                             |
|max_replicas                 |Maximum HPA replicas                          |5                             |
|hpa_cpu_target_percentage    |CPU utilization target for HPA (%)            |75                            |
|hpa_memory_target_percentage |Memory utilization target for HPA (%)         |75                            |
|ratelimit_image              |Docker image for Rate Limit service           |envoyproxy/ratelimit:e9ce92cc |
|redis_image                  |Docker image for Redis                        |redis:6.2-alpine              |
|ratelimit_resources          |Resources for Rate Limit service (map)        |requests/limits (CPU/Mem)     |
|redis_resources              |Resources for Redis container (map)           |requests/limits (CPU/Mem)     |
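
As an illustration of how the scaling-related variables in the table combine, the following `terraform.tfvars` fragment is a sketch for a busier deployment. All values are hypothetical and should be tuned against your own load; only the variable names and the `requests`/`limits` structure come from this README.

```hcl
# Hypothetical overrides for a higher-throughput deployment.
min_replicas                 = 2  # keep at least two Rate Limit pods running
max_replicas                 = 10 # allow the HPA to scale further under load
hpa_cpu_target_percentage    = 60 # scale out earlier on CPU pressure
hpa_memory_target_percentage = 70 # scale out earlier on memory pressure

ratelimit_resources = {
  requests = { cpu = "250m", memory = "256Mi" }
  limits   = { cpu = "1", memory = "1Gi" }
}
```

HPA utilization targets are computed relative to the container's resource requests, so raising `requests` also raises the absolute usage at which scaling begins.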
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

// Provision the Kubernetes cluster.
resource "google_container_cluster" "primary" {
  name     = var.cluster_name
  location = var.region

  enable_autopilot    = true
  deletion_protection = var.deletion_protection

  network    = data.google_compute_network.default.id
  subnetwork = data.google_compute_subnetwork.default.id

  ip_allocation_policy {}

  # Private Cluster Configuration
  private_cluster_config {
    enable_private_nodes    = true  # Nodes have internal IPs only
    enable_private_endpoint = false # Master is accessible via Public IP (required for Terraform from outside VPC)
    master_ipv4_cidr_block  = var.control_plane_cidr
  }
}
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */


resource "google_compute_address" "ratelimit_ip" {
  name         = var.ip_name != "" ? var.ip_name : "${var.cluster_name}-ratelimit-ip"
  region       = var.region
  address_type = "INTERNAL"
  subnetwork   = data.google_compute_subnetwork.default.id
}
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

output "cluster_name" {
  description = "The name of the GKE cluster."
  value       = google_container_cluster.primary.name
}

output "load_balancer_ip" {
  description = "The IP address of the load balancer."
  value       = google_compute_address.ratelimit_ip.address
}
Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

resource "google_project_service" "required" {
  for_each = toset([
    "container",
    "iam",
    "compute",
  ])

  service            = "${each.key}.googleapis.com"
  disable_on_destroy = false
}

// Query the VPC network to make sure it exists.
data "google_compute_network" "default" {
  name       = var.vpc_name
  depends_on = [google_project_service.required]
}

// Query the VPC subnetwork to make sure it exists in the region specified.
data "google_compute_subnetwork" "default" {
  name       = var.subnet_name
  region     = var.region
  depends_on = [google_project_service.required]
  lifecycle {
    postcondition {
      condition     = self.private_ip_google_access
      error_message = <<EOT
fatal: ${self.id} misconfigured: private Google access disabled.
See https://cloud.google.com/vpc/docs/configure-private-google-access for details.
EOT
    }
  }
}
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.0"
    }
  }
}

provider "google" {
  project = var.project_id
  region  = var.region
}

# Configure kubernetes provider to use the GKE cluster
provider "kubernetes" {
  host                   = "https://${google_container_cluster.primary.endpoint}"
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(google_container_cluster.primary.master_auth[0].cluster_ca_certificate)
}

data "google_client_config" "default" {}
