@yuzisun (Contributor) commented Jan 10, 2026

Implements the quota mode (soft limit) feature for the rate limit service.

  • Configuration: added a quota_mode: true field to descriptors (similar to shadow_mode).
  • Core rate limiting logic: when a quota mode limit is exceeded, the request is still allowed (the service returns 200 OK, not 429), and metadata is set indicating which quotas were exceeded.
  • Dynamic metadata: added quotaModeViolations, an array of the descriptor indices that exceeded their quota limits, and quotaModeEnabled, a boolean indicating whether any limit has quota mode enabled.
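
A minimal sketch of the decision rule described in the bullets above. The types and names here (DescriptorResult, Decision, Evaluate) are hypothetical and chosen for illustration only; they are not taken from the ratelimit codebase, and the actual implementation in this PR may differ.

```go
package main

import "fmt"

// DescriptorResult is a hypothetical per-descriptor outcome.
// The field names are illustrative assumptions, not the service's real types.
type DescriptorResult struct {
	OverLimit bool // the counter exceeded the configured limit
	QuotaMode bool // the matching limit was configured with quota_mode: true
}

// Decision is a hypothetical summary of the response for one request.
type Decision struct {
	Allowed             bool  // false corresponds to a 429 response
	QuotaModeEnabled    bool  // true if any limit has quota mode enabled
	QuotaModeViolations []int // indices of descriptors that exceeded a quota mode limit
}

// Evaluate applies the soft-limit rule: an exceeded quota mode limit is
// recorded as a violation but does not block the request, while an exceeded
// regular limit still blocks it.
func Evaluate(results []DescriptorResult) Decision {
	d := Decision{Allowed: true, QuotaModeViolations: []int{}}
	for i, r := range results {
		if r.QuotaMode {
			d.QuotaModeEnabled = true
		}
		if !r.OverLimit {
			continue
		}
		if r.QuotaMode {
			d.QuotaModeViolations = append(d.QuotaModeViolations, i)
		} else {
			d.Allowed = false
		}
	}
	return d
}

func main() {
	d := Evaluate([]DescriptorResult{
		{OverLimit: true, QuotaMode: true},   // soft limit exceeded
		{OverLimit: false, QuotaMode: false}, // regular limit, within bounds
	})
	// prints: allowed=true enabled=true violations=[0]
	fmt.Printf("allowed=%v enabled=%v violations=%v\n",
		d.Allowed, d.QuotaModeEnabled, d.QuotaModeViolations)
}
```

In this sketch a descriptor without quota mode that is over its limit still rejects the request, which matches the existing hard-limit behavior as I understand it.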

Use Cases:

  1. 📊 Usage Tracking: Monitor customer usage without blocking requests
  2. 📈 Analytics: Collect violation patterns for capacity planning
  3. 🚨 Alerting: Trigger alerts when soft limits are reached
  4. 🎛️ Graceful Degradation: Allow services to fall back to another backend when one backend's quota limit is reached

Example test:

curl -X POST http://localhost:8080/json \
  -H "Content-Type: application/json" \
  -d '{"domain": "test_domain", "descriptors": [{"entries": [{"key": "quota_limit", "value": "test"}]}]}'

Response:

{"overallCode":"OK", "statuses":[{"code":"OK", "currentLimit":{"requestsPerUnit":3, "unit":"MINUTE"}}], "dynamicMetadata":{"descriptors":[{"entries":["quota_limit=test"]}], "domain":"test_domain", "quotaModeEnabled":true, "quotaModeViolations":[0]}}
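
For callers that want to act on the soft-limit signal (for example, to fall back to another backend), the following is a rough sketch of reading the metadata from the JSON response shown above. The struct and the ExceededQuotas helper are hypothetical; only the JSON field names come from the example response.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rateLimitResponse models only the fields used here; the JSON field names
// follow the example response, the Go type itself is an assumption.
type rateLimitResponse struct {
	OverallCode     string `json:"overallCode"`
	DynamicMetadata struct {
		QuotaModeEnabled    bool  `json:"quotaModeEnabled"`
		QuotaModeViolations []int `json:"quotaModeViolations"`
	} `json:"dynamicMetadata"`
}

// ExceededQuotas (hypothetical helper) returns the descriptor indices listed
// in quotaModeViolations, or nil when quota mode is not enabled.
func ExceededQuotas(body []byte) ([]int, error) {
	var resp rateLimitResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	if !resp.DynamicMetadata.QuotaModeEnabled {
		return nil, nil
	}
	return resp.DynamicMetadata.QuotaModeViolations, nil
}

func main() {
	// Response body copied from the example test above.
	body := []byte(`{"overallCode":"OK", "statuses":[{"code":"OK", "currentLimit":{"requestsPerUnit":3, "unit":"MINUTE"}}], "dynamicMetadata":{"descriptors":[{"entries":["quota_limit=test"]}], "domain":"test_domain", "quotaModeEnabled":true, "quotaModeViolations":[0]}}`)

	violations, err := ExceededQuotas(body)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	if len(violations) > 0 {
		// A caller could switch to a fallback backend at this point.
		fmt.Println("soft limit exceeded for descriptor indices:", violations)
	}
}
```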

@yuzisun changed the title from "implement quota mode for rate limit check" to "implement quota mode for soft rate limit check" Jan 10, 2026
Signed-off-by: Dan Sun <[email protected]>
@yanavlasov

Looks good to me. Thanks.

@collin-lee I will wait for your review as well, since you have been reviewing the latest PRs. Note that this change is to support quota for Envoy AI Gateway (envoyproxy/ai-gateway#1571).

The difference from the existing rate limit logic is that, in quota mode, a request is allowed if any of the descriptors still has quota.

Quota mode would be enabled at the domain level, i.e., all descriptors under a specific domain will be in either quota mode or rate limit mode.

/wait-any

@collin-lee collin-lee merged commit a28b84d into envoyproxy:main Jan 15, 2026
6 checks passed