
Conversation

@tadasant (Member) commented Dec 9, 2025

Written by Claude Code, reviewed by me. When we get this in main I'll test it out against staging to make sure it's acting as intended.

Summary

  • Implements IP-based rate limiting middleware to address service disruptions caused by excessive requests (Implement rate limits to address mis-use causing downtime #826)
  • Enforces dual rate limits: 60 requests/minute and 1000 requests/hour per IP address
  • Fully configurable via environment variables (MCP_REGISTRY_RATE_LIMIT_*)
  • Excludes health/ping/metrics endpoints from rate limiting
  • Returns RFC 7807 problem+json responses with a Retry-After header when rate limited (an example response writer is sketched below)
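
For illustration, here is a minimal Go sketch of such a response writer. It is a sketch, not the PR's actual code: the `title`/`detail` strings and how `retryAfter` is computed are assumptions.

```go
package ratelimit

import (
	"encoding/json"
	"net/http"
	"strconv"
	"time"
)

// writeRateLimited writes an RFC 7807 problem+json body plus a
// Retry-After header. Field values here are illustrative only.
func writeRateLimited(w http.ResponseWriter, retryAfter time.Duration) {
	w.Header().Set("Content-Type", "application/problem+json")
	w.Header().Set("Retry-After", strconv.Itoa(int(retryAfter.Seconds())))
	w.WriteHeader(http.StatusTooManyRequests)
	_ = json.NewEncoder(w).Encode(map[string]any{
		"title":  "Too Many Requests",
		"status": http.StatusTooManyRequests,
		"detail": "rate limit exceeded for this IP; retry later",
	})
}
```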

Configuration Options

| Variable | Default | Description |
| --- | --- | --- |
| MCP_REGISTRY_RATE_LIMIT_ENABLED | true | Enable/disable rate limiting |
| MCP_REGISTRY_RATE_LIMIT_REQUESTS_PER_MINUTE | 60 | Max requests per minute per IP |
| MCP_REGISTRY_RATE_LIMIT_REQUESTS_PER_HOUR | 1000 | Max requests per hour per IP |
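
As a rough illustration of how these variables could be read (the struct and helper names below are hypothetical, not the registry's actual config loader):

```go
package config

import (
	"os"
	"strconv"
)

// RateLimitConfig mirrors the table above; names are assumptions
// made for this sketch.
type RateLimitConfig struct {
	Enabled           bool
	RequestsPerMinute int
	RequestsPerHour   int
}

func envBool(key string, def bool) bool {
	if v := os.Getenv(key); v != "" {
		if b, err := strconv.ParseBool(v); err == nil {
			return b
		}
	}
	return def
}

func envInt(key string, def int) int {
	if v := os.Getenv(key); v != "" {
		if n, err := strconv.Atoi(v); err == nil {
			return n
		}
	}
	return def
}

// LoadRateLimitConfig applies the documented defaults when a
// variable is unset or unparsable.
func LoadRateLimitConfig() RateLimitConfig {
	return RateLimitConfig{
		Enabled:           envBool("MCP_REGISTRY_RATE_LIMIT_ENABLED", true),
		RequestsPerMinute: envInt("MCP_REGISTRY_RATE_LIMIT_REQUESTS_PER_MINUTE", 60),
		RequestsPerHour:   envInt("MCP_REGISTRY_RATE_LIMIT_REQUESTS_PER_HOUR", 1000),
	}
}
```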

Implementation Details

  • Uses a token bucket algorithm via golang.org/x/time/rate (already a dependency); see the sketch after this list
  • Properly handles X-Forwarded-For and X-Real-IP headers (since the registry is deployed behind NGINX ingress with use-forwarded-headers=true)
  • All IPs are validated using net.ParseIP() to handle malformed input
  • Memory protection via MaxVisitors limit (100k) with LRU eviction
  • Background cleanup of stale visitor entries every 10 minutes
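
A minimal sketch of the dual token-bucket core described above, assuming burst sizes equal to the per-window limits; the PR's actual types and tuning may differ.

```go
package ratelimit

import (
	"sync"
	"time"

	"golang.org/x/time/rate"
)

// visitor holds the two token buckets for one client IP. Burst sizes
// equal to the per-window limits are an assumption of this sketch.
type visitor struct {
	perMinute *rate.Limiter
	perHour   *rate.Limiter
	lastSeen  time.Time
}

func newVisitor() *visitor {
	return &visitor{
		perMinute: rate.NewLimiter(rate.Limit(60.0/60.0), 60),       // 60 req/min
		perHour:   rate.NewLimiter(rate.Limit(1000.0/3600.0), 1000), // 1000 req/hour
		lastSeen:  time.Now(),
	}
}

// Limiter tracks one visitor per IP behind a read/write mutex.
type Limiter struct {
	mu       sync.RWMutex
	visitors map[string]*visitor
}

func NewLimiter() *Limiter {
	return &Limiter{visitors: make(map[string]*visitor)}
}

// Allow passes only when both buckets have capacity. Because the two
// checks short-circuit, a minute token can be spent even when the
// hour bucket then denies; that and the unlocked lastSeen update are
// sketch shortcuts.
func (l *Limiter) Allow(ip string) bool {
	l.mu.RLock()
	v, ok := l.visitors[ip]
	l.mu.RUnlock()
	if !ok {
		l.mu.Lock()
		if v, ok = l.visitors[ip]; !ok { // re-check under write lock
			v = newVisitor()
			l.visitors[ip] = v
		}
		l.mu.Unlock()
	}
	v.lastSeen = time.Now()
	return v.perMinute.Allow() && v.perHour.Allow()
}
```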

Test plan

  • Unit tests cover rate limiting logic, middleware behavior, and IP extraction
  • Tests verify both minute and hourly limits work correctly
  • Tests verify skip paths are not rate limited
  • Tests verify X-Forwarded-For and X-Real-IP header handling
  • Tests verify invalid IP handling (empty, malformed, IPv6)
  • Tests verify memory limit enforcement and eviction
  • Tests verify concurrent access safety
  • golangci-lint passes with no issues
  • All tests pass locally
  • CI pipeline passes

Known Limitations

  • Multi-pod deployment: In-memory rate limiting means each pod has independent limits. For strict per-IP limits across the cluster, consider Redis-based rate limiting in a future iteration.

Closes #826

🤖 Generated with Claude Code

tadasant and others added 5 commits December 9, 2025 07:20
Implements rate limiting middleware to address service disruptions caused by
excessive requests. The rate limiter enforces per-IP limits of 60 requests
per minute and 1000 requests per hour (configurable via environment variables).

Features:
- Dual rate limiting (per-minute and per-hour) using token bucket algorithm
- Proper handling of X-Forwarded-For and X-Real-IP headers for proxied requests
- Skip paths for health, ping, and metrics endpoints
- Background cleanup of stale visitor entries
- Configurable via MCP_REGISTRY_RATE_LIMIT_* environment variables
- Returns RFC 7807 problem+json responses with Retry-After header

Closes #826

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Add TrustProxy config option (defaults to false for security)
  - When false, only RemoteAddr is used (prevents IP spoofing)
  - When true, X-Forwarded-For and X-Real-IP headers are trusted
- Add IP validation using net.ParseIP() to handle invalid/malformed IPs
- Add MaxVisitors config to prevent memory exhaustion attacks
- Implement LRU-style eviction when MaxVisitors limit is reached
- Optimize lock granularity with read locks for existing visitors
- Add comprehensive tests for:
  - TrustProxy enabled/disabled behavior
  - IP spoofing prevention
  - Invalid IP handling
  - IPv6 support
  - Memory limit enforcement
  - Concurrent access

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Since the registry is always deployed behind NGINX ingress with
use-forwarded-headers=true, there's no need for a TrustProxy config option.
The NGINX ingress is trusted infrastructure that correctly sets
X-Forwarded-For headers.

Removed:
- TrustProxy config option and environment variable
- Tests for TrustProxy enabled/disabled behavior

The implementation now always checks the X-Forwarded-For/X-Real-IP headers
first, then falls back to RemoteAddr (a sketch of that lookup follows this
commit message). This matches the deployment setup.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
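
A hedged sketch of that lookup order; the helper name and fallback details below are assumptions, not the PR's actual code.

```go
package ratelimit

import (
	"net"
	"net/http"
	"strings"
)

// clientIP resolves the caller's IP in the order the commit above
// describes: X-Forwarded-For (leftmost entry) first, then X-Real-IP,
// then RemoteAddr. Every candidate goes through net.ParseIP so a
// malformed header value can't poison the limiter's keys.
func clientIP(r *http.Request) string {
	if xff := r.Header.Get("X-Forwarded-For"); xff != "" {
		first := strings.TrimSpace(strings.Split(xff, ",")[0])
		if ip := net.ParseIP(first); ip != nil {
			return ip.String()
		}
	}
	if xri := strings.TrimSpace(r.Header.Get("X-Real-IP")); xri != "" {
		if ip := net.ParseIP(xri); ip != nil {
			return ip.String()
		}
	}
	// RemoteAddr is host:port; SplitHostPort strips the port (and the
	// brackets around IPv6 literals).
	host, _, err := net.SplitHostPort(r.RemoteAddr)
	if err != nil {
		host = r.RemoteAddr
	}
	if ip := net.ParseIP(host); ip != nil {
		return ip.String()
	}
	return "unknown"
}
```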
Add mcp_registry.http.rate_limited counter to track when requests are
blocked by rate limiting. This enables monitoring rate limit activity
over time via Prometheus (a sketch of the wiring follows this commit message).

- Add RateLimitedRequests counter to telemetry.Metrics
- Add OnRateLimited callback to rate limiter config
- Wire up metrics in server initialization
- Add test for callback behavior

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
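
A rough sketch of how such a callback could feed a counter, shown with the plain Prometheus Go client and a Prometheus-style metric name; the registry's actual telemetry.Metrics wiring and callback signature may differ.

```go
package telemetry

import "github.com/prometheus/client_golang/prometheus"

// RateLimitedRequests stands in for the mcp_registry.http.rate_limited
// counter described above, renamed here to Prometheus conventions.
var RateLimitedRequests = prometheus.NewCounter(prometheus.CounterOpts{
	Name: "mcp_registry_http_rate_limited_total",
	Help: "Requests rejected by the rate limiter.",
})

func init() {
	prometheus.MustRegister(RateLimitedRequests)
}

// OnRateLimited is the callback handed to the rate limiter config so
// every blocked request bumps the counter. The signature is assumed.
func OnRateLimited(ip string) {
	_ = ip // available if per-IP labels are ever wanted (mind cardinality)
	RateLimitedRequests.Inc()
}
```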
- Add rate limit section to official-registry-api.md documenting the
  429 response format, limits, and client guidance
- Add note in .env.example about per-pod rate limit behavior
- Add code comment in server.go explaining multi-replica approximation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@tadasant marked this pull request as ready for review December 9, 2025 19:10
@tadasant requested a review from a team December 9, 2025 19:10
return v
}

// Enforce max visitors limit (memory protection)
@pree-dew (Contributor) commented Dec 10, 2025

@tadasant if requests come from unique IPs, then eviction will iterate over all existing entries to find one IP to remove, all while holding the lock. Once we're past 100K visitors, every request will take a good amount of time.

Please feel free to correct me if I am missing something here.

@pree-dew:

We could also implement an LRU-based structure here, or maybe remove some percentage of entries when the limit is hit instead of removing one on every new request.
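
For concreteness, a sketch of the LRU idea using container/list (all names hypothetical; the caller is assumed to hold the limiter's lock):

```go
package ratelimit

import "container/list"

// lruVisitors keeps a doubly linked list plus an index map so
// eviction is O(1) instead of a full map scan.
type lruVisitors struct {
	max   int
	order *list.List               // front = most recently seen
	index map[string]*list.Element // ip -> node in order
}

type lruEntry struct {
	ip string
	// per-IP limiter state would live here
}

func newLRUVisitors(max int) *lruVisitors {
	return &lruVisitors{
		max:   max,
		order: list.New(),
		index: make(map[string]*list.Element),
	}
}

// touch records activity for ip, evicting the least recently seen
// visitor in constant time once the cap is reached.
func (l *lruVisitors) touch(ip string) {
	if el, ok := l.index[ip]; ok {
		l.order.MoveToFront(el)
		return
	}
	if l.order.Len() >= l.max {
		if oldest := l.order.Back(); oldest != nil {
			l.order.Remove(oldest)
			delete(l.index, oldest.Value.(*lruEntry).ip)
		}
	}
	l.index[ip] = l.order.PushFront(&lruEntry{ip: ip})
}
```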

@pree-dew:
I'd be more than happy to help in any way here.

}

// Need to create new visitor - acquire write lock
rl.mu.Lock()
Contributor:

This might be very slow when throughput is high: per the stats on Grafana, per-hour throughput is always above 15K and has peaked at ~35K. Cleanup won't happen until an entry is older than an hour.

Contributor:

Maybe a shard-based rate limiter would suit better if we expect per-minute throughput to increase.
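
A sketch of what that sharding could look like, assuming FNV-1a hashing and 64 shards (both arbitrary choices for illustration):

```go
package ratelimit

import (
	"hash/fnv"
	"sync"
)

// shardedLimiter splits the visitor map across fixed shards, each
// with its own lock, so lookups for different IPs rarely contend and
// cleanup can walk one shard at a time. shardedVisitor stands in for
// the per-IP limiter state.
const numShards = 64

type shardedVisitor struct{ /* per-IP limiter state */ }

type shard struct {
	mu       sync.RWMutex
	visitors map[string]*shardedVisitor
}

type shardedLimiter [numShards]*shard

func newShardedLimiter() *shardedLimiter {
	var s shardedLimiter
	for i := range s {
		s[i] = &shard{visitors: make(map[string]*shardedVisitor)}
	}
	return &s
}

// shardFor hashes the IP with FNV-1a so each IP always lands on the
// same shard.
func (s *shardedLimiter) shardFor(ip string) *shard {
	h := fnv.New32a()
	_, _ = h.Write([]byte(ip))
	return s[h.Sum32()%numShards]
}
```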

@rdimitrov (Member) left a comment:

I think we would want to solve this in a different way and handle this in the infrastructure layer rather than inside the registry 👍

@Avish34 (Contributor) commented Dec 11, 2025

+1 to @rdimitrov's suggestion. Implementing rate limiting in the registry might slow down the system when we hit scale. Given that we already have NGINX as the ingress controller in our infra, we might as well leverage it to do the rate limiting.
Reference: https://www.civo.com/learn/rate-limiting-applications-with-nginx-ingress

MCP_REGISTRY_OIDC_EDIT_PERMISSIONS=*
MCP_REGISTRY_OIDC_PUBLISH_PERMISSIONS=*

# Rate limiting configuration
Member:

I might not bother making this configurable, and instead just have this baked in. Keeps the number of knobs and dials low :)

@@ -0,0 +1,297 @@
// Package ratelimit provides IP-based rate limiting middleware for HTTP servers.
Member:

Is Golang the right place to be doing this? Feels like maybe it should be done on the Kubernetes ingress.

# Enable or disable rate limiting (default: true)
MCP_REGISTRY_RATE_LIMIT_ENABLED=true
# Maximum requests per minute per IP address (default: 60)
MCP_REGISTRY_RATE_LIMIT_REQUESTS_PER_MINUTE=60
Member:

I'd be tempted to push this a bit higher (without changing the per hour limit) to allow for slightly spiky/bursty traffic e.g. for people doing local analysis. E.g. maybe 200 per minute?

@domdomegg (Member) left a comment:

Seems broadly sensible - plausibly we should do this at the infra layer though, e.g. nginx ingress.

@tadasant (Member, Author) commented:

Closing in favor of #843

@tadasant closed this Dec 12, 2025
rdimitrov added a commit that referenced this pull request Dec 12, 2025
## Motivation and Context
The following is an alternative solution to #833

Goal is to deploy and promote it to prod, then monitor the alerts for a
week to see its effect and reassess. In case it's actual traffic that
we've started blocking we can also up the replicas. I thought it doesn't
make sense to up them now since it would be nice to isolate the result
of this change first.


Signed-off-by: Radoslav Dimitrov <[email protected]>