A complete guide to rate limiting algorithms, implementations, and best practices for building robust APIs and distributed systems.
- Introduction
- Why Rate Limiting?
- Core Concepts
- Algorithms
- Implementation Examples
- Best Practices
- Distributed Rate Limiting
- Testing
- Resources
Rate limiting is a critical technique for controlling the rate of requests sent or received by a system. It helps protect services from abuse, ensures fair resource allocation, and maintains system stability under high load.
This repository provides:
- Detailed explanations of rate limiting algorithms
- Production-ready code examples in multiple languages
- Best practices and patterns
- Distributed system considerations
- Testing strategies
Rate limiting serves several crucial purposes:
- DoS Protection: Prevent abuse and denial-of-service attacks
- Cost Control: Manage infrastructure costs by controlling usage
- Fair Usage: Ensure equitable resource distribution among users
- Service Quality: Maintain consistent performance for all users
- API Monetization: Enable tiered pricing models
- Resource Management: Protect downstream services and databases
- Rate: Maximum number of requests allowed in a time window
- Window: Time period for measuring the rate (e.g., per second, per minute)
- Identifier: Key used to track rate (e.g., user ID, IP address, API key)
- Quota: Total allowance over a longer period
- Burst: Temporary spike in traffic above the normal rate
When a rate limit is exceeded, systems typically:
- Return HTTP 429 (Too Many Requests)
- Include headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset
- Optionally include a Retry-After header
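The response behavior listed above can be sketched as a small helper that picks the status code and builds the headers. The function name `rate_limit_response` and its parameters are hypothetical, and the sketch assumes X-RateLimit-Reset carries a Unix timestamp in seconds (some APIs send seconds-until-reset instead) and that Retry-After uses the delay-in-seconds form.

```python
import math
import time
from typing import Dict, Optional, Tuple


def rate_limit_response(allowed: bool, limit: int, remaining: int,
                        reset_epoch: float,
                        now: Optional[float] = None) -> Tuple[int, Dict[str, str]]:
    """Return (status code, headers) for a request that was checked against a limit.

    Hypothetical helper: reset_epoch is assumed to be a Unix timestamp.
    """
    now = time.time() if now is None else now
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(math.ceil(reset_epoch)),
    }
    if allowed:
        return 200, headers
    # Rejected: tell the client how many whole seconds to wait before retrying.
    headers["Retry-After"] = str(max(0, math.ceil(reset_epoch - now)))
    return 429, headers
```

A web framework handler would copy these headers onto its own response object; the sketch stays framework-neutral on purpose.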
This repository covers the following rate limiting algorithms:
Token Bucket
- Algorithm Details
- Implementation
- Best for: Smooth traffic with occasional bursts
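As a rough illustration of the token bucket approach, the sketch below refills tokens continuously at a fixed rate up to a capacity, which is what permits short bursts on top of a steady average. The class and parameter names are illustrative rather than taken from the repository's implementations, and the sketch is single-threaded (no locking).

```python
import time
from typing import Callable


class TokenBucket:
    """Minimal in-memory token bucket (illustrative, not thread-safe)."""

    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity        # maximum burst size, in tokens
        self.refill_rate = refill_rate  # tokens added per second
        self.clock = clock              # injectable so tests can control time
        self.tokens = capacity          # start with a full bucket
        self.last_refill = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        elapsed = now - self.last_refill
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `capacity=3` and `refill_rate=1.0`, three requests can pass back to back, after which roughly one request per second is admitted.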
Leaky Bucket
- Algorithm Details
- Implementation
- Best for: Enforcing strict output rates
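One way to picture the leaky bucket is as a bounded queue that releases requests no faster than a fixed rate. The sketch below implements that queue variant (not the meter variant, which only counts); `offer` and `poll` are hypothetical method names, and a real deployment would call `poll` from a worker loop.

```python
import time
from collections import deque
from typing import Any, Callable, Deque, Optional


class LeakyBucket:
    """Queue-style leaky bucket (illustrative, not thread-safe)."""

    def __init__(self, capacity: int, leak_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity          # maximum queued requests
        self.interval = 1.0 / leak_rate   # minimum seconds between releases
        self.clock = clock
        self.queue: Deque[Any] = deque()
        self.next_release = clock()       # the first request may leave at once

    def offer(self, request: Any) -> bool:
        """Queue a request; return False if the bucket is full (overflow)."""
        if len(self.queue) >= self.capacity:
            return False
        self.queue.append(request)
        return True

    def poll(self) -> Optional[Any]:
        """Release at most one request, and only if the interval has passed."""
        now = self.clock()
        if self.queue and now >= self.next_release:
            self.next_release = now + self.interval
            return self.queue.popleft()
        return None
```

Because `poll` never releases two requests closer together than `interval`, the output rate stays at or below `leak_rate` no matter how bursty the input is.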
Fixed Window
- Algorithm Details
- Implementation
- Best for: Simple implementation, lower memory usage
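A fixed window limiter can be sketched with a single counter per key that resets whenever the clock crosses into a new window. The names below are illustrative; the per-key `(window index, count)` pair is one possible way to keep memory use low.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Minimal in-memory fixed window counter (illustrative, not thread-safe)."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # One (window index, count) pair per key.
        self.state: Dict[str, Tuple[int, int]] = {}

    def allow(self, key: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        start, count = self.state.get(key, (window, 0))
        if start != window:
            start, count = window, 0  # a new window began, so reset the counter
        if count >= self.limit:
            self.state[key] = (start, count)
            return False
        self.state[key] = (start, count + 1)
        return True
```

The trade-off for this simplicity is the window boundary: with a limit of 2 per 60 seconds, two requests at second 59 and two more at second 60 are all admitted, so up to twice the limit can pass within a short span.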
Sliding Window Log
- Algorithm Details
- Implementation
- Best for: Precise rate limiting without boundary issues
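The sliding window log keeps a timestamp for every admitted request and counts only those inside the trailing window, which is what removes the boundary problem at the cost of memory proportional to the limit. A minimal sketch, with illustrative names:

```python
import time
from collections import defaultdict, deque
from typing import Callable, DefaultDict, Deque


class SlidingWindowLog:
    """Minimal in-memory sliding window log (illustrative, not thread-safe)."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # Timestamps of admitted requests, oldest first, per key.
        self.logs: DefaultDict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        log = self.logs[key]
        cutoff = now - self.window_seconds
        # Drop timestamps that have aged out of the trailing window.
        while log and log[0] <= cutoff:
            log.popleft()
        if len(log) >= self.limit:
            return False
        log.append(now)
        return True
```

Here two requests at second 59 still count against a request at second 60, so the burst that a fixed window would admit across its boundary is rejected.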
Sliding Window Counter
- Algorithm Details
- Implementation
- Best for: Balance between accuracy and efficiency
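The sliding window counter approximates the log using only two counters: the current window's count plus the previous window's count weighted by how much of it still overlaps the trailing window. The sketch below is one common formulation, tracks a single key for brevity, and uses illustrative names.

```python
import time
from typing import Callable


class SlidingWindowCounter:
    """Minimal sliding window counter for one key (illustrative, not thread-safe)."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.curr_window = int(clock() // window_seconds)
        self.curr_count = 0
        self.prev_count = 0

    def allow(self) -> bool:
        now = self.clock()
        window = int(now // self.window_seconds)
        if window != self.curr_window:
            # Roll forward; if more than one window passed, the previous one is empty.
            adjacent = window == self.curr_window + 1
            self.prev_count = self.curr_count if adjacent else 0
            self.curr_count = 0
            self.curr_window = window
        # Fraction of the current window that has already elapsed.
        elapsed = (now % self.window_seconds) / self.window_seconds
        estimated = self.prev_count * (1.0 - elapsed) + self.curr_count
        if estimated + 1 <= self.limit:
            self.curr_count += 1
            return True
        return False
```

The estimate assumes requests in the previous window were spread evenly, so it can be slightly off for very bursty traffic; in exchange, memory stays constant per key.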
Concurrent Requests
- Algorithm Details
- Implementation
- Best for: Limiting simultaneous active requests
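A concurrent requests limiter caps how many requests are in flight at once rather than how many arrive per unit of time. A minimal sketch using a semaphore from the Python standard library follows; the class name, the `slot` context manager, and the exception type are hypothetical.

```python
import threading
from contextlib import contextmanager
from typing import Iterator


class TooManyConcurrentRequests(Exception):
    """Raised when no in-flight slot is available (hypothetical name)."""


class ConcurrencyLimiter:
    """Caps the number of simultaneously active requests in one process."""

    def __init__(self, max_in_flight: int) -> None:
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def try_acquire(self) -> bool:
        # Non-blocking: returns False immediately when all slots are taken.
        return self._slots.acquire(blocking=False)

    def release(self) -> None:
        self._slots.release()

    @contextmanager
    def slot(self) -> Iterator[None]:
        """Hold a slot for the duration of a request, releasing it even on error."""
        if not self.try_acquire():
            raise TooManyConcurrentRequests()
        try:
            yield
        finally:
            self.release()
```

A handler would wrap its work in `with limiter.slot():` and translate `TooManyConcurrentRequests` into an HTTP 429 response.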
Code examples are provided in multiple languages:
- TypeScript: Express middleware, Redis-based limiter
- Python: Redis-based and in-memory implementations
Each implementation includes:
- Complete working code
- Unit tests
- Performance benchmarks
- Configuration options
- Choose the right algorithm for your use case
- Communicate limits clearly through headers
- Implement graceful degradation
- Monitor and alert on rate limit hits
- Make limits configurable without code changes
- Consider different dimensions: per-user, per-IP, per-endpoint
- Use distributed rate limiting for multi-instance deployments
- Implement circuit breakers alongside rate limits
- Cache rate limit decisions when possible
- Log rate limit violations for analysis
- Provide webhooks or notifications for quota exhaustion
When running multiple application instances, per-process counters each see only part of the traffic, so limits must be enforced against shared state. Common options:
- Redis-based: Centralized counter storage
- Memcached: Simple distributed caching
- Dedicated Services: Kong, Envoy, API Gateway
- Custom Solutions: Consistent hashing, gossip protocols
See the Distributed Rate Limiting Guide (docs/distributed.md) for details.
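As a rough illustration of the Redis-based option, the sketch below keeps a fixed window counter in a shared store so that every application instance increments the same key. It assumes a client exposing `incr(name)` and `expire(name, time)`, as redis-py's `redis.Redis` does; the function name and the `ratelimit:` key layout are hypothetical.

```python
import time
from typing import Any, Callable


def allow_request(store: Any, identifier: str, limit: int, window_seconds: int,
                  clock: Callable[[], float] = time.time) -> bool:
    """Fixed window check against a shared counter store.

    `store` is assumed to provide incr(name) -> int and expire(name, time),
    for example a redis.Redis client.
    """
    window = int(clock() // window_seconds)
    key = f"ratelimit:{identifier}:{window}"  # hypothetical key layout
    count = store.incr(key)  # atomic on the server, shared by all instances
    if count == 1:
        # First hit in this window: schedule cleanup of the key.
        # INCR and EXPIRE are separate commands here, so a crash between
        # them would leave a key without a TTL.
        store.expire(key, window_seconds * 2)
    return count <= limit
```

With redis-py this would be called as `allow_request(redis.Redis(), "user-42", 100, 60)`. Wrapping the two commands in a transaction or a Lua script is a common way to close the gap noted in the comment.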
- Load testing to verify limits
- Unit tests for algorithm correctness
- Integration tests for distributed scenarios
- Chaos testing for failure scenarios
See the Testing Guide (docs/testing.md) for examples.
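For the unit-test item above, one pattern that tends to work well is injecting a fake clock so tests never sleep. The sketch below shows the idea with the standard `unittest` module; `FakeClock` and `CooldownLimiter` are hypothetical stand-ins for whichever limiter is under test.

```python
import unittest


class FakeClock:
    """Manually advanced clock so tests are fast and deterministic."""

    def __init__(self, start: float = 0.0) -> None:
        self.now = start

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


class CooldownLimiter:
    """Hypothetical limiter under test: one request per `interval` seconds."""

    def __init__(self, interval: float, clock) -> None:
        self.interval = interval
        self.clock = clock
        self.last_allowed = None

    def allow(self) -> bool:
        now = self.clock()
        if self.last_allowed is None or now - self.last_allowed >= self.interval:
            self.last_allowed = now
            return True
        return False


class CooldownLimiterTest(unittest.TestCase):
    def setUp(self) -> None:
        self.clock = FakeClock()
        self.limiter = CooldownLimiter(interval=1.0, clock=self.clock)

    def test_rejects_second_request_inside_interval(self) -> None:
        self.assertTrue(self.limiter.allow())
        self.clock.advance(0.5)
        self.assertFalse(self.limiter.allow())

    def test_allows_again_once_interval_has_passed(self) -> None:
        self.assertTrue(self.limiter.allow())
        self.clock.advance(1.0)
        self.assertTrue(self.limiter.allow())
```

Saved in a test module, this runs with `python -m unittest`. The same fake clock can drive any limiter that accepts a clock callable.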
.
├── README.md
├── docs/
│   ├── algorithms/
│   │   ├── token-bucket.md              ← Complete token bucket guide
│   │   ├── leaky-bucket.md              ← Complete leaky bucket guide
│   │   ├── fixed-window.md              ← Complete fixed window guide
│   │   ├── sliding-window-log.md        ← Complete sliding window log guide
│   │   ├── sliding-window-counter.md    ← Complete sliding window counter guide
│   │   └── concurrent-requests.md       ← Concurrent requests limiter guide
│   ├── algorithm-comparison.md          ← Detailed algorithm comparison
│   ├── getting-started.md               ← Quick start guide
│   ├── distributed.md                   ← Distributed rate limiting guide
│   ├── best-practices.md                ← Production best practices
│   └── testing.md                       ← Comprehensive testing guide
├── implementations/
│   ├── typescript/
│   │   ├── rate-limiter.ts              ← All 6 algorithms in TypeScript
│   │   ├── express-middleware.ts        ← Express.js middleware
│   │   ├── redis-adapter.ts             ← Redis distributed limiter
│   │   ├── examples.ts                  ← 12+ usage examples
│   │   ├── rate-limiter.test.ts         ← Complete test suite
│   │   ├── package.json                 ← NPM configuration
│   │   └── tsconfig.json                ← TypeScript config
│   └── python/
│       ├── rate_limiter.py              ← All algorithms in Python
│       └── redis_rate_limiter.py        ← Redis implementation
├── examples/
│   ├── api-gateway/
│   │   └── server.ts                    ← Complete API gateway example
│   ├── microservices/
│   │   └── services.ts                  ← Microservices architecture example
│   └── web-application/
│       └── app.ts                       ← Web application example
└── benchmarks/
    └── index.ts                         ← Performance benchmarks
- redis-cell - Redis module for rate limiting
- express-rate-limit - Node.js middleware
- golang.org/x/time/rate - Go rate limiter
Contributions are welcome, especially additional Python implementations.
MIT License - see LICENSE file for details.
⭐ Star this repo if you find it helpful!