API Rate Limiting

Version: 2.0.0
Last Updated: 2025-01-24
Status: Fully Implemented

Overview

The TMI API implements rate limiting to protect against abuse, ensure fair resource allocation, and maintain service availability. Rate limits are applied at multiple scopes depending on the endpoint category and authentication status.

This document describes the rate limiting strategy that the OpenAPI specification declares through x-rate-limit extensions.

Rate Limiting Strategy

TMI uses a tiered rate limiting approach with five distinct tiers:

Tier	Name	Scope	Configurable	Endpoint Count
1	Public Discovery	IP	No	5
2	Auth Flows	Multi-scope	No	9
3	Resource Operations	User	Yes	112
4	Webhooks	User	Yes	7
5	Addon Invocations	User	Yes (DB)	3

Design Principles

Unauthenticated endpoints use IP-based rate limiting
Authenticated endpoints use user-based rate limiting (extracted from JWT subject)
Auth flow endpoints use multi-scope limiting to balance security and usability
Webhook endpoints reuse the existing database-backed quota system
Configurable limits allow per-user customization for resource operations and webhooks

Tier Definitions

Tier 1: Public Discovery

Applies to: Unauthenticated endpoints that provide API metadata and discovery information.

Endpoints:

GET / - API information
GET /.well-known/openid-configuration - OpenID configuration
GET /.well-known/oauth-authorization-server - OAuth metadata
GET /.well-known/jwks.json - JSON Web Key Set
GET /.well-known/oauth-protected-resource - Protected resource metadata

Rate Limit Configuration:

scope: ip
tier: public-discovery
limits:
  - type: requests_per_minute
    default: 10
    configurable: false
    tracking_method: Source IP address

Rationale:

These endpoints are cacheable and low-cost
Low limit (10/min) prevents excessive polling
IP-based tracking is appropriate for unauthenticated access

Tier 2: Auth Flows

Applies to: OAuth 2.0 and SAML 2.0 authentication endpoints.

Endpoints:

OAuth: /oauth2/authorize, /oauth2/callback, /oauth2/token, /oauth2/refresh, /oauth2/introspect
SAML: /saml/login, /saml/acs, /saml/slo (GET and POST)

Rate Limit Configuration:

strategy: multi-scope
tier: auth-flows
scopes:
  - name: session
    limits:
      - type: requests_per_minute
        default: 5
        configurable: false
        tracking_method: OAuth state parameter or SAML request ID
  - name: ip
    limits:
      - type: requests_per_minute
        default: 100
        configurable: false
        tracking_method: Source IP address
  - name: user_identifier
    limits:
      - type: attempts_per_hour
        default: 10
        configurable: false
        tracking_method: login_hint parameter or email address
enforcement: Most restrictive limit applies

Multi-Scope Enforcement:

Auth flow endpoints use three concurrent rate limit scopes:

Session Scope (5 requests/minute)
- Prevents individual browser sessions from hammering the endpoint
- Tracked via OAuth state parameter or SAML request ID
- Protects against misconfigured clients or tight retry loops
IP Scope (100 requests/minute)
- Prevents DoS from a single IP address
- High limit accommodates large organizations behind a shared address (corporate NAT, universities)
- Addresses the shared-IP concern for multi-user applications
User Identifier Scope (10 attempts/hour)
- Prevents credential stuffing attacks on specific accounts
- Tracked via login_hint parameter (OAuth) or email/username (form inputs)
- Independent of session or IP for maximum protection

Example Scenarios:

Scenario	Session Limit	IP Limit	User Limit	Result
Single user, normal login	1/min	1/min	1/hour	Allowed
User refreshing page rapidly	6/min	6/min	6/hour	Blocked (session limit)
Corporate office (100 users)	1/min each	100/min total	1/hour each	Allowed
Attacker trying alice@example.com	5/min	5/min	11/hour	Blocked (user limit)
Distributed botnet	Varies	Varies	11/hour per user	Blocked (user limit)

Rationale:

Single IP limit alone would block legitimate users in shared environments
Session tracking prevents tight retry loops
User identifier tracking prevents account takeover attempts
Most restrictive limit applies - any scope hitting its limit blocks the request
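The scenario table above can be reproduced with a small pure function. This is a minimal sketch, not the TMI implementation: the names `LIMITS`, `blocking_scopes`, and `decide` are hypothetical, and it assumes a request is blocked only when the observed count exceeds the limit, which is what the "Corporate office" row (100/min, allowed) implies.

```python
# Limits copied from the Tier 2 configuration above.
LIMITS = {"session": 5, "ip": 100, "user_identifier": 10}


def blocking_scopes(observed):
    """Return every scope whose observed count exceeds its limit."""
    return [scope for scope in LIMITS if observed.get(scope, 0) > LIMITS[scope]]


def decide(observed):
    """Apply the 'most restrictive limit applies' rule to observed counts."""
    blocked = blocking_scopes(observed)
    if not blocked:
        return "Allowed"
    # Report the first exceeded scope; any single exceeded scope blocks the request.
    return f"Blocked ({blocked[0]} limit)"


# Rows from the scenario table (counts per minute, per minute, per hour).
print(decide({"session": 1, "ip": 1, "user_identifier": 1}))
print(decide({"session": 6, "ip": 6, "user_identifier": 6}))
print(decide({"session": 5, "ip": 5, "user_identifier": 11}))
```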

Tier 3: Resource Operations

Applies to: All authenticated endpoints for threat models, diagrams, users, and collaboration.

Endpoints:

User management: /me, /oauth2/userinfo
Threat models: /threat_models/*
Diagrams: /threat_models/{id}/diagrams/*
Sub-resources: Assets, threats, documents, notes, repositories, metadata
Collaboration: /me/sessions

Rate Limit Configuration:

scope: user
tier: resource-operations
limits:
  - type: requests_per_minute
    default: 1000
    configurable: true
    quota_source: user_api_quotas

User-Based Tracking:

Rate limit applied per JWT subject (user ID)
Default: 1000 requests/minute per user
Configurable: Operators can customize limits per user via database

Quota Source:

Table: user_api_quotas
Schema includes:
- user_internal_uuid (UUID, primary key, foreign key to users)
- max_requests_per_minute (INT, default 100 in DB, 1000 in code)
- max_requests_per_hour (INT, default NULL)
- created_at, modified_at (timestamps)

Rationale:

1000 req/min supports interactive UI usage and reasonable automation
User-based tracking ensures fair allocation across all users
Configurability allows VIP users, integrations, or CI/CD to have higher limits
Existing pattern from webhook quotas ensures consistency

Tier 4: Webhooks

Applies to: Webhook subscription management and delivery history.

Endpoints:

/webhooks/subscriptions (GET, POST)
/webhooks/subscriptions/{id} (GET, DELETE)
/webhooks/subscriptions/{id}/test (POST)
/webhooks/deliveries (GET)
/webhooks/deliveries/{id} (GET)

Rate Limit Configuration:

scope: user
tier: webhooks
limits:
  - type: subscription_requests_per_minute
    default: 10
    configurable: true
    quota_source: webhook_quotas.max_subscription_requests_per_minute
  - type: subscription_requests_per_day
    default: 20
    configurable: true
    quota_source: webhook_quotas.max_subscription_requests_per_day
  - type: events_per_minute
    default: 12
    configurable: true
    quota_source: webhook_quotas.max_events_per_minute
  - type: max_subscriptions
    default: 10
    configurable: true
    quota_source: webhook_quotas.max_subscriptions

Multiple Rate Limits:

Webhook endpoints enforce four distinct limits:

Subscription Requests Per Minute (10/min)
- Applies to: POST, DELETE on /webhooks/subscriptions
- Prevents rapid subscription churn
Subscription Requests Per Day (20/day)
- Applies to: POST, DELETE on /webhooks/subscriptions
- Prevents subscription quota farming
Events Per Minute (12/min)
- Applies to: Webhook event publications (not HTTP API calls)
- Limits rate of events sent to user's subscriptions
Max Subscriptions (10 total)
- Static limit on number of active subscriptions per user
- Prevents resource exhaustion

Existing Implementation:

Webhook rate limiting is fully implemented:

Database table: webhook_quotas (see docs/reference/legacy-migrations/002_business_domain.up.sql)
Rate limiter: api/webhook_rate_limiter.go
Storage: Redis sorted sets for sliding window algorithm
Tests: api/webhook_rate_limiter_test.go

Rationale:

Multiple limits provide granular control over webhook usage
Database-backed quotas proven effective in implementation
Configurable limits support different subscription tiers
Event publication limit prevents webhook spam

Tier 5: Addon Invocations

Applies to: Add-on invocation endpoints for executing custom code against threat models.

Endpoints:

/addons/{addon_id}/invoke (POST)
/addons/invocations/{invocation_id} (GET)
/addons/invocations/{invocation_id} (DELETE)

Rate Limit Configuration:

scope: user
tier: addon-invocations
limits:
  - type: max_active_invocations
    default: 3
    configurable: true
    quota_source: addon_invocation_quotas.max_active_invocations
  - type: invocations_per_hour
    default: 10
    configurable: true
    quota_source: addon_invocation_quotas.max_invocations_per_hour
tracking_method: Sliding window with Redis sorted sets

Enforcement Details:

Active Invocation Limit (3 concurrent)
- Prevents users from running too many addons simultaneously
- Checked before creating new invocation
- Releases when invocation completes or times out
Hourly Rate Limit (10/hour)
- Sliding window using Redis sorted sets
- Prevents addon abuse and resource exhaustion
- Old entries automatically cleaned up
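The two checks above can be illustrated with an in-memory sketch for a single user. The class `AddonLimiter` and its methods are hypothetical and only mirror the described behavior (the real limiter is Redis-backed); the check order, active limit first and then hourly limit, is an assumption.

```python
import time


class AddonLimiter:
    """Per-user sketch: a concurrent-invocation cap plus a sliding one-hour window."""

    def __init__(self, max_active=3, max_per_hour=10):
        self.max_active = max_active
        self.max_per_hour = max_per_hour
        self.active = set()   # IDs of invocations still running
        self.started = []     # start timestamps within the last hour

    def try_start(self, invocation_id, now=None):
        """Return (allowed, violated_limit_name)."""
        now = time.time() if now is None else now
        # Drop entries that have left the one-hour window.
        self.started = [t for t in self.started if t > now - 3600]
        if len(self.active) >= self.max_active:
            return False, "max_active_invocations"
        if len(self.started) >= self.max_per_hour:
            return False, "invocations_per_hour"
        self.active.add(invocation_id)
        self.started.append(now)
        return True, None

    def finish(self, invocation_id):
        """Release the active slot when an invocation completes or times out."""
        self.active.discard(invocation_id)
```

Finishing an invocation frees an active slot right away, but its start time still counts toward the hourly limit until it ages out of the window.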

Database Schema:

Table: addon_invocation_quotas

CREATE TABLE IF NOT EXISTS addon_invocation_quotas (
    owner_internal_uuid UUID PRIMARY KEY,
    max_active_invocations INT NOT NULL DEFAULT 1,
    max_invocations_per_hour INT NOT NULL DEFAULT 10,
    created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
    modified_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (owner_internal_uuid) REFERENCES users(internal_uuid) ON DELETE CASCADE
);

Implementation Status:

Database table created
GlobalAddonInvocationQuotaStore initialized
GlobalAddonRateLimiter created and integrated
Rate limiting enforced in addon invocation handlers
Admin API endpoints for managing quotas implemented

Rationale:

Addons execute custom code and consume significant resources
Capping concurrent invocations (3 by default) prevents resource exhaustion
Hourly limit prevents abuse while allowing reasonable automation
Database-backed quotas allow per-user customization for power users

Multi-Scope Rate Limiting

Overview

Multi-scope rate limiting applies multiple independent rate limits to a single request, enforcing the most restrictive limit. This approach balances security with usability.

How It Works

For each request to an auth flow endpoint:

Extract identifiers from request:
- Session ID (OAuth state or SAML request ID)
- Source IP address
- User identifier (login_hint, email, or username)
Check all scopes against their respective limits:
- Session: 5 requests/minute
- IP: 100 requests/minute
- User: 10 attempts/hour
Enforce most restrictive limit:
- If ANY scope exceeds its limit -> Return 429
- If ALL scopes are under limit -> Allow request
Record request in all applicable scopes
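The four steps above can be sketched in memory as follows. This is an illustration under stated assumptions, not the TMI middleware: `MultiScopeLimiter` is a hypothetical name, timestamps live in a Python dict rather than in Redis, and only allowed requests are recorded (the source does not say whether rejected requests count).

```python
import time
from collections import defaultdict

# scope -> (limit, window in seconds, window name used in the key)
SCOPES = {
    "session": (5, 60, "minute"),
    "ip": (100, 60, "minute"),
    "user": (10, 3600, "hour"),
}


class MultiScopeLimiter:
    def __init__(self):
        self._hits = defaultdict(list)  # key -> request timestamps

    def _count(self, key, window, now):
        # Keep only timestamps that are still inside the window.
        hits = [t for t in self._hits[key] if t > now - window]
        self._hits[key] = hits
        return len(hits)

    def check(self, identifiers, now=None):
        """identifiers maps scope -> value (or None). Returns (allowed, blocking_scope)."""
        now = time.time() if now is None else now
        keys = []
        for scope, (limit, window, name) in SCOPES.items():
            value = identifiers.get(scope)
            if value is None:
                continue  # a scope is only tracked when its identifier is present
            key = f"ratelimit:{scope}:{value}:{name}"
            if self._count(key, window, now) >= limit:
                return False, scope  # any exhausted scope rejects the request (429)
            keys.append(key)
        for key in keys:
            self._hits[key].append(now)  # record in all applicable scopes
        return True, None
```

For example, six calls within one minute using the same session value yield five `(True, None)` results followed by `(False, "session")`.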

Tracking Mechanisms

Session Tracking:

OAuth: Extract from state query parameter
SAML: Extract from SAMLRequest or RelayState
Lifespan: Typically 5-15 minutes (OAuth spec)
Storage: Redis sorted set per session ID

IP Tracking:

Source IP from X-Forwarded-For (if trusted proxy) or direct connection
Storage: Redis sorted set per IP address
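One common way to apply the "if trusted proxy" rule is to honor X-Forwarded-For only when the direct peer is a known proxy, and then take the rightmost hop that is not itself a trusted proxy. The sketch below assumes that policy; the function name `client_ip` and the `trusted_proxies` parameter are hypothetical, and TMI's actual selection logic may differ.

```python
import ipaddress


def client_ip(remote_addr, forwarded_for, trusted_proxies):
    """Return the IP address to rate-limit on.

    remote_addr:     address of the direct TCP peer
    forwarded_for:   raw X-Forwarded-For header value, or None
    trusted_proxies: iterable of CIDR strings for proxies we control
    """
    peer = ipaddress.ip_address(remote_addr)
    trusted = [ipaddress.ip_network(net) for net in trusted_proxies]

    def is_trusted(addr):
        return any(addr in net for net in trusted)

    # Ignore the header unless the request arrived through a trusted proxy.
    if not forwarded_for or not is_trusted(peer):
        return str(peer)

    # Walk the chain right to left; the first untrusted hop is the client.
    for hop in reversed([h.strip() for h in forwarded_for.split(",")]):
        try:
            addr = ipaddress.ip_address(hop)
        except ValueError:
            break  # malformed entry: fall back to the direct peer
        if not is_trusted(addr):
            return str(addr)
    return str(peer)
```

Taking the rightmost untrusted hop means a client cannot choose its own rate-limit bucket by prepending fake addresses to the header.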

User Identifier Tracking:

OAuth: login_hint query parameter (optional)
Form login: Username or email field
Only tracked when identifier is provided
Storage: Redis sorted set per normalized identifier (lowercase email)

Redis Key Patterns

# Session scope
ratelimit:session:{state_or_request_id}:minute

# IP scope
ratelimit:ip:{ip_address}:minute

# User identifier scope
ratelimit:user:{normalized_email}:hour
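A small helper can build these keys consistently. This is a sketch: `rate_limit_key` and `WINDOWS` are hypothetical names, and the normalization shown (trim plus lowercase) is an assumption based on the "lowercase email" note above.

```python
# Window name used by each scope, taken from the key patterns above.
WINDOWS = {"session": "minute", "ip": "minute", "user": "hour"}


def rate_limit_key(scope, identifier):
    """Build a key of the form ratelimit:{scope}:{identifier}:{window}."""
    if scope not in WINDOWS:
        raise ValueError(f"unknown scope: {scope}")
    if scope == "user":
        # Normalize so that differently cased emails share one counter.
        identifier = identifier.strip().lower()
    return f"ratelimit:{scope}:{identifier}:{WINDOWS[scope]}"


print(rate_limit_key("user", "Alice@Example.com"))
```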

Graceful Degradation

If Redis is unavailable:

Session and user limits: Disabled (logs warning)
IP limit: Falls back to in-memory tracking (loses distributed state)
Service continues: Rate limiting disabled to maintain availability

Configurable Quotas

Overview

Tiers 3 (Resource Operations), 4 (Webhooks), and 5 (Addon Invocations) support per-user configurable quotas stored in PostgreSQL. This allows operators to:

Increase limits for VIP users or integrations
Implement tiered subscription plans
Grant higher quotas to CI/CD systems
Throttle specific users if needed

Default Values

Quota Type	Field	Default Value
User API	`max_requests_per_minute`	1000
User API	`max_requests_per_hour`	60000 (optional)
Webhook	`max_subscriptions`	10
Webhook	`max_events_per_minute`	12
Webhook	`max_subscription_requests_per_minute`	10
Webhook	`max_subscription_requests_per_day`	20
Addon Invocation	`max_active_invocations`	3
Addon Invocation	`max_invocations_per_hour`	10

Admin API Endpoints

TMI provides comprehensive quota management for administrators to control resource limits per user. All quota endpoints require administrator privileges.

User API Quota Endpoints

GET    /admin/quotas/users              # List all custom user API quotas
GET    /admin/quotas/users/{user_id}    # Get user's API quota
PUT    /admin/quotas/users/{user_id}    # Create/update user's API quota
DELETE /admin/quotas/users/{user_id}    # Delete quota (revert to defaults)

Example: Set higher quota for power user

curl -X PUT \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"max_requests_per_minute": 5000, "max_requests_per_hour": 300000}' \
  "https://api.example.com/admin/quotas/users/550e8400-e29b-41d4-a716-446655440000"

Webhook Quota Endpoints

GET    /admin/quotas/webhooks              # List all custom webhook quotas
GET    /admin/quotas/webhooks/{user_id}    # Get user's webhook quota
PUT    /admin/quotas/webhooks/{user_id}    # Create/update webhook quota
DELETE /admin/quotas/webhooks/{user_id}    # Delete quota (revert to defaults)

Example: Set webhook quota

curl -X PUT \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "max_subscriptions": 20,
    "max_events_per_minute": 24,
    "max_subscription_requests_per_minute": 20,
    "max_subscription_requests_per_day": 40
  }' \
  "https://api.example.com/admin/quotas/webhooks/550e8400-e29b-41d4-a716-446655440000"

Addon Invocation Quota Endpoints

GET    /admin/quotas/addons              # List all custom addon invocation quotas
GET    /admin/quotas/addons/{user_id}    # Get user's addon invocation quota
PUT    /admin/quotas/addons/{user_id}    # Create/update addon invocation quota
DELETE /admin/quotas/addons/{user_id}    # Delete quota (revert to defaults)

Example: Set addon invocation quota

curl -X PUT \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"max_active_invocations": 5, "max_invocations_per_hour": 100}' \
  "https://api.example.com/admin/quotas/addons/550e8400-e29b-41d4-a716-446655440000"

Common Response Codes

Code	Description
`200 OK`	Request successful, returns quota data
`201 Created`	Quota created successfully
`204 No Content`	Quota deleted successfully
`400 Bad Request`	Invalid request body or user ID
`401 Unauthorized`	Missing or invalid authentication
`403 Forbidden`	User is not an administrator
`404 Not Found`	Quota or user not found

Best Practices

List Before Modifying: Use list endpoints to discover which users have custom quotas
Pagination: Always use limit and offset parameters for large result sets
Default Values: Only set custom quotas when needed; defaults work for most users
Documentation: Document why specific users have custom quotas

Quota Caching

Overview

To avoid database queries on every API request, TMI implements an in-memory quota cache with automatic expiration and invalidation.

Cache Implementation

Global Instance: GlobalQuotaCache (initialized in main.go)

Configuration:

TTL: 60 seconds (configurable)
Storage: In-memory maps with read-write mutex
Cleanup: Automatic background goroutine removes expired entries

Cached Data:

User API quotas (map[string]*cachedUserAPIQuota)
Webhook quotas (map[string]*cachedWebhookQuota)

Cache Behavior

On Cache Miss:

Fetch quota from database via store interface
Store in cache with expiration timestamp (now + TTL)
Return quota to caller

On Cache Hit:

Check whether the entry is still valid (time.Now().Before(expiresAt))
If not expired: Return cached quota
If expired: Fetch from database and update cache
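The miss and hit paths above amount to a read-through cache with a per-entry expiry. The following sketch shows that behavior; it is not the Go code in api/quota_cache.go. `QuotaCache`, `fetch`, and the injectable `clock` are hypothetical, and the locking and background cleanup of the real cache are omitted.

```python
import time


class QuotaCache:
    """Read-through cache: serve from memory until an entry's TTL has passed."""

    def __init__(self, fetch, ttl=60.0, clock=time.monotonic):
        self._fetch = fetch      # callable: user_id -> quota (stands in for the DB store)
        self._ttl = ttl
        self._clock = clock
        self._entries = {}       # user_id -> (quota, expires_at)

    def get(self, user_id):
        now = self._clock()
        entry = self._entries.get(user_id)
        if entry is not None and now < entry[1]:
            return entry[0]      # cache hit, still valid
        # Cache miss or expired entry: reload and reset the expiry.
        quota = self._fetch(user_id)
        self._entries[user_id] = (quota, now + self._ttl)
        return quota

    def invalidate(self, user_id):
        """Drop one user's entry so the next get() reloads it."""
        self._entries.pop(user_id, None)
```

With a 60 second TTL, repeated `get` calls for the same user reach the database at most once per minute unless `invalidate` is called in between.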

Cache Invalidation

Per-User Invalidation (Primary):

GlobalQuotaCache.InvalidateUserAPIQuota(userID)  // Removes specific user's API quota
GlobalQuotaCache.InvalidateWebhookQuota(userID)  // Removes specific user's webhook quota

Automatic Invalidation:

Called automatically when admin updates user quota via PUT endpoint
Called automatically when admin deletes user quota via DELETE endpoint
Ensures quota changes take effect on the next cache lookup instead of waiting for the TTL to expire

Global Invalidation (Available but not exposed):

GlobalQuotaCache.InvalidateAll()  // Clears all cached quotas

Performance Impact

Without Caching:

Database query on every API request
~5-10ms latency per request
Increased database load

With Caching:

Database query only on cache miss (every 60 seconds per user)
~0.1ms latency for cache hits
99%+ reduction in database queries

Trade-off:

Quota changes take up to 60 seconds to propagate (or immediate with invalidation)
Small memory overhead (negligible for typical user counts)

Implementation Details

Location: api/quota_cache.go

Key Features:

Thread-safe with sync.RWMutex
Automatic cleanup goroutine prevents memory leaks
Graceful shutdown via Stop() method
Falls back to database on cache failure

Rate Limit Headers

When a rate limit is exceeded, the API returns HTTP 429 with the following headers:

Response Headers

Header	Type	Description	Example
`X-RateLimit-Limit`	Integer	Maximum requests allowed in window	`100`
`X-RateLimit-Remaining`	Integer	Requests remaining in current window	`0`
`X-RateLimit-Reset`	Integer	Unix timestamp when window resets	`1640000000`
`Retry-After`	Integer	Seconds to wait before retrying	`60`

Example 429 Response

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1732233600
Retry-After: 45

{
  "code": "rate_limit_exceeded",
  "message": "Rate limit exceeded: 100 requests per minute. Retry after 45 seconds.",
  "details": {
    "limit": 100,
    "window": "minute",
    "retry_after": 45
  }
}

Multi-Scope Headers

For auth flow endpoints with multi-scope limits, headers reflect the most restrictive scope:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 5
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1732233660
Retry-After: 30
X-RateLimit-Scope: session

{
  "code": "rate_limit_exceeded",
  "message": "Rate limit exceeded: 5 requests per minute per session. Retry after 30 seconds.",
  "details": {
    "limit": 5,
    "scope": "session",
    "window": "minute",
    "retry_after": 30
  }
}
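A client can collect these headers into one structure before deciding how to react. The sketch below assumes only the header names documented above; `RateLimitInfo` and `parse_rate_limit_headers` are hypothetical helper names, and missing or non-integer values are returned as `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: Optional[int]
    remaining: Optional[int]
    reset: Optional[int]        # Unix timestamp
    retry_after: Optional[int]  # seconds
    scope: Optional[str]        # only sent for multi-scope limits


def _to_int(value):
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {k.lower(): v for k, v in headers.items()}
    return RateLimitInfo(
        limit=_to_int(lowered.get("x-ratelimit-limit")),
        remaining=_to_int(lowered.get("x-ratelimit-remaining")),
        reset=_to_int(lowered.get("x-ratelimit-reset")),
        retry_after=_to_int(lowered.get("retry-after")),
        scope=lowered.get("x-ratelimit-scope"),
    )
```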

Client Integration

Best Practices

Always check rate limit headers in responses (even 200 OK)
Implement exponential backoff when receiving 429
Respect Retry-After header before retrying
Pre-emptively throttle when X-RateLimit-Remaining is low

Sample Client Code

Python

import requests
import time

def make_request_with_retry(url, headers, max_retries=3):
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)

        if response.status_code == 429:
            retry_after = int(response.headers.get('Retry-After', 60))
            print(f"Rate limited. Waiting {retry_after} seconds...")
            time.sleep(retry_after)
            continue

        # Check remaining quota
        remaining = int(response.headers.get('X-RateLimit-Remaining', 100))
        if remaining < 10:
            print(f"Warning: Only {remaining} requests remaining")

        return response

    raise Exception("Max retries exceeded")

Go

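import (
    "fmt"
    "log"
    "net/http"
    "strconv"
    "time"
)

// makeRequestWithRetry issues a GET request and retries on HTTP 429,
// waiting for the number of seconds given in Retry-After.
// The caller is responsible for closing the returned response body.
func makeRequestWithRetry(url string, token string, maxRetries int) (*http.Response, error) {
    client := &http.Client{}

    for attempt := 0; attempt < maxRetries; attempt++ {
        req, err := http.NewRequest("GET", url, nil)
        if err != nil {
            return nil, err
        }
        req.Header.Set("Authorization", "Bearer "+token)

        resp, err := client.Do(req)
        if err != nil {
            return nil, err
        }

        if resp.StatusCode == http.StatusTooManyRequests {
            // Close the body before retrying so the connection can be reused.
            resp.Body.Close()
            retryAfter, err := strconv.Atoi(resp.Header.Get("Retry-After"))
            if err != nil || retryAfter <= 0 {
                retryAfter = 60 // header missing or not a positive integer
            }
            log.Printf("Rate limited. Waiting %d seconds...", retryAfter)
            time.Sleep(time.Duration(retryAfter) * time.Second)
            continue
        }

        // Check remaining quota; skip the warning when the header is absent.
        remaining, err := strconv.Atoi(resp.Header.Get("X-RateLimit-Remaining"))
        if err == nil && remaining < 10 {
            log.Printf("Warning: Only %d requests remaining", remaining)
        }

        return resp, nil
    }

    return nil, fmt.Errorf("max retries exceeded")
}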

JavaScript/TypeScript

async function makeRequestWithRetry(
    url: string,
    token: string,
    maxRetries: number = 3
): Promise<Response> {
    for (let attempt = 0; attempt < maxRetries; attempt++) {
        const response = await fetch(url, {
            headers: { 'Authorization': `Bearer ${token}` }
        });

        if (response.status === 429) {
            const retryAfter = parseInt(response.headers.get('Retry-After') || '60');
            console.log(`Rate limited. Waiting ${retryAfter} seconds...`);
            await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
            continue;
        }

        // Check remaining quota
        const remaining = parseInt(response.headers.get('X-RateLimit-Remaining') || '100');
        if (remaining < 10) {
            console.warn(`Warning: Only ${remaining} requests remaining`);
        }

        return response;
    }

    throw new Error('Max retries exceeded');
}

Database Schema

Existing Tables

webhook_quotas

See docs/reference/legacy-migrations/002_business_domain.up.sql for complete schema.

Purpose: Store per-user webhook rate limits and subscription quotas.

Key Fields:

owner_id - User UUID (primary key)
max_subscriptions - Maximum active subscriptions (default: 10)
max_events_per_minute - Event publication rate (default: 12)
max_subscription_requests_per_minute - API request rate (default: 10)
max_subscription_requests_per_day - Daily API quota (default: 20)

user_api_quotas

Purpose: Store per-user API rate limits for resource operations.

Status: Fully implemented (table exists in database)

Schema:

CREATE TABLE IF NOT EXISTS user_api_quotas (
    user_internal_uuid UUID PRIMARY KEY,
    max_requests_per_minute INT NOT NULL DEFAULT 100,
    max_requests_per_hour INT DEFAULT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
    modified_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (user_internal_uuid) REFERENCES users(internal_uuid) ON DELETE CASCADE
);

addon_invocation_quotas

Purpose: Store per-user addon invocation rate limits.

Status: Fully implemented (table exists in database)

Schema:

CREATE TABLE IF NOT EXISTS addon_invocation_quotas (
    owner_internal_uuid UUID PRIMARY KEY,
    max_active_invocations INT NOT NULL DEFAULT 1,
    max_invocations_per_hour INT NOT NULL DEFAULT 10,
    created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
    modified_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (owner_internal_uuid) REFERENCES users(internal_uuid) ON DELETE CASCADE
);

Implementation Notes

Current Status

Fully Implemented:

OpenAPI specification with x-rate-limit extensions
429 response component with proper headers
API rate limiter (Redis-based sliding window)
Webhook rate limiter (Redis-based sliding window)
Addon invocation rate limiter (active + hourly limits)
user_api_quotas database table and store
webhook_quotas database table and store
addon_invocation_quotas database table and store
Quota caching system with 60s TTL
Cache invalidation on quota updates
Rate limiting middleware (RateLimitMiddleware)
Middleware registered in router (main.go)
Admin API for user API quota management
Admin API for webhook quota management
Admin API for addon invocation quota management
Rate limit headers (X-RateLimit-*)
Comprehensive test coverage for rate limiting

Partially Implemented:

Multi-scope rate limiter for auth flows (code exists, basic integration)
IP-based rate limiting for public endpoints (code exists)

Middleware Integration

Rate Limit Middleware (api/rate_limit_middleware.go):

Registered globally in router: r.Use(api.RateLimitMiddleware(apiServer))
Applied to all authenticated endpoints
Skips public discovery endpoints (/, /.well-known/*)
Skips auth flow endpoints (OAuth, SAML)
Extracts user ID from JWT context
Checks per-minute and per-hour limits
Returns HTTP 429 with a Retry-After header when a limit is exceeded
Adds rate limit headers to all responses
Fails open on errors (allows request, logs warning)

IP Rate Limit Middleware (api/ip_and_auth_rate_limit_middleware.go):

Registered: r.Use(api.IPRateLimitMiddleware(apiServer))
Protects public endpoints from IP-based abuse
10 requests/minute per IP address
Uses Redis sorted sets for distributed tracking

Auth Flow Rate Limit Middleware (api/ip_and_auth_rate_limit_middleware.go):

Registered: r.Use(api.AuthFlowRateLimitMiddleware(apiServer))
Applies to OAuth and SAML endpoints
Multi-scope tracking (session, IP, user identifier)
Prevents credential stuffing and auth flow abuse

Technology Stack

Rate Limiting:

Algorithm: Sliding window (used instead of a token bucket)
Storage: Redis sorted sets (ZSET)
Key Pattern: ratelimit:{scope}:{identifier}:{window}
TTL: Window duration + 60 seconds buffer

Database:

Storage: PostgreSQL
Tables: webhook_quotas, user_api_quotas, addon_invocation_quotas
Access: Via store interface pattern

Graceful Degradation:

Redis unavailable -> Rate limiting disabled, logs warning
Database unavailable -> Falls back to default quotas
Maintains service availability over strict enforcement
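The fail-open policy above can be sketched as two small wrappers. The names `effective_limit`, `allow_request`, `load_quota`, and `check` are hypothetical stand-ins for the quota store and the Redis-backed limiter; the default of 1000 requests per minute is the Tier 3 default from this document.

```python
import logging

DEFAULT_REQUESTS_PER_MINUTE = 1000  # Tier 3 default

log = logging.getLogger("ratelimit")


def effective_limit(load_quota, user_id):
    """Return the user's custom limit, or the default if none exists or the lookup fails."""
    try:
        quota = load_quota(user_id)
    except Exception:
        log.warning("quota lookup failed for %s; using default", user_id)
        return DEFAULT_REQUESTS_PER_MINUTE
    return DEFAULT_REQUESTS_PER_MINUTE if quota is None else quota


def allow_request(check, user_id, limit):
    """Fail open: if the limiter backend is unavailable, allow the request."""
    try:
        return check(user_id, limit)
    except Exception:
        log.warning("rate limiter unavailable; allowing request for %s", user_id)
        return True
```

Failing open trades strict enforcement for availability, so the logged warnings are the only signal that limits are not being applied.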

Performance Considerations

Redis Operations:

Rate limit checks: 2-3 Redis commands (ZREMRANGEBYSCORE, ZCOUNT, ZADD)
Pipelined for atomicity and performance
Expected latency: <5ms per check

Database Queries:

Quota lookups cached in-memory (TTL: 60 seconds)
No database query on every request
Quota changes take effect within 60 seconds

Sliding Window Cleanup:

Automatic via ZREMRANGEBYSCORE before each check
TTL ensures old keys are eventually cleaned up
No separate cleanup job needed
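The sliding window check can be illustrated without Redis by keeping a sorted list of timestamps, where each step mirrors one of the sorted-set commands named above. This is an in-memory analogue with hypothetical names (`SlidingWindow`, `allow`, `retry_after`), not the TMI limiter, and the `retry_after` calculation is one plausible way to derive the header value.

```python
import bisect
import math
import time


class SlidingWindow:
    """In-memory analogue of one Redis sorted set scored by timestamp."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self._scores = []  # sorted request timestamps

    def allow(self, now=None):
        now = time.time() if now is None else now
        cutoff = now - self.window
        # Like ZREMRANGEBYSCORE: drop entries with score <= cutoff.
        del self._scores[:bisect.bisect_right(self._scores, cutoff)]
        # Like ZCOUNT over the window: how many requests remain?
        if len(self._scores) >= self.limit:
            return False
        # Like ZADD: record this request.
        bisect.insort(self._scores, now)
        return True

    def retry_after(self, now):
        """Whole seconds until the oldest entry leaves the window (0 if empty)."""
        if not self._scores:
            return 0
        return max(0, math.ceil(self._scores[0] + self.window - now))
```

Because expired entries are trimmed on every check, the list never holds more than `limit` timestamps.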

Security Considerations

Distributed Attacks:

Multi-scope limiting prevents single-vector attacks
User identifier tracking stops credential stuffing
IP limiting prevents single-IP DoS

Quota Bypass:

JWT validation ensures user identity
Redis atomic operations prevent race conditions
Database foreign key constraints prevent orphaned quotas

Information Disclosure:

Rate limit headers reveal system limits (acceptable for public API)
Error messages don't expose internal implementation details
Quota configuration not exposed via user-facing APIs

References

Standards and RFCs

RFC 6749 - OAuth 2.0 Authorization Framework
RFC 6585 - HTTP Status Code 429 (Too Many Requests)
IETF Draft: RateLimit Headers - Standard rate limit headers

Tools and Libraries

Redis - In-memory data store for rate limiting
go-redis - Go Redis client
oapi-codegen - OpenAPI code generation

Changelog

2.0.0 (2025-01-24)

Added Tier 5: Addon Invocations with active and hourly limits
Updated default values to match current implementation
Added comprehensive admin API documentation for all quota types
Added quota caching section
Marked implementation status as Fully Implemented
Migrated to wiki format

1.0.0 (2025-11-21)

Initial specification
Four-tier rate limiting strategy
Multi-scope auth flow protection
Database-backed configurable quotas
Comprehensive client integration guide
