feat(rate_limiter): add SlidingWindowRateLimiter for strict per-minute caps by dive2tech · Pull Request #20799 · run-llama/llama_index

dive2tech · 2026-02-25T17:44:21Z

Summary

Adds a new rate limiter implementation, SlidingWindowRateLimiter, as an alternative to the existing TokenBucketRateLimiter.

Motivation

Token-bucket limiters allow bursts: a full bucket can be spent all at once, in addition to whatever refills afterward. Some APIs enforce strict limits over a rolling 60-second window with no burst allowance. This implementation enforces a strict sliding window: only requests (or tokens) within the last 60 seconds count toward the limit.
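
To make the burst concern concrete, here is a small back-of-the-envelope simulation. The function name and the parameters (a 60-token bucket refilled at 1 token per second) are illustrative assumptions, not values taken from this PR or from TokenBucketRateLimiter.

```python
def token_bucket_admitted(capacity: int, refill_per_second: int, seconds: int) -> int:
    """Count requests a greedy client gets through a token bucket.

    Hypothetical model: time advances in 1-second ticks, the bucket starts
    full, and the client spends every available token on each tick.
    """
    tokens = capacity
    admitted = 0
    for second in range(seconds):
        if second > 0:
            # Refill once per tick, never above the bucket capacity.
            tokens = min(capacity, tokens + refill_per_second)
        admitted += tokens  # the client drains the bucket
        tokens = 0
    return admitted


# A "60 per minute" bucket lets 119 requests through in the first 60 seconds:
# 60 from the initial burst plus 1 per second for the next 59 seconds.
print(token_bucket_admitted(capacity=60, refill_per_second=1, seconds=60))  # 119
```

Under the same load, a strict sliding window with a limit of 60 would admit at most 60 requests in any 60-second span, which is the behavior the PR targets.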

Changes

llama_index/core/rate_limiter.py: New SlidingWindowRateLimiter class implementing BaseRateLimiter with:
- Optional requests-per-minute (RPM) and tokens-per-minute (TPM) limits (at least one required)
- Thread-safe sync acquire() and async async_acquire()
- Pruning of out-of-window entries and blocking until capacity is available
tests/test_rate_limiter.py: Tests for creation, validation, blocking behavior, pruning, TPM limiting, async/concurrent behavior, and LLM/embedding integration.
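
The prune-then-block behavior listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the class added in this PR: the name SlidingWindowSketch, the acquire(tokens) signature, and the injectable clock and sleep arguments are all hypothetical.

```python
import threading
import time
from collections import deque
from typing import Callable, Deque, Optional, Tuple


class SlidingWindowSketch:
    """Illustrative sliding-window limiter (hypothetical, not the PR's code)."""

    def __init__(
        self,
        requests_per_minute: Optional[int] = None,
        tokens_per_minute: Optional[int] = None,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if requests_per_minute is None and tokens_per_minute is None:
            raise ValueError("Set requests_per_minute, tokens_per_minute, or both.")
        if requests_per_minute is not None and requests_per_minute < 1:
            raise ValueError("requests_per_minute must be at least 1.")
        self.rpm = requests_per_minute
        self.tpm = tokens_per_minute
        self.window = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._events: Deque[Tuple[float, int]] = deque()  # (timestamp, tokens)
        self._tokens_in_window = 0

    def _prune(self, now: float) -> None:
        # Drop entries that are no longer inside the window.
        while self._events and now - self._events[0][0] >= self.window:
            _, tokens = self._events.popleft()
            self._tokens_in_window -= tokens

    def _wait_time(self, now: float, tokens: int) -> float:
        # 0.0 means the call fits now; otherwise wait for the oldest entry to expire.
        self._prune(now)
        over_rpm = self.rpm is not None and len(self._events) + 1 > self.rpm
        over_tpm = self.tpm is not None and self._tokens_in_window + tokens > self.tpm
        if not (over_rpm or over_tpm):
            return 0.0
        return self._events[0][0] + self.window - now

    def acquire(self, tokens: int = 0) -> None:
        if self.tpm is not None and tokens > self.tpm:
            raise ValueError("A single call exceeds tokens_per_minute.")
        while True:
            with self._lock:
                now = self._clock()
                wait = self._wait_time(now, tokens)
                if wait <= 0.0:
                    self._events.append((now, tokens))
                    self._tokens_in_window += tokens
                    return
            # Sleep outside the lock so other threads are not blocked meanwhile.
            self._sleep(wait)
```

The lock is released before sleeping and the check is repeated afterward, because another thread may have taken the freed slot in the meantime.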

Usage

from llama_index.core.rate_limiter import SlidingWindowRateLimiter

limiter = SlidingWindowRateLimiter(requests_per_minute=60)
llm = SomeLLM(rate_limiter=limiter)

All existing rate limiter tests pass; 13 new tests added for the sliding-window implementation.

…e caps Add SlidingWindowRateLimiter as an alternative to TokenBucketRateLimiter. It enforces a strict sliding 60-second window for RPM/TPM, with no burst at window boundaries. Includes full sync/async support and tests. Co-authored-by: Cursor <cursoragent@cursor.com>
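
For the async side mentioned in this commit message, one possible shape is sketched below. It is an assumption-laden illustration rather than the PR's code: the class name AsyncSlidingWindowSketch, the max_requests parameter, and the injectable clock and sleep arguments are hypothetical, and only request counts (not tokens) are tracked to keep it short.

```python
import asyncio
import time
from collections import deque
from typing import Awaitable, Callable, Deque


class AsyncSlidingWindowSketch:
    """Illustrative async sliding-window limiter (hypothetical names throughout)."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
    ) -> None:
        if max_requests < 1:
            raise ValueError("max_requests must be at least 1.")
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._lock = asyncio.Lock()
        self._stamps: Deque[float] = deque()

    async def async_acquire(self) -> None:
        while True:
            async with self._lock:
                now = self._clock()
                # Prune timestamps that have left the window.
                while self._stamps and now - self._stamps[0] >= self._window:
                    self._stamps.popleft()
                if len(self._stamps) < self._max:
                    self._stamps.append(now)
                    return
                # Otherwise wait until the oldest timestamp expires.
                wait = self._stamps[0] + self._window - now
            # Sleep after releasing the lock so other tasks can make progress.
            await self._sleep(wait)


async def main() -> None:
    # Constructed inside the running loop, which is the safest place for asyncio.Lock.
    limiter = AsyncSlidingWindowSketch(max_requests=2, window_seconds=0.2)
    await asyncio.gather(*(limiter.async_acquire() for _ in range(3)))


if __name__ == "__main__":
    asyncio.run(main())
```

With a limit of 2, the third coroutine in main() has to wait for the oldest timestamp to age out of the window before it is admitted.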

Extend SlidingWindowRateLimiter with optional request_burst and token_burst parameters so callers can configure limited burst headroom while keeping a sliding 60s window model. Includes tests covering request and token bursts. Made-with: Cursor

dive2tech · 2026-02-25T20:27:59Z

Hi @rootInfluence, thanks for your feedback.
I have considered your suggestions and reworked the code.

llama-index-core/llama_index/core/rate_limiter.py

llama-index-core/tests/test_rate_limiter.py

…window Replace the SlidingWindowRateLimiter __init__ override with a model_validator that enforces the RPM/TPM requirement and remove a redundant instance-type test. This keeps the Pydantic API ergonomic while maintaining invariants. Made-with: Cursor
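
As a rough illustration of the validator approach described in this commit message, the sketch below uses Pydantic v2's model_validator to require at least one limit. The model name LimiterConfigSketch and its field definitions are assumptions made for the example; the actual fields on SlidingWindowRateLimiter may differ.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError, model_validator


class LimiterConfigSketch(BaseModel):
    """Hypothetical stand-in for the limiter's Pydantic fields."""

    requests_per_minute: Optional[float] = Field(default=None, gt=0)
    tokens_per_minute: Optional[float] = Field(default=None, gt=0)

    @model_validator(mode="after")
    def _require_a_limit(self) -> "LimiterConfigSketch":
        # Runs after field validation, so both attributes are already populated.
        if self.requests_per_minute is None and self.tokens_per_minute is None:
            raise ValueError("Provide requests_per_minute, tokens_per_minute, or both.")
        return self


if __name__ == "__main__":
    print(LimiterConfigSketch(requests_per_minute=60))
    try:
        LimiterConfigSketch()
    except ValidationError as exc:
        print(f"rejected: {exc.error_count()} error(s)")
```

Compared with overriding __init__, a validator keeps the generated constructor signature and still rejects a model with no limit set.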

dive2tech · 2026-02-26T12:45:18Z

Hi @AstraBert,
Thank you for your feedback.
I’ve updated the code as per your suggestions. Could you kindly review it?
I appreciate your time and assistance.

AstraBert · 2026-02-27T12:38:37Z

Linting is failing. Please make sure it passes by running:

uv pip install pre-commit
pre-commit install
pre-commit run -a

Run these from the root folder of the llama_index repo.

Apply ruff/black formatting to rate_limiter module so it passes the standard pre-commit hooks used in this repository. Made-with: Cursor

dive2tech · 2026-02-27T13:38:12Z

Fixed the lint error; ready to merge.

Add a small compatibility shim for NoneType so llama_index.core.base.llms.types imports cleanly on Python 3.9 where types.NoneType is not available. Made-with: Cursor

Made-with: Cursor

dive2tech · 2026-03-02T11:33:32Z

I've fixed the CI bot test error. Please try running it again.
Thank you for your time.

AstraBert · 2026-03-02T11:33:58Z

llama-index-core/llama_index/core/base/llms/types.py

+# NOTE:
+# Python 3.9 does not expose `types.NoneType`, so we define a local alias that
+# works across all supported versions instead of importing it from `types`.
+NoneType = type(None)


There is no need to stabilize integrations across Python versions: integration tests will most likely fail whenever core is modified, mostly because of flakiness in their own tests. For 3.9 in particular, we are preparing to drop support, so there is no need to adapt to it.

Please revert the last commit

AstraBert · 2026-03-02T11:35:18Z

I will merge this one once the last commit is reverted and CI is done. Passing linting, type-checking, and the core tests on 3.12 and 3.14 is all I need to merge; it's not a problem if other CI checks fail.

This reverts commit ba84bd5.

dive2tech · 2026-03-02T12:10:03Z

Hi @AstraBert, thanks for your feedback.
I just reverted the last commit.
Unit Testing / test 3.12 and core-py314 have passed, so is it ready to merge?

dosubot bot added the size:L label (This PR changes 100-499 lines, ignoring generated files.) Feb 25, 2026

AstraBert reviewed Feb 26, 2026

llama-index-core/llama_index/core/rate_limiter.py (outdated)

llama-index-core/tests/test_rate_limiter.py (outdated)

AstraBert approved these changes Feb 27, 2026

dosubot bot added the lgtm label (This PR has been approved by a maintainer) Feb 27, 2026

style(rate_limiter): apply pre-commit formatting

ced4679

Apply ruff/black formatting to rate_limiter module so it passes the standard pre-commit hooks used in this repository. Made-with: Cursor

dive2tech requested a review from AstraBert February 27, 2026 13:38

AstraBert approved these changes Mar 2, 2026

dive2tech added 3 commits March 2, 2026 12:41

fix(py39): provide NoneType shim for llm types

f29455e

Add a small compatibility shim for NoneType so llama_index.core.base.llms.types imports cleanly on Python 3.9 where types.NoneType is not available. Made-with: Cursor

Merge branch 'main' into feat/sliding-window-rate-limiter

cf8a78f

test: stabilize integrations across python versions

ba84bd5

Made-with: Cursor

AstraBert reviewed Mar 2, 2026

Revert "test: stabilize integrations across python versions"

692f90d

This reverts commit ba84bd5.

dive2tech wants to merge 8 commits into run-llama:main from dive2tech:feat/sliding-window-rate-limiter

2 participants
