
docs: clarify OpenAIClientAsync rate-limit note#729

Open
Rohan5commit wants to merge 1 commit into openai:main from Rohan5commit:docs/async-client-rate-limit-clarification-20260426

Conversation

@Rohan5commit
Contributor

Summary

Clarify the OpenAIClientAsync documentation so it describes per-client retry behavior without implying proactive global rate limiting.

Related issue

Fixes #644

Guideline alignment

Read CONTRIBUTING.md and kept the change to a single documentation comment in one file.

Validation

  • git diff --check

Copilot AI review requested due to automatic review settings April 26, 2026 03:09
@Rohan5commit Rohan5commit requested a review from a team as a code owner April 26, 2026 03:09

Copilot AI left a comment


Pull request overview

Updates the OpenAIClientAsync KDoc to remove the implication that the SDK performs proactive per-client rate limiting, and instead describes how behavior is scoped to the client instance while external rate limits are shared.

Changes:

  • Replaces “rate limiting per client” wording with a description focused on per-client retry behavior and shared external rate limits.


Comment on lines +36 to +39
* Reusing connections and threads reduces latency and saves memory. Each client also manages its
* own retries independently, so reusing one client keeps that retry behavior, connection pool, and
* thread pools in one place while multiple clients may compete separately against the same external
* rate limits.

Copilot AI Apr 26, 2026


The new KDoc still reads a bit like “client-level” retry/rate-limit behavior ("manages its own retries independently"), but retries are implemented per request (reactive on e.g. 429) and there’s no cross-request/cross-client coordination. Consider wording this explicitly (e.g., retries are per request and the SDK does not proactively throttle or coordinate rate limiting across client instances) to avoid repeating the confusion from #644.

Suggested change
* Reusing connections and threads reduces latency and saves memory. Each client also manages its
* own retries independently, so reusing one client keeps that retry behavior, connection pool, and
* thread pools in one place while multiple clients may compete separately against the same external
* rate limits.
* Reusing connections and threads reduces latency and saves memory. Retries are applied per
* request, reactively based on API responses such as rate limits, and are not coordinated across
* requests or client instances. The SDK does not proactively throttle requests or coordinate rate
* limiting across multiple clients, so separate clients may still compete against the same
* external rate limits.
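The reviewer's point (retries happen per request, reacting to responses such as HTTP 429, with no coordination across requests or client instances) can be sketched in plain Java. This is an illustrative sketch only, not the SDK's actual implementation: the `PerRequestRetrySketch` class, the simulated `fakeSend` endpoint, and the `MAX_RETRIES` constant are all assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class PerRequestRetrySketch {
    static final int MAX_RETRIES = 3; // illustrative cap, not the SDK default

    // Simulated endpoint: responds 429 to the first two attempts, then 200.
    static int fakeSend(AtomicInteger attempts) {
        return attempts.incrementAndGet() <= 2 ? 429 : 200;
    }

    // Each call owns its own attempt counter and retry loop. Nothing here is
    // shared with other in-flight requests or other client instances, which
    // is what "retries are applied per request, reactively" means: the loop
    // only reacts to responses it has already received, and never throttles
    // proactively or coordinates globally.
    static int sendWithRetries() {
        AtomicInteger attempts = new AtomicInteger();
        int status = fakeSend(attempts);
        int retries = 0;
        while (status == 429 && retries < MAX_RETRIES) {
            retries++;
            // A real client would back off here (e.g. honoring Retry-After).
            status = fakeSend(attempts);
        }
        return status;
    }

    public static void main(String[] args) {
        System.out.println(sendWithRetries()); // prints 200 after two retried 429s
    }
}
```

Because the retry state lives entirely inside one call, two clients (or two concurrent requests on one client) each run this loop independently and can still collectively exhaust the same server-side rate limit, which is exactly the confusion the reworded KDoc aims to prevent.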



Development

Successfully merging this pull request may close these issues.

Potential misleading documentation on OpenAiClientAsync

3 participants