Description
OpenAIClientAsync's documentation states:
This client performs best when you create a single instance and reuse it for all interactions
with the REST API. This is because each client holds its own connection pool and thread pools.
Reusing connections and threads reduces latency and saves memory. The client also handles rate
limiting per client. This means that creating and using multiple instances at the same time will
not respect rate limits. (emphasis mine)
However, I find no evidence that rate limits are enforced proactively, as the text suggests. I can only find evidence of reactive retries: when a request encounters a 429 error, the client automatically retries it with exponential backoff. That behavior is specific to each request, not global to the client.
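To make the distinction concrete, here is a minimal Java sketch of the two behaviors. It is not taken from the library: `RateLimitSketch`, `backoffMillis`, and `TokenBucket` are hypothetical names invented for illustration, and the delay values are arbitrary. The first part shows reactive backoff, whose state is just the attempt counter of one request. The second shows what I would expect proactive, client-wide limiting to look like: a single limiter object shared by every request that goes through the client.

```java
// Hypothetical illustration only; none of these names come from openai-java.
final class RateLimitSketch {

    // Reactive, per-request: the delay depends only on how many times
    // THIS request has already been retried after a 429.
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        // Cap the shift so the multiplication cannot overflow for large attempts.
        long delay = baseMillis << Math.min(attempt, 20);
        return Math.min(delay, maxMillis);
    }

    // Proactive, client-wide: one bucket shared by all requests of a client.
    // A request is sent only if a token is available beforehand.
    static final class TokenBucket {
        private final double capacity;
        private final double refillPerSecond;
        private double tokens;
        private long lastNanos;

        TokenBucket(double capacity, double refillPerSecond, long nowNanos) {
            this.capacity = capacity;
            this.refillPerSecond = refillPerSecond;
            this.tokens = capacity; // start full
            this.lastNanos = nowNanos;
        }

        // The caller passes the current time (e.g. System.nanoTime()),
        // which keeps the sketch deterministic.
        synchronized boolean tryAcquire(long nowNanos) {
            long elapsed = nowNanos - lastNanos;
            tokens = Math.min(capacity,
                    tokens + elapsed * refillPerSecond / 1_000_000_000.0);
            lastNanos = nowNanos;
            if (tokens >= 1.0) {
                tokens -= 1.0;
                return true;
            }
            return false;
        }
    }
}
```

With only the first mechanism, two concurrent requests on the same client back off independently and neither knows about the other. Only something like the second mechanism would make "rate limiting per client" a meaningful reason to avoid multiple instances.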
Is my understanding incorrect here, or is this a documentation error? Thanks in advance.