Replies: 2 comments 1 reply
-
🤖 Hey there, @aiwalter! Good to see you diving into LlamaIndex again. Always a pleasure to see familiar faces around here. Hope you're staying energized for coding!

Based on the similar issues I found in the LlamaIndex repository, there are a couple of ways you can handle the rate limit error.

One way is to manually embed your nodes one at a time and add a sleep time between each embedding. Here's an example of how you can do this:

```python
import time

from llama_index import VectorStoreIndex
from llama_index.embeddings import OpenAIEmbedding
from llama_index.node_parser import SimpleNodeParser

node_parser = SimpleNodeParser.from_defaults()
embed_model = OpenAIEmbedding()
nodes = node_parser.get_nodes_from_documents(documents)

# Embed one node at a time, sleeping between calls to stay under the rate limit.
for node in nodes:
    node.embedding = embed_model.get_text_embedding(node.text)
    time.sleep(21)

index = VectorStoreIndex(nodes, ....)
```

This solution was suggested in the issue [Bug]: Warning raising "llama_index.llms.openai_utils:Retrying llama_index.embeddings.openai.get_embeddings".

Another way is to update the retry policy used to get OpenAI embeddings. This solution was implemented in the pull request Update retry policy to get OpenAI Embeddings for trial accounts, which closed the issue How to solve limit rate during […].

However, please note that rate limit issues might also be due to OpenAI API availability issues, as mentioned in the issue Rate limit issue. In this case, there's not much that can be done on your side. You might want to try using a different model if the issue persists.

I hope this helps! If you have any other questions, feel free to ask.
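If you go the retry-policy route, here is a minimal sketch, assuming your llama_index version exposes `max_retries` and `embed_batch_size` as constructor arguments on `OpenAIEmbedding` (recent versions do; older ones handled retries internally):

```python
from llama_index.embeddings import OpenAIEmbedding

# Assumption: max_retries and embed_batch_size are supported by your
# llama_index version. More retries absorb transient 429s; a smaller
# batch size sends fewer tokens per request.
embed_model = OpenAIEmbedding(
    max_retries=10,
    embed_batch_size=10,
)
```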
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
-
I built a decorator that can help here:

```python
import functools
import logging
import time

LOGGER = logging.getLogger(__name__)


def retry(max_attempts: int = 4, initial_delay: int = 5, exception: type = Exception):
    """A decorator that retries a function call with an exponential backoff
    if it raises a specified exception.

    Parameters
    ----------
    max_attempts: Maximum number of attempts for retrying the function.
    initial_delay: Initial delay between retries, in seconds. This delay is
        doubled after each failure.
    exception: The exception type that triggers a retry.
    """
    def decorator_retry(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            attempts = 0
            delay = initial_delay
            while attempts < max_attempts:
                try:
                    return func(*args, **kwargs)
                except exception:
                    attempts += 1
                    LOGGER.warning(f"Retry required, attempt {attempts}: {func.__name__}")
                    if attempts >= max_attempts:
                        raise
                    time.sleep(delay)
                    delay *= 2  # Exponential backoff
        return wrapper
    return decorator_retry
```

Then:

```python
import openai

# Node and BaseEmbedding come from llama_index; the exact import path
# depends on your version.

@retry(exception=openai.RateLimitError)
def _embed_node(node: Node, embed_model: BaseEmbedding) -> Node:
    """Retry helper function."""
    node.embedding = embed_model.get_text_embedding(node.text)
    return node
```
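A possible way to wire this into the indexing flow (assuming `nodes`, `embed_model`, and `VectorStoreIndex` from the snippet above):

```python
# Hypothetical usage: embed each node, retrying with backoff on rate limits.
nodes = [_embed_node(node, embed_model) for node in nodes]
index = VectorStoreIndex(nodes)
```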
-
Is there a way to build a queue in case a `RateLimitError` happens, so that it waits e.g. 60 seconds before going on with the next batch? I keep getting embedding rate limit issues and would like to catch them by slowing down the number of tokens that are sent per minute, i.e. catching the exception and adding some sleep time.
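For what it's worth, here is a minimal sketch of that queue idea, assuming `nodes` and `embed_model` as in the snippets above; `get_text_embedding_batch` is part of the llama_index embedding interface, while the batch size and the 60-second wait are arbitrary choices here:

```python
import time

import openai

BATCH_SIZE = 50    # arbitrary; tune to your tokens-per-minute limit
WAIT_SECONDS = 60  # how long to back off after a RateLimitError

for start in range(0, len(nodes), BATCH_SIZE):
    batch = nodes[start : start + BATCH_SIZE]
    texts = [node.text for node in batch]
    while True:
        try:
            embeddings = embed_model.get_text_embedding_batch(texts)
            break
        except openai.RateLimitError:
            # Wait a minute, then retry the same batch.
            time.sleep(WAIT_SECONDS)
    for node, embedding in zip(batch, embeddings):
        node.embedding = embedding
```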