Replies: 1 comment
-
🤖 Hi, thanks for reaching out. It sounds like you're dealing with memory issues when using sitemap loaders and Pinecone vector storage. Here are some strategies you can consider to optimize memory usage:

**Batch processing.** Instead of adding all texts to the vector store at once, add them in smaller batches so only one batch is embedded and upserted at a time:
```python
# Assume texts is a large list of texts to be added
batch_size = 1000  # Adjust this value based on your memory capacity

for i in range(0, len(texts), batch_size):
    batch_texts = texts[i:i + batch_size]
    vectorstore.add_texts(batch_texts)
```
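If the `texts` list itself is what exhausts memory, you can push the same idea further by streaming documents instead of materializing them all first. Here is a minimal sketch, assuming `loader` is a `SitemapLoader` and `vectorstore` is an initialized Pinecone vector store; `lazy_load()` is the generator-based loading method on LangChain document loaders:

```python
# Sketch: stream documents and flush each batch as it fills, so the
# full corpus never sits in memory at once. Assumes `loader` and
# `vectorstore` are already initialized.
batch = []
batch_size = 100  # tune to your memory budget

for doc in loader.lazy_load():  # yields one Document at a time
    batch.append(doc.page_content)
    if len(batch) >= batch_size:
        vectorstore.add_texts(batch)
        batch.clear()

if batch:  # flush any remaining documents
    vectorstore.add_texts(batch)
```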
**Delete unnecessary vectors.** If some vectors are no longer needed, deleting them keeps the index smaller:

```python
# Assume ids_to_delete is a list of ids of vectors to be deleted
vectorstore.delete(ids=ids_to_delete)
```
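Where do those ids come from? The LangChain vector store interface returns the ids of the inserted vectors from `add_texts`, so one option (a sketch; `inserted_ids` is just an illustrative name) is to record them at insert time:

```python
# Record the ids returned at insert time so stale vectors can be
# deleted later without re-querying the index.
inserted_ids = vectorstore.add_texts(batch_texts)

# ... later, once these documents are outdated:
vectorstore.delete(ids=inserted_ids)
```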
**Load the sitemap in blocks.** `SitemapLoader` accepts `blocksize` and `blocknum` parameters, so you can split the sitemap's URLs into blocks and load only one block at a time. For example, with blocks of 1 URL, this loads just the first block:

```python
loader = SitemapLoader(
    "https://langchain.readthedocs.io/sitemap.xml",
    blocksize=1,
    blocknum=0,
)
documents = loader.load()
```
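To index a whole sitemap this way, you can loop over the blocks and index each one before loading the next, so only one block's documents are in memory during indexing. A minimal sketch, where `total_blocks` is an assumed value you would derive from the sitemap's size and your chosen `blocksize` (in newer LangChain versions the import path is `langchain_community.document_loaders`):

```python
from langchain.document_loaders import SitemapLoader

# Sketch: index the sitemap one block at a time. Assumes `vectorstore`
# is an initialized Pinecone vector store. `total_blocks` is an assumed
# value: roughly ceil(number of sitemap URLs / blocksize).
total_blocks = 10

for block in range(total_blocks):
    loader = SitemapLoader(
        "https://langchain.readthedocs.io/sitemap.xml",
        blocksize=50,    # URLs per block
        blocknum=block,  # which block to load on this iteration
    )
    documents = loader.load()
    vectorstore.add_texts([doc.page_content for doc in documents])
```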
Remember to adjust the `batch_size`, `blocksize`, and `blocknum` values to fit your memory capacity and workload.

I hope this helps! Let me know if you have any other questions.

Best,
Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
-
Hi, I'm running into memory issues using sitemap loaders and Pinecone vector storage.
My server frequently crashes because it hits its memory limit, which is currently 16 GB of RAM. I don't think the issue is retrieving the pages; memory usage stays fairly stable during that phase. But once I start indexing, the server can crash.
Increasing the memory limit to 32 GB was a quick and dirty fix, but is there any other way I can solve this using less memory?
Loader code
Vector store code