Performance Issue in RAG Pipeline with LangChain, ChromaDB, and GPT-4o #31014
Adarsh-AMT asked this question in Q&A
Description
We are currently working on a Retrieval-Augmented Generation (RAG) pipeline using LangChain, ChromaDB, and GPT-4o. The pipeline is functionally correct, but we are experiencing performance issues, particularly during the similarity search step.
Pipeline Details:
- Similarity Search (ChromaDB): average ~2.94 s (range 1.5 s – 5.0 s)
- Model Inference (GPT-4o via LangChain): average ~3.39 s (range 3.1 s – 4.0 s)
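Per-stage numbers like the ones above can be collected with a small timing helper. This is a minimal sketch: `similarity_search` and `model_inference` here are hypothetical stand-ins for the real calls (in the actual pipeline they would be something like `vectorstore.similarity_search(query)` and the GPT-4o invocation via LangChain).

```python
import time
from typing import Callable, List, Tuple


def timed(label: str, fn: Callable, *args, **kwargs) -> Tuple[object, float]:
    """Run fn, print the stage timing, and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f} seconds")
    return result, elapsed


# Hypothetical stand-ins for the real pipeline stages.
def similarity_search(query: str) -> List[str]:
    return ["doc1", "doc2"]


def model_inference(prompt: str) -> str:
    return "answer"


docs, t_search = timed("Similarity Search Time", similarity_search, "question")
answer, t_model = timed("Model Response Time", model_inference, "prompt")
```

Wrapping each stage separately makes it clear which step dominates the latency, rather than only measuring end-to-end time.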
Sample Execution Timings

| Question | Similarity Search (s) | Model Response (s) | Total (s) |
|----------|----------------------:|-------------------:|----------:|
| One      | 5.02 | 3.13 | 9.02 |
| Two      | 2.24 | 3.20 | 6.78 |
| Three    | 3.00 | 3.20 | 6.47 |
| Four     | 1.50 | 4.03 | 6.00 |
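Note that in every run the total exceeds the sum of the two listed stages, so there is additional per-request overhead (prompt construction, network round trips, etc.) beyond search and inference. The averages and the unaccounted overhead can be checked directly from the reported numbers:

```python
# Reported timings in seconds: (similarity_search, model_response, total)
timings = {
    "Q1": (5.02, 3.13, 9.02),
    "Q2": (2.24, 3.20, 6.78),
    "Q3": (3.00, 3.20, 6.47),
    "Q4": (1.50, 4.03, 6.00),
}

# Overhead = total minus the two measured stages.
overheads = {
    q: round(total - (search + model), 2)
    for q, (search, model, total) in timings.items()
}

avg_search = sum(t[0] for t in timings.values()) / len(timings)
avg_model = sum(t[1] for t in timings.values()) / len(timings)

print(f"avg search = {avg_search:.2f}s, avg model = {avg_model:.2f}s")
for q, overhead in overheads.items():
    print(f"{q}: unaccounted overhead = {overhead:.2f}s")
```

This reproduces the stated averages (2.94 s search, 3.39 s model) and shows roughly 0.3–1.3 s of overhead per request that is worth profiling separately.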
We are looking for ways to reduce this latency and improve overall response time. Suggestions related to retrieval optimization, ChromaDB configuration, caching strategies, or model handling are welcome.
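On the caching side, one simple option is to memoize the retrieval step so that repeated or trivially-rephrased questions skip the multi-second ChromaDB search entirely. A minimal sketch, assuming a hypothetical `run_similarity_search(query)` wrapper around the real vector-store call:

```python
from functools import lru_cache
from typing import Tuple


def run_similarity_search(query: str) -> Tuple[str, ...]:
    # Placeholder for the real call, e.g. vectorstore.similarity_search(query).
    return ("doc_a", "doc_b")


@lru_cache(maxsize=1024)
def cached_search(normalized_query: str) -> Tuple[str, ...]:
    # lru_cache requires hashable arguments and return values,
    # hence tuples rather than lists.
    return run_similarity_search(normalized_query)


def retrieve(query: str) -> Tuple[str, ...]:
    # Light normalization so minor whitespace/case variants share a cache entry.
    return cached_search(query.strip().lower())
```

This only helps when queries repeat; for semantically similar but non-identical queries, a semantic cache (keyed on embedding similarity) would be needed instead.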
System Info
.