
Commit 69bc345

pre-defined prompt
1 parent f5b4d4d commit 69bc345

File tree: 1 file changed

tutorials/how-to-implement-rag/index.mdx

Lines changed: 30 additions & 13 deletions
@@ -264,29 +264,46 @@ This approach ensures that only new or modified documents are loaded into memory
 Storing both the chunk and its corresponding embedding allows for efficient document retrieval later.
 When a query is made, the RAG system will retrieve the most relevant embeddings, and the corresponding text chunks will be used to generate the final response.

-### Query the RAG System
+### Query the RAG System with a pre-defined prompt template

-Now, set up the RAG system to handle queries using RetrievalQA and the LLM.
+Now, set up the RAG system to handle queries.

 ```python
-retriever = vector_store.as_retriever(search_kwargs={"k": 3})
-llm = ChatOpenAI(
-    base_url=os.getenv("SCW_INFERENCE_DEPLOYMENT_ENDPOINT"),
-    api_key=os.getenv("SCW_API_KEY"),
-    model=deployment.model_name,
+llm = ChatOpenAI(
+    base_url=os.getenv("SCW_INFERENCE_DEPLOYMENT_ENDPOINT"),
+    api_key=os.getenv("SCW_SECRET_KEY"),
+    model=deployment.model_name,
 )

-qa_stuff = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever)
+prompt = hub.pull("rlm/rag-prompt")
+retriever = vector_store.as_retriever()

-query = "What are the commands to set up a database with the CLI of Scaleway?"
-response = qa_stuff.invoke(query)

-print(response['result'])
+rag_chain = (
+    {"context": retriever, "question": RunnablePassthrough()}
+    | prompt
+    | llm
+    | StrOutputParser()
+)
+
+for r in rag_chain.stream("Your question"):
+    print(r, end="", flush=True)
+    time.sleep(0.15)
 ```
+- LLM Initialization: We initialize the ChatOpenAI instance using the endpoint and API key from the environment variables, along with the specified model name.
+
+- Prompt Setup: The prompt is pulled from the hub using a pre-defined template, ensuring consistent query formatting.
+
+- Retriever Configuration: We set up the retriever to access the vector store, allowing the RAG system to retrieve relevant information based on the query.
+
+- RAG Chain Construction: We create the RAG chain, which connects the retriever, prompt, LLM, and output parser in a streamlined workflow.
+
+- Query Execution: Finally, we stream the output of the RAG chain for a specified question, printing each chunk with a slight delay for better readability (a non-streaming variant is sketched after this list).
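As a usage note (not part of the commit): the snippet above assumes `from langchain import hub`, `from langchain_core.runnables import RunnablePassthrough`, `from langchain_core.output_parsers import StrOutputParser`, and `import time` earlier in the tutorial. When streaming is not needed, the same chain can be called synchronously; a minimal sketch, reusing the `rag_chain` built above and the example question this commit removes:

```python
# Non-streaming variant: invoke() runs the same retriever -> prompt -> LLM
# pipeline but returns the complete answer as a single string.
answer = rag_chain.invoke("What are the commands to set up a database with the CLI of Scaleway?")
print(answer)
```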

+### Query the RAG System with your own prompt template
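The commit adds this heading without body text yet; a minimal sketch of what a custom-template chain could look like, mirroring the pre-defined-prompt chain above and reusing its `llm` and `retriever` (the template wording here is an assumption, not taken from the tutorial):

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough

# Illustrative template; it must expose the same "context" and "question"
# input variables that the pre-defined rlm/rag-prompt expects.
custom_rag_prompt = PromptTemplate.from_template(
    """Use the following context to answer the question.
If the context does not contain the answer, say so instead of guessing.

Context: {context}

Question: {question}

Answer:"""
)

custom_rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | custom_rag_prompt
    | llm
    | StrOutputParser()
)

print(custom_rag_chain.invoke("Your question"))
```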
 
 ### Conclusion
 
-This step is essential for efficiently processing and storing large document datasets for RAG. By using lazy loading, the system handles large datasets without overwhelming memory, while chunking ensures that each document is processed in a way that maximizes the performance of the LLM. The embeddings are stored in PostgreSQL via pgvector, allowing for fast and scalable retrieval when responding to user queries.
+In this tutorial, we explored essential techniques for efficiently processing and storing large document datasets for a Retrieval-Augmented Generation (RAG) system. By leveraging metadata, we can quickly check which documents have already been processed, ensuring that our system operates smoothly without redundant data handling. Chunking optimizes the processing of each document, maximizing the performance of the LLM. Storing embeddings in PostgreSQL via pgvector enables fast and scalable retrieval, ensuring quick responses to user queries.
 
-By combining Scaleway’s Managed Object Storage, PostgreSQL with pgvector, and LangChain’s embedding tools, you can implement a powerful RAG system that scales with your data and offers robust information retrieval capabilities.
+By integrating Scaleway’s Managed Object Storage, PostgreSQL with pgvector, and LangChain’s embedding tools, you can build a powerful RAG system that scales with your data while offering robust information retrieval capabilities. This approach equips you with the tools necessary to handle complex queries and deliver accurate, relevant results efficiently.
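As a closing illustration (an assumption, not part of this commit): retrieval quality can be checked directly against the pgvector store before wiring the full chain, which helps debug which chunks the retriever hands to the LLM. A minimal sketch, assuming the `vector_store` built earlier in the tutorial:

```python
# Inspect the top-k chunks for a query; k=3 mirrors the search_kwargs
# setting that this commit removes from the retriever.
docs = vector_store.similarity_search(
    "What are the commands to set up a database with the CLI of Scaleway?",
    k=3,
)
for doc in docs:
    print(doc.metadata, "->", doc.page_content[:200])
```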
