<Message type="tip">
You do not need to install pgvector manually using `CREATE EXTENSION vector`, as LangChain will automatically detect that it is missing and install it when calling the `PGVector` adapter.
</Message>
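As a reference, below is a minimal sketch of how the `PGVector` adapter is typically instantiated, assuming the `langchain_postgres` package. The model name, endpoint, credentials, collection name, and connection string are placeholders to replace with the values used in your own setup.

```python
from langchain_openai import OpenAIEmbeddings
from langchain_postgres import PGVector

# Placeholder embeddings configuration: reuse the embeddings model you
# configured earlier in this tutorial.
embeddings = OpenAIEmbeddings(
    model="bge-multilingual-gemma2",
    base_url="https://api.scaleway.ai/v1",
    api_key="YOUR_API_KEY",
)

vector_store = PGVector(
    embeddings=embeddings,
    collection_name="rag_documents",  # hypothetical collection name
    connection="postgresql+psycopg://user:password@your-db-host:5432/rag",
    # create_extension defaults to True, so the `vector` extension is created
    # automatically if it is not already present in the database.
)
```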
## Load and process documents
```python
from langchain_community.document_loaders import S3DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
```
8. Edit `embed.py` to load all files in your bucket using `S3DirectoryLoader`, split them into chunks of 500 characters using `RecursiveCharacterTextSplitter`, then embed them and store them in your PostgreSQL database using `PGVector`.
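As a reference, here is a minimal sketch of what `embed.py` can look like at this stage. The bucket name, endpoint, and credentials are placeholders, and `vector_store` is assumed to be the `PGVector` store configured in the previous steps.

```python
from langchain_community.document_loaders import S3DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load every object stored in your Object Storage bucket (placeholder values).
loader = S3DirectoryLoader(
    bucket="your-bucket-name",
    endpoint_url="https://s3.fr-par.scw.cloud",
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)
documents = loader.load()

# Split the documents into 500-character chunks.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
chunks = text_splitter.split_documents(documents)

# Embed each chunk and store the resulting vectors in PostgreSQL.
vector_store.add_documents(chunks)
```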
The chunk size of 500 characters is chosen to fit within the context size limit of the embedding model used in this tutorial, but it could be raised to up to 4096 characters for the `bge-multilingual-gemma2` model (or slightly more, as the context size is counted in tokens rather than characters). Keeping chunks small also optimizes performance during inference.
9. You can now run your vector embedding script with:
```sh
python embed.py
```
```python
for r in rag_chain.stream("Provide the CLI command to shut down a scaleway instance. Its instance-uuid is example-28f3-4e91-b2af-4c3502562d72"):
    print(r, end="", flush=True)
```
- `hub.pull("rlm/rag-prompt")` uses a standard RAG prompt template. This ensures the retrieved document content is passed as context along with your prompt to the LLM, in a format the model expects.
- `vector_store.as_retriever()` configures your vector store as a retriever, used to collect additional context before calling the LLM.
- `rag_chain` defines a workflow performing the following steps in order: retrieve relevant documents, prompt the LLM with these documents as context, and parse the final output.
- `for r in rag_chain.stream("Prompt question")` starts the RAG workflow with `Prompt question` as input and streams the answer as it is generated.
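As a reference, a typical way to assemble such a chain with LangChain looks like the following sketch, where `llm` and `vector_store` are assumed to be the chat model and `PGVector` store configured in the previous steps:

```python
from langchain import hub
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# Standard RAG prompt template, expecting `context` and `question` variables.
prompt = hub.pull("rlm/rag-prompt")

# Expose the vector store as a retriever returning the most relevant chunks.
retriever = vector_store.as_retriever()

def format_docs(docs):
    # Concatenate the retrieved chunks into a single context string.
    return "\n\n".join(doc.page_content for doc in docs)

# Retrieval, prompt formatting, LLM call, and output parsing, in that order.
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
```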
4. You can now execute your RAG pipeline with:
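(Assuming, as in the custom pipeline step later in this tutorial, that your script is saved as `rag.py`:)

```sh
python rag.py
```

The generated answer should end with an explanation similar to the following output: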
```
This will shut down the instance with the specified instance-uuid.
Please note that this command only stops the instance, it doesn't shut it down completely
```
This command is fully correct and can be used with the Scaleway CLI. Note especially that vector embedding enabled the system to retrieve the proper document chunks even though the Scaleway cheatsheet never mentions `shut down`, only `power off`.

You can compare this result with the one obtained without RAG (for instance by using the same prompt in the [Generative APIs Playground](https://console.scaleway.com/generative-api/models/fr-par/playground?modelName=llama-3.1-8b-instruct)):
```python
    print(r, end="", flush=True)
```
- `PromptTemplate` enables you to customize how the retrieved context and the question are combined in the prompt sent to the LLM.
- `retriever.invoke` lets you customize which part of the LLM input is used to retrieve documents.
- `create_stuff_documents_chain` combines the retrieved documents with your prompt template and passes the result to the LLM.
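As a reference, here is a sketch of how these pieces can fit together. The template wording is illustrative, and `llm` and `retriever` are assumed to be the chat model and retriever configured in the previous steps.

```python
from langchain_core.prompts import PromptTemplate
from langchain.chains.combine_documents import create_stuff_documents_chain

# Custom template controlling how context and question are presented to the LLM.
custom_prompt = PromptTemplate.from_template(
    """Answer the question using only the context below.
Context: {context}
Question: {question}
Answer:"""
)

# Chain that "stuffs" the retrieved documents into the {context} variable of
# the prompt before calling the LLM.
chain = create_stuff_documents_chain(llm, custom_prompt)

question = "Provide the CLI command to shut down a Scaleway instance."
# Only the question itself is used to retrieve relevant documents.
docs = retriever.invoke(question)

for r in chain.stream({"context": docs, "question": question}):
    print(r, end="", flush=True)
```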
6. You can now execute your custom RAG pipeline with:
```sh
python rag.py
```
Note that with the Scaleway cheatsheets example, the CLI answer should be similar, but without the additional explanations about the command performed.
Congratulations! You built a custom RAG pipeline to improve LLM answers based on specific documentation.
You can now go further by:
- Specializing your RAG pipeline for your use case (whether it's providing better answers for customer support, finding relevant content in internal documentation, helping users generate more creative and personalized content, or much more)