Note that the configuration should match the one used in `embed.py`, so that vectors are read in the same format as the one used to create and store them.
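For instance, if `embed.py` instantiated its embeddings client with `OpenAIEmbeddings`, `rag.py` should use the same settings. The sketch below is a minimal illustration; the environment variable names and embedding model are assumptions, not prescribed by this tutorial:

```python
import os

from langchain_openai import OpenAIEmbeddings

# Assumed to mirror embed.py: same endpoint and same embedding model,
# so query vectors live in the same space as the stored vectors.
embeddings = OpenAIEmbeddings(
    base_url=os.getenv("SCW_GENERATIVE_APIs_ENDPOINT"),  # variable names assumed
    api_key=os.getenv("SCW_SECRET_KEY"),
    model="bge-multilingual-gemma2",  # must be the exact model used at embedding time
)
```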
### Configure LLM client and create a basic RAG pipeline

3. Edit `rag.py` to configure the LLM client using `ChatOpenAI` and create a simple RAG pipeline:
```python
# rag.py (continued): `vector_store` is the store configured above, as in embed.py
import os

from langchain import hub
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

# Endpoint and API key environment variable names are assumed
llm = ChatOpenAI(
    base_url=os.getenv("SCW_GENERATIVE_APIs_ENDPOINT"),
    api_key=os.getenv("SCW_SECRET_KEY"),
    model="llama-3.1-8b-instruct",
)

prompt = hub.pull("rlm/rag-prompt")
retriever = vector_store.as_retriever()

# Standard LCEL chain: retrieval -> prompt -> LLM -> output parsing
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

for r in rag_chain.stream("Provide the CLI command to shut down a scaleway instance. Its instance-uuid is example-28f3-4e91-b2af-4c3502562d72"):
    print(r, end="", flush=True)
```

- `ChatOpenAI` initializes the LLM client using the endpoint and API key from the environment variables, along with the specified model name.
- `hub.pull("rlm/rag-prompt")` pulls a standard RAG prompt template, ensuring the retrieved document content is passed as proper context along with your question to the LLM.
- `vector_store.as_retriever()` configures your vector store as a retriever, collecting relevant chunks as additional context based on your prompt (a quick way to inspect what it returns is shown after this list).
- `rag_chain` defines a workflow performing context retrieval, LLM prompting, and finally output parsing in a streamlined way.
- `for r in rag_chain.stream("Prompt question")` runs the chain and streams the answer, printing each output chunk as soon as it is generated.
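
To see what the retriever contributes, you can query it directly. This is a quick sanity check, reusing the `retriever` defined above (the sample query is illustrative):

```python
# Print the chunks the vector store returns for a sample query
docs = retriever.invoke("How do I power off an instance?")
for doc in docs:
    print(doc.page_content[:200])  # first 200 characters of each chunk
```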
4. You can now execute your RAG pipeline with:
```sh
python rag.py
```
If you used the Scaleway cheatsheet provided as an example and asked for a CLI command to power off an instance, you should see the following answer:
```sh
scw instance server stop example-28f3-4e91-b2af-4c3502562d72
This will shut down the instance with the specified instance-uuid.
Please note that this command only stops the instance, it doesn't shut it down completely
```
This command is fully correct and can be used with the Scaleway CLI. Note especially that vector embeddings enabled the system to retrieve the proper document chunks even though the Scaleway cheatsheet doesn't mention `shutdown`, only `power off`.
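
You can observe this semantic matching directly by querying the vector store with the "wrong" wording. A small experiment, assuming your vector store exposes the standard `similarity_search_with_score` method:

```python
# "shut down" never appears in the cheatsheet, yet the "power off"
# chunk should rank highly thanks to embedding similarity
for doc, score in vector_store.similarity_search_with_score("shut down an instance", k=3):
    print(f"{score:.3f}  {doc.page_content[:80]}")
```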
You can compare this result with the answer generated without RAG (for instance using the [Generative APIs Playground](https://console.scaleway.com/generative-api/models/fr-par/playground?modelName=llama-3.1-8b-instruct)). The command generated without RAG is not correct at all, and hallucinates in several ways to fit the question prompt: `scaleway` instead of `scw`, `instance` instead of `instance server`, `shutdown` instead of `stop`, and an `--instance-uuid` parameter that doesn't exist.
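
You can also reproduce this no-RAG baseline from code, by sending the same question straight to the model with the `llm` client defined earlier and no retriever:

```python
# Same question, but with no retrieved context attached
baseline = llm.invoke(
    "Provide the CLI command to shut down a scaleway instance. "
    "Its instance-uuid is example-28f3-4e91-b2af-4c3502562d72"
)
print(baseline.content)
```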
### Query the RAG system with your own prompt template
```python
# rag.py (continued): define your own prompt template instead of the
# default rlm/rag-prompt. The template wording below is assumed.
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import PromptTemplate

custom_prompt = PromptTemplate.from_template(
    """Use the provided context to answer the question.
If the context does not contain the answer, say honestly that you don't know.

Context: {context}

Question: {question}

Answer:"""
)

# Stuff the retrieved documents into the template's {context} variable
custom_rag_chain = create_stuff_documents_chain(llm, custom_prompt)

# Retrieve the context for the question, then stream the answer
context = retriever.invoke("your question")
for r in custom_rag_chain.stream({"question": "your question", "context": context}):
    print(r, end="", flush=True)
```

- Prompt template: the prompt template is meticulously crafted to direct the model's responses. It clearly instructs the model on how to leverage the provided context and emphasizes the importance of honesty in cases where it lacks information.