docs/guides/python/llama-rag.mdx: 4 additions & 4 deletions
@@ -76,7 +76,7 @@ We'll organize our project structure like so:
## Setting up our LLM
- We'll define a `ModelParameters` class which will have parameters used throughout our application. By putting it in a class, it means it will lazily load the llm, [embed model](https://docs.llamaindex.ai/en/stable/module_guides/models/embeddings/), and tokenizer so that it doesn't slow down other modules that don't require everything to be initialised. At this point we can also create a prompt template for prompts with our query engine. It will just sanitize some of the hallucinations so that if the model does not know an answer it won't pretend like it does. We'll also define two functions that will convert a prompt or message into the required Llama 3.1 format.
+ We'll define a `ModelParameters` class which will have parameters used throughout our application. By putting it in a class, it means it will lazily load the LLM, [embed model](https://docs.llamaindex.ai/en/stable/module_guides/models/embeddings/), and tokenizer so that it doesn't slow down other modules that don't require everything to be initialized. At this point we can also create a prompt template for prompts with our query engine. It will just sanitize some of the hallucinations so that if the model does not know an answer it won't pretend like it does. We'll also define two functions that will convert a prompt or message into the required Llama 3.1 format.
```python title:common/model_parameters.py
import os
```
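The hunk above only shows the first line of `common/model_parameters.py`. For orientation, a lazily loading `ModelParameters` class along the lines the changed paragraph describes could look like the sketch below. Everything beyond the names the guide itself mentions is an assumption, not the guide's actual code: the `LlamaCPP` wrapper, the file paths and quantization, and the prompt wording are all illustrative.

```python
# Hypothetical sketch of common/model_parameters.py. The LlamaCPP wrapper,
# file paths, quantization, and prompt wording are assumptions for
# illustration, not the guide's actual code.
from functools import cached_property

from llama_index.core import PromptTemplate
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.llama_cpp import LlamaCPP


def completion_to_prompt(completion: str) -> str:
    # Wrap a bare completion prompt in the Llama 3.1 chat template.
    return (
        "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{completion}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )


def messages_to_prompt(messages) -> str:
    # Render a list of llama_index ChatMessage objects into the Llama 3.1 format.
    prompt = "<|begin_of_text|>"
    for message in messages:
        prompt += (
            f"<|start_header_id|>{message.role.value}<|end_header_id|>\n\n"
            f"{message.content}<|eot_id|>"
        )
    return prompt + "<|start_header_id|>assistant<|end_header_id|>\n\n"


class ModelParameters:
    # Template that keeps answers grounded: the model is told to admit
    # when the retrieved context does not contain the answer.
    prompt_template = PromptTemplate(
        "Context information is below.\n"
        "---------------------\n"
        "{context_str}\n"
        "---------------------\n"
        "Given the context information, answer the query. If the context "
        "does not contain the answer, say that you don't know.\n"
        "Query: {query_str}\n"
        "Answer: "
    )

    @cached_property
    def llm(self):
        # Loaded on first access, so importing this module stays cheap.
        return LlamaCPP(
            model_path="./models/Llama-3.2-1B-Instruct-Q4_K_M.gguf",  # assumed filename
            messages_to_prompt=messages_to_prompt,
            completion_to_prompt=completion_to_prompt,
        )

    @cached_property
    def embed_model(self):
        # The recommended Hugging Face embed model mentioned in the guide.
        return HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5")
```

`functools.cached_property` gives the lazy, load-once behavior the paragraph is after: importing the module costs nothing, and model weights are only read the first time `llm` or `embed_model` is accessed.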
@@ -192,7 +192,7 @@ The next step is where we embed our context into the LLM. For this example we wi
- We'll create a script which will download the [LLM](https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF), the embed model (using a recommended [model](https://huggingface.co/BAAI/bge-large-en-v1.5) from Hugging Face), and create the vectorised documentation using the embed model.
+ We'll create a script which will download the [LLM](https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF), the embed model (using a recommended [model](https://huggingface.co/BAAI/bge-large-en-v1.5) from Hugging Face), and convert the documentation into a vector model using the embed model.
```python title:model_utilities.py
import os
```
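This hunk again truncates the file. A sketch of the download-and-embed script consistent with the paragraph, and with the `download_llm()` / `build_query_engine()` calls visible in the next hunk, might look like the following; the `./docs` source directory, GGUF filename, and persist location are all assumptions:

```python
# Hypothetical sketch of model_utilities.py. The ./docs source directory,
# GGUF filename, and persist directory are assumptions for illustration.
from huggingface_hub import hf_hub_download
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding


def download_llm():
    # Fetch a quantized GGUF build of Llama 3.2 1B Instruct into ./models.
    hf_hub_download(
        repo_id="bartowski/Llama-3.2-1B-Instruct-GGUF",
        filename="Llama-3.2-1B-Instruct-Q4_K_M.gguf",  # assumed quantization
        local_dir="./models",
    )


def build_query_engine():
    # Embed the documentation with the BGE model and persist the vector index.
    embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5")
    documents = SimpleDirectoryReader("./docs").load_data()
    index = VectorStoreIndex.from_documents(documents, embed_model=embed_model)
    index.storage_context.persist(persist_dir="./models/vector_index")


download_llm()
build_query_engine()
```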
@@ -262,7 +262,7 @@ download_llm()
```python
build_query_engine()
```
- You can then run this using the following command. This should output the models and the vectorised documentation into the `./models` folder.
+ You can then run the script using the following command. This should output the models and the vector model into the `./models` folder.
```bash
uv run model_utilities.py
```
@@ -456,7 +456,7 @@ RUN --mount=type=cache,target=/root/.cache/uv \
```dockerfile
uv sync --extra ml --frozen --no-dev --no-python-downloads
```
- To ensure an optimised docker image, update the `python.dockerfile.dockerignore` to include the models folder.
+ To ensure an optimized docker image, update the `python.dockerfile.dockerignore` to include the models folder.
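Reading "include the models folder" as adding it to the ignore list, so the downloaded weights stay out of the Docker build context, the entry would be a single line. Whether the guide instead un-ignores the folder with a `!models/` rule is not visible in this diff:

```
# python.dockerfile.dockerignore
models/
```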