environment and install the dependencies using the following command:

```shell
pip install -r requirements.txt
```

Depending on your hardware, you may run into issues with the `flash_attn` package when running `pip install`. In that case, running `FLASH_ATTENTION_SKIP_CUDA_BUILD=TRUE pip install flash-attn --no-build-isolation` could help.

In order to use the default LLM for this query, you'll need an account and an
API key from OpenAI specified as a ZenML secret:

```shell
zenml secret create llm-complete --openai_api_key=<your-openai-api-key>
export ZENML_PROJECT_SECRET_NAME=llm-complete
```

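Inside the pipeline, the secret name can then be resolved from that environment variable. A minimal sketch (standard library only; the helper name is hypothetical, with the fallback matching the secret created above):

```python
import os

def resolve_secret_name(default: str = "llm-complete") -> str:
    """Return the name of the ZenML secret the pipeline should read.

    Falls back to the secret name used in this guide when the
    ZENML_PROJECT_SECRET_NAME environment variable is unset.
    """
    return os.environ.get("ZENML_PROJECT_SECRET_NAME", default)
```
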
### Setting up Supabase

[Supabase](https://supabase.com/) is a cloud provider that offers a PostgreSQL
database. It's simple to use and has a free tier that should be sufficient for
this project. Once you've created a Supabase account and organization, you'll
need to create a new project.

![](.assets/supabase-create-project.png)

You'll want to save the Supabase database password as a ZenML secret so that it
isn't stored in plaintext. You can do this by running the following command:

```shell
zenml secret update llm-complete -v '{"supabase_password": "YOUR_PASSWORD", "supabase_user": "YOUR_USER", "supabase_host": "YOUR_HOST", "supabase_port": "YOUR_PORT"}'
```

You can get the user, host, and port for this database instance from the
connection string shown in the Supabase dashboard.

![](.assets/supabase-connection-string.png)

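With those values stored in the secret, a standard PostgreSQL connection string can be assembled from them. A small sketch (hypothetical helper, not the pipeline's actual code):

```python
def build_postgres_dsn(user: str, password: str, host: str, port: str,
                       database: str = "postgres") -> str:
    """Assemble a PostgreSQL connection string from the Supabase credentials."""
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"

# Example with placeholder values:
# build_postgres_dsn("postgres", "pw", "db.example.supabase.co", "5432")
# -> "postgresql://postgres:pw@db.example.supabase.co:5432/postgres"
```
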
If Supabase is not an option for you, you can use a different database as the backend.

### Running the RAG pipeline

Note that Claude will require a different API key from Anthropic. See [the
`litellm` docs](https://docs.litellm.ai/docs/providers/anthropic) on how to set
this up.

### Deploying the RAG pipeline

![](.assets/huggingface-space-rag-deployment.png)

You'll need to update and add some secrets to make this work with your Hugging
Face account. To get your ZenML service account API token and store URL, you can
first create a new service account:

```bash
zenml service-account create <SERVICE_ACCOUNT_NAME>
```

For more information on this part of the process, please refer to the [ZenML
documentation](https://docs.zenml.io/how-to/project-setup-and-management/connecting-to-zenml/connect-with-a-service-account).

Once you have your service account API token and store URL (the URL of your
deployed ZenML tenant), you can update the secrets with the following command:

```bash
zenml secret update llm-complete --zenml_api_token=<YOUR_ZENML_SERVICE_ACCOUNT_API_TOKEN> --zenml_store_url=<YOUR_ZENML_STORE_URL>
```

To set the Hugging Face user and space that get used for the Gradio app
deployment, you should set the following environment variables:

```bash
export ZENML_HF_USERNAME=<YOUR_HF_USERNAME>
export ZENML_HF_SPACE_NAME=<YOUR_HF_SPACE_NAME> # optional, defaults to "llm-complete-guide-rag"
```

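The deployment step can combine these variables into the `<user>/<space>` identifier of the target Space. A sketch (hypothetical helper; the fallback mirrors the default noted above):

```python
import os

def hf_space_id() -> str:
    """Build the <user>/<space> identifier for the Gradio deployment.

    ZENML_HF_USERNAME is required; ZENML_HF_SPACE_NAME falls back to the
    default space name used in this guide.
    """
    user = os.environ["ZENML_HF_USERNAME"]
    space = os.environ.get("ZENML_HF_SPACE_NAME", "llm-complete-guide-rag")
    return f"{user}/{space}"
```
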
To deploy the RAG pipeline, you can use the following command:

```shell
python run.py --deploy
```

Alternatively, you can run the basic RAG pipeline *and* deploy it in one go:

```shell
python run.py --rag --deploy
```

This will open a Hugging Face space in your browser where you can interact with
the RAG pipeline.

### Run the LLM RAG evaluation pipeline

To run the evaluation pipeline, you can use the following command:

documentation](https://docs.zenml.io/v/docs/stack-components/annotators/argilla)
will guide you through the process of connecting to your instance as a stack
component.

Please use the secret from above to store all the secrets. Here we are also
setting a Hugging Face write token. In order to make the rest of the pipeline
work for you, you will need to change the HF repo URLs to a space you have
permissions to.

```bash
zenml secret update llm-complete -v '{"argilla_api_key": "YOUR_ARGILLA_API_KEY", "argilla_api_url": "YOUR_ARGILLA_API_URL", "hf_token": "YOUR_HF_TOKEN"}'
```

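Since the `-v` flag takes a JSON object, it can help to sanity-check the payload before handing it to the CLI. A standard-library sketch (the helper and the key list reflect only the command above):

```python
import json

REQUIRED_KEYS = {"argilla_api_key", "argilla_api_url", "hf_token"}

def validate_secret_payload(payload: str) -> dict:
    """Parse the JSON passed via `-v` and verify the expected keys exist."""
    values = json.loads(payload)
    missing = REQUIRED_KEYS - values.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return values
```
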
### Finetune the embeddings

As with the previous pipeline, you will need to have set up and connected to an
Argilla instance for this to work. Please follow the instructions in the [Argilla
documentation](https://docs.argilla.io/latest/getting_started/quickstart/)

documentation](https://docs.zenml.io/v/docs/stack-components/annotators/argilla)
will guide you through the process of connecting to your instance as a stack
component.

The pipeline assumes that your Argilla credentials are stored within a ZenML secret called `argilla_secrets`.

![Argilla Secret](.assets/argilla_secret.png)

To run the pipeline for finetuning the embeddings, you can use the following
commands:

```shell
pip install -r requirements-argilla.txt # special requirements
python run.py --embeddings
```

*Credit to Phil Schmid for his [tutorial on embeddings finetuning with Matryoshka
loss function](https://www.philschmid.de/fine-tune-embedding-model-for-rag), which we adapted for this project.*

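The Matryoshka idea referenced above trains one embedding whose prefix dimensions remain usable on their own. A toy illustration of consuming such embeddings at a truncated dimension (pure Python, not the finetuning code itself):

```python
import math

def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and L2-normalize the result,
    mimicking how Matryoshka-style embeddings are used at reduced dims."""
    prefix = vec[:dim]
    norm = math.sqrt(sum(x * x for x in prefix))
    return [x / norm for x in prefix]

def cosine(a, b):
    """Cosine similarity of two already-normalized vectors."""
    return sum(x * y for x, y in zip(a, b))

full_doc = [0.5, 0.5, 0.5, 0.5]
query = [0.8, 0.2, 0.4, 0.4]
# Similarity can be computed at the full dimension or a truncated one,
# trading a little accuracy for faster search and smaller indexes.
sim_full = cosine(truncate_and_normalize(full_doc, 4),
                  truncate_and_normalize(query, 4))
sim_small = cosine(truncate_and_normalize(full_doc, 2),
                   truncate_and_normalize(query, 2))
```
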