
Commit 38d6f90

LLM Complete Guide update (#133)
* RAG working again
* Further small changes
* Adjusted Readme
* Further small improvements
* Rag -> Embeddings running
* Implemented basic gitflow
* Update llm-complete-guide/README.md

  Co-authored-by: Alex Strick van Linschoten <[email protected]>
* Removed outdated flags
* update requirements and constants
* remove dummy pipeline
* update gitignore
* Moved secrets into ZenML secrets
* Further changes for remote execution
* Fixed some configs
* add gradio to requirements
* add rag deployment
* rag deployment addition
* fix in get_db_port
* Single secret for all
* Updated README to new secret
* Added default
* Updated github actions
* Updated Readme
* formatting
* moar formatting
* notebooks need formatting too
* add gradio temp folder to gitignore
* Updated requirements.

---------

Co-authored-by: Alex Strick van Linschoten <[email protected]>
Co-authored-by: Alex Strick van Linschoten <[email protected]>
1 parent cbe263d commit 38d6f90

36 files changed (+548 -295 lines)
Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
```yaml
name: Staging Trigger LLM-COMPLETE
on:
  pull_request:
    types: [opened, synchronize]
    branches: [staging, main]
concurrency:
  # New commit on branch cancels running workflows of the same branch
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  run-staging-workflow:
    runs-on: ubuntu-dind-runners
    env:
      ZENML_HOST: ${{ secrets.ZENML_HOST }}
      ZENML_API_KEY: ${{ secrets.ZENML_API_KEY }}
      ZENML_STAGING_STACK: 51a49786-b82a-4646-bde7-a460efb0a9c5
      ZENML_GITHUB_SHA: ${{ github.event.pull_request.head.sha }}
      ZENML_GITHUB_URL_PR: ${{ github.event.pull_request._links.html.href }}
      ZENML_DEBUG: true
      ZENML_ANALYTICS_OPT_IN: false
      ZENML_LOGGING_VERBOSITY: INFO
      ZENML_PROJECT_SECRET_NAME: llm-complete

    steps:
      - name: Check out repository code
        uses: actions/checkout@v3

      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      - name: Install requirements
        run: |
          pip3 install -r requirements.txt
          zenml integration install gcp -y

      - name: Connect to ZenML server
        run: |
          zenml connect --url $ZENML_HOST --api-key $ZENML_API_KEY

      - name: Set stack (Staging)
        if: ${{ github.base_ref == 'staging' }}
        run: |
          zenml stack set ${{ env.ZENML_STAGING_STACK }}

      - name: Run pipeline (Staging)
        if: ${{ github.base_ref == 'staging' }}
        run: |
          python run.py --rag --evaluation --no-cache
```

.gitignore

Lines changed: 2 additions & 0 deletions
```diff
@@ -162,6 +162,8 @@ llm-lora-finetuning/configs/shopify.yaml
 finetuned-matryoshka/
 finetuned-all-MiniLM-L6-v2/
 finetuned-snowflake-arctic-embed-m/
+finetuned-snowflake-arctic-embed-m-v1.5/
+.gradio/
 
 # ollama ignores
 nohup.out
```
(Binary file, 37.7 KB; preview not shown)

llm-complete-guide/README.md

Lines changed: 29 additions & 19 deletions
````diff
@@ -43,11 +43,16 @@ environment and install the dependencies using the following command:
 pip install -r requirements.txt
 ```
 
+Depending on your hardware you may run into some issues when running the `pip install` command with the
+`flash_attn` package. In that case running `FLASH_ATTENTION_SKIP_CUDA_BUILD=TRUE pip install flash-attn --no-build-isolation`
+could help you.
+
 In order to use the default LLM for this query, you'll need an account and an
-API key from OpenAI specified as another environment variable:
+API key from OpenAI specified as a ZenML secret:
 
 ```shell
-export OPENAI_API_KEY=<your-openai-api-key>
+zenml secret create llm-complete --openai_api_key=<your-openai-api-key>
+export ZENML_PROJECT_SECRET_NAME=llm-complete
 ```
 
 ### Setting up Supabase
````
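The hunk above moves the OpenAI key out of an environment variable and into the `llm-complete` ZenML secret, with `ZENML_PROJECT_SECRET_NAME` pointing at it. A minimal sketch of how pipeline code could resolve the key at runtime (the secret and key names mirror the README commands; the exact lookup helper in the repo may differ):

```python
import os

from openai import OpenAI
from zenml.client import Client

# Falls back to "llm-complete", matching the default added in constants.py.
secret_name = os.getenv("ZENML_PROJECT_SECRET_NAME", "llm-complete")
secret = Client().get_secret(secret_name)

# Key name follows `zenml secret create llm-complete --openai_api_key=...`.
client = OpenAI(api_key=secret.secret_values["openai_api_key"])
```
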
````diff
@@ -63,22 +68,15 @@ You'll want to save the Supabase database password as a ZenML secret so that it
 isn't stored in plaintext. You can do this by running the following command:
 
 ```shell
-zenml secret create supabase_postgres_db --password="YOUR_PASSWORD"
+zenml secret update llm-complete -v '{"supabase_password": "YOUR_PASSWORD", "supabase_user": "YOUR_USER", "supabase_host": "YOUR_HOST", "supabase_port": "YOUR_PORT"}'
 ```
 
-You'll then want to connect to this database instance by getting the connection
+You can get the user, host and port for this database instance by getting the connection
 string from the Supabase dashboard.
 
 ![](.assets/supabase-connection-string.png)
 
-You can use these details to populate some environment variables where the
-pipeline code expects them:
-
-```shell
-export ZENML_POSTGRES_USER=<your-supabase-user>
-export ZENML_POSTGRES_HOST=<your-supabase-host>
-export ZENML_POSTGRES_PORT=<your-supabase-port>
-```
+In case supabase is not an option for you, you can use a different database as the backend.
 
 ### Running the RAG pipeline
 
````
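With the connection details living in the same secret, the old `ZENML_POSTGRES_*` environment variables are no longer needed. A rough sketch of opening the Supabase Postgres instance from those secret values using psycopg2 (which the pipeline configs install); the `dbname` and this connection snippet are assumptions for illustration, not the repo's actual helper functions:

```python
import psycopg2

from zenml.client import Client

# Secret keys mirror the `zenml secret update llm-complete -v '{...}'` command above.
values = Client().get_secret("llm-complete").secret_values

conn = psycopg2.connect(
    user=values["supabase_user"],
    password=values["supabase_password"],
    host=values["supabase_host"],
    port=values["supabase_port"],
    dbname="postgres",  # assumed default Supabase database name
)
```
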

````diff
@@ -151,16 +149,17 @@ documentation](https://docs.zenml.io/v/docs/stack-components/annotators/argilla)
 will guide you through the process of connecting to your instance as a stack
 component.
 
-### Finetune the embeddings
-
-To run the pipeline for finetuning the embeddings, you can use the following
-commands:
+Please use the secret from above to track all the secrets. Here we are also
+setting a Huggingface write key. In order to make the rest of the pipeline work for you, you
+will need to change the hf repo urls to a space you have permissions to.
 
-```shell
-pip install -r requirements-argilla.txt # special requirements
-python run.py --embeddings
+```bash
+zenml secret update llm-complete -v '{"argilla_api_key": "YOUR_ARGILLA_API_KEY", "argilla_api_url": "YOUR_ARGILLA_API_URL", "hf_token": "YOUR_HF_TOKEN"}'
 ```
 
+
+### Finetune the embeddings
+
 As with the previous pipeline, you will need to have set up and connected to an Argilla instance for this
 to work. Please follow the instructions in the [Argilla
 documentation](https://docs.argilla.io/latest/getting_started/quickstart/)
````
````diff
@@ -170,6 +169,17 @@ documentation](https://docs.zenml.io/v/docs/stack-components/annotators/argilla)
 will guide you through the process of connecting to your instance as a stack
 component.
 
+The pipeline assumes that your argilla secret is stored within a ZenML secret called `argilla_secrets`.
+![Argilla Secret](.assets/argilla_secret.png)
+
+To run the pipeline for finetuning the embeddings, you can use the following
+commands:
+
+```shell
+pip install -r requirements-argilla.txt # special requirements
+python run.py --embeddings
+```
+
 *Credit to Phil Schmid for his [tutorial on embeddings finetuning with Matryoshka
 loss function](https://www.philschmid.de/fine-tune-embedding-model-for-rag) which we adapted for this project.*
````

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
```yaml
# enable_cache: False

# environment configuration
settings:
  docker:
    parent_image: "zenmldocker/prepare-release:base-0.68.0"
    requirements:
      - langchain-community
      - ratelimit
      - langchain>=0.0.325
      - langchain-openai
      - pgvector
      - psycopg2-binary
      - beautifulsoup4
      - unstructured
      - pandas
      - numpy
      - sentence-transformers>=3
      - transformers[torch]
      - litellm
      - ollama
      - tiktoken
      - umap-learn
      - matplotlib
      - pyarrow
      - rerankers[flashrank]
      - datasets
      - torch
    environment:
      ZENML_PROJECT_SECRET_NAME: llm_complete


# configuration of the Model Control Plane
model:
  name: finetuned-zenml-docs-embeddings
  version: latest
  license: Apache 2.0
  description: Finetuned LLM on ZenML docs
  tags: ["rag", "finetuned"]
```

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
```yaml
enable_cache: False

# environment configuration
settings:
  docker:
    requirements:
      - unstructured
      - sentence-transformers>=3
      - pgvector
      - datasets
      - litellm
      - numpy
      - psycopg2-binary
      - tiktoken

# configuration of the Model Control Plane
model:
  name: finetuned-zenml-docs-embeddings
  license: Apache 2.0
  description: Finetuned LLM on ZenML docs
  tags: ["rag", "finetuned"]
```

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
```yaml
# environment configuration
settings:
  docker:
    requirements:
      - unstructured
      - sentence-transformers>=3
      - pgvector
      - datasets
      - litellm
      - numpy
      - psycopg2-binary
      - tiktoken
      - ratelimit
    environment:
      ZENML_PROJECT_SECRET_NAME: llm_complete
      ZENML_ENABLE_RICH_TRACEBACK: FALSE
      ZENML_LOGGING_VERBOSITY: INFO

steps:
  url_scraper:
    parameters:
      docs_url: https://docs.zenml.io
  generate_embeddings:
    step_operator: "terraform-gcp-6c0fd52233ca"
    settings:
      step_operator.vertex:
        accelerator_type: "NVIDIA_TESLA_P100"
        accelerator_count: 1
        machine_type: "n1-standard-8"

# configuration of the Model Control Plane
model:
  name: finetuned-zenml-docs-embeddings
  license: Apache 2.0
  description: Finetuned LLM on ZenML docs
  tags: ["rag", "finetuned"]
```

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
```yaml
enable_cache: False

# environment configuration
settings:
  docker:
    requirements:
      - unstructured
      - sentence-transformers>=3
      - pgvector
      - datasets
      - litellm
      - numpy
      - psycopg2-binary
      - tiktoken
      - ratelimit
    environment:
      ZENML_PROJECT_SECRET_NAME: llm_complete
      ZENML_ENABLE_RICH_TRACEBACK: FALSE
      ZENML_LOGGING_VERBOSITY: INFO


# configuration of the Model Control Plane
model:
  name: finetuned-zenml-docs-embeddings
  license: Apache 2.0
  description: Finetuned LLM on ZenML docs
  tags: ["rag", "finetuned"]

steps:
  url_scraper:
    parameters:
      docs_url: https://docs.zenml.io/stack-components/orchestrators
```

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
```yaml
# enable_cache: False

# environment configuration
settings:
  docker:
    requirements:
      - langchain-community
      - ratelimit
      - langchain>=0.0.325
      - langchain-openai
      - pgvector
      - psycopg2-binary
      - beautifulsoup4
      - unstructured
      - pandas
      - numpy
      - sentence-transformers>=3
      - transformers
      - litellm
      - ollama
      - tiktoken
      - umap-learn
      - matplotlib
      - pyarrow
      - rerankers[flashrank]
      - datasets
      - torch
      - distilabel
    environment:
      ZENML_PROJECT_SECRET_NAME: llm_complete


# configuration of the Model Control Plane
model:
  name: finetuned-zenml-docs-embeddings
  version: latest
  license: Apache 2.0
  description: Finetuned LLM on ZenML docs
  tags: ["rag", "finetuned"]
```

llm-complete-guide/constants.py

Lines changed: 7 additions & 3 deletions
```diff
@@ -14,6 +14,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
+import os
 
 # Vector Store constants
 CHUNK_SIZE = 2000
@@ -57,20 +58,23 @@
 
 # embeddings finetuning constants
 EMBEDDINGS_MODEL_NAME_ZENML = "finetuned-zenml-docs-embeddings"
-DATASET_NAME_DEFAULT = "zenml/rag_qa_embedding_questions_0_60_0"
+# DATASET_NAME_DEFAULT = "zenml/rag_qa_embedding_questions_0_60_0"
+DATASET_NAME_DEFAULT = "zenml/rag_qa_embedding_questions"
 DATASET_NAME_DISTILABEL = f"{DATASET_NAME_DEFAULT}_distilabel"
 DATASET_NAME_ARGILLA = DATASET_NAME_DEFAULT.replace("zenml/", "")
 OPENAI_MODEL_GEN = "gpt-4o"
 OPENAI_MODEL_GEN_KWARGS_EMBEDDINGS = {
     "temperature": 0.7,
     "max_new_tokens": 512,
 }
-EMBEDDINGS_MODEL_ID_BASELINE = "Snowflake/snowflake-arctic-embed-m"
-EMBEDDINGS_MODEL_ID_FINE_TUNED = "finetuned-snowflake-arctic-embed-m"
+EMBEDDINGS_MODEL_ID_BASELINE = "Snowflake/snowflake-arctic-embed-m-v1.5"
+EMBEDDINGS_MODEL_ID_FINE_TUNED = "finetuned-snowflake-arctic-embed-m-v1.5"
 EMBEDDINGS_MODEL_MATRYOSHKA_DIMS: list[int] = [
     384,
     256,
     128,
     64,
 ]  # Important: large to small
 USE_ARGILLA_ANNOTATIONS = False
+
+SECRET_NAME = os.getenv("ZENML_PROJECT_SECRET_NAME", "llm-complete")
```
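The `EMBEDDINGS_MODEL_MATRYOSHKA_DIMS` constant (ordered large to small) is the dimension schedule for the Matryoshka-style embeddings finetuning credited to Phil Schmid's tutorial in the README. A hedged sketch of how the new v1.5 baseline model and these dims could be wired into sentence-transformers' `MatryoshkaLoss`; this is illustrative only, not the repo's actual training step:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

# Values mirror EMBEDDINGS_MODEL_ID_BASELINE and EMBEDDINGS_MODEL_MATRYOSHKA_DIMS above.
model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m-v1.5")
inner_loss = MultipleNegativesRankingLoss(model)

# Matryoshka dims must be ordered large to small, as the constants.py comment notes.
loss = MatryoshkaLoss(model, inner_loss, matryoshka_dims=[384, 256, 128, 64])
```
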
