Commit 3268ae9

Make llm-complete-guide work again (#164)
* fix url scraper
* update requirements
* fix outdated code
* Update ZenML model version and fix vector store metadata access
* Upgrade ZenML requirement to version 0.73.0
* Change default index type to Postgres in index generator
* update constants
* Add log_metadata import from ZenML in evaluation step
* Suppress FutureWarning and refactor logging in eval and index steps
* formatting
* run evals in parallel
* Add tenacity to requirements for improved retry handling
* run tests in parallel
* Add LLM-judged evaluation function for RAG tests

  Introduces a new function `run_llm_judged_tests` to perform end-to-end tests on RAG systems using LLM evaluation. The implementation includes:
  - Parallel processing of test cases
  - Scoring for toxicity, faithfulness, helpfulness, and relevance
  - Retry logic for robust test execution
  - Detailed logging of test results
* Add metadata logging for comprehensive evaluation metrics

  Enhance the evaluation visualization step by logging detailed metrics to ZenML, including:
  - Retrieval performance metrics
  - Generation failure rates
  - Quality scores (toxicity, faithfulness, helpfulness, relevance)
  - Composite scores for overall quality and retrieval effectiveness
* Clean up imports and remove unused imports in evaluation steps

  Refactor import statements in eval_retrieval.py and eval_visualisation.py to:
  - Remove unused imports
  - Organize imports consistently
  - Simplify import statements
* Remove commented section in RAG configuration file

  Simplify the dev/rag.yaml configuration by removing the commented "environment configuration" line, keeping the configuration clean and concise.
* Adjust default temperature for OpenAI model completion

  Modify the default temperature parameter in get_completion_from_messages() from 0.4 to 0, ensuring more deterministic and focused model responses.
* make query via CLI work again
* Update deployment command in README for simplified RAG pipeline deployment
* Add type safety for ZenML secrets in Hugging Face deployment

  Modify Hugging Face space deployment to ensure ZenML store secrets are converted to strings before adding, preventing potential type-related errors during deployment.
* Add Elasticsearch and Tenacity to project requirements

  Update project dependencies to include:
  - Elasticsearch for potential search and indexing functionality
  - Tenacity for improved retry handling in various components
* Update ZenML chatbot model constants and improve vector store retrieval
  - Add explicit constants for ZenML chatbot model name and version
  - Enhance find_vectorstore_name() function with error handling and fallback mechanism
  - Improve logging for vector store metadata retrieval
* Fix deployment :)
1 parent 3107915 commit 3268ae9

21 files changed, +807 -250 lines changed

llm-complete-guide/README.md

Lines changed: 2 additions & 8 deletions
@@ -100,7 +100,7 @@ use for the LLM.
 When you're ready to make the query, run the following command:
 
 ```shell
-python run.py query "how do I use a custom materializer inside my own zenml steps? i.e. how do I set it? inside the @step decorator?" --model=gpt4
+python run.py query --query-text "how do I use a custom materializer inside my own zenml steps? i.e. how do I set it? inside the @step decorator?" --model=gpt4
 ```
 
 Alternative options for LLMs to use include:
@@ -147,13 +147,7 @@ export ZENML_HF_SPACE_NAME=<YOUR_HF_SPACE_NAME> # optional, defaults to "llm-com
 To deploy the RAG pipeline, you can use the following command:
 
 ```shell
-python run.py --deploy
-```
-
-Alternatively, you can run the basic RAG pipeline *and* deploy it in one go:
-
-```shell
-python run.py --rag --deploy
+python run.py deploy
 ```
 
 This will open a Hugging Face space in your browser where you can interact with

llm-complete-guide/configs/dev/rag.yaml

Lines changed: 0 additions & 1 deletion
@@ -1,6 +1,5 @@
 enable_cache: False
 
-# environment configuration
 settings:
   docker:
     requirements:

llm-complete-guide/constants.py

Lines changed: 5 additions & 3 deletions
@@ -17,14 +17,16 @@
 import os
 
 # Vector Store constants
-CHUNK_SIZE = 2000
+CHUNK_SIZE = 1000
 CHUNK_OVERLAP = 50
 EMBEDDING_DIMENSIONALITY = (
     384  # Update this to match the dimensionality of the new model
 )
 
 # ZenML constants
 ZENML_CHATBOT_MODEL = "zenml-docs-qa-chatbot"
+ZENML_CHATBOT_MODEL_NAME = "zenml-docs-qa-chatbot"
+ZENML_CHATBOT_MODEL_VERSION = "0.71.0-dev"
 
 # Scraping constants
 RATE_LIMIT = 5  # Maximum number of requests per second
@@ -35,8 +37,8 @@
 MODEL_NAME_MAP = {
     "gpt4": "gpt-4",
     "gpt35": "gpt-3.5-turbo",
-    "claude3": "claude-3-opus-20240229",
-    "claudehaiku": "claude-3-haiku-20240307",
+    "claude3": "claude-3-5-sonnet-latest",
+    "claudehaiku": "claude-3-5-haiku-latest",
 }
 
 # CHUNKING_METHOD = "split-by-document"
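
To illustrate how the short `--model` aliases used in the README relate to `MODEL_NAME_MAP`, a lookup helper might look like the following sketch. The map is copied from the new side of the diff; `resolve_model_name` is a hypothetical helper and is not taken from the repository.

```python
# Copied from the updated constants.py in this commit.
MODEL_NAME_MAP = {
    "gpt4": "gpt-4",
    "gpt35": "gpt-3.5-turbo",
    "claude3": "claude-3-5-sonnet-latest",
    "claudehaiku": "claude-3-5-haiku-latest",
}


def resolve_model_name(alias: str) -> str:
    """Translate a CLI alias such as "gpt4" into the provider's model name.

    Hypothetical helper: raises ValueError listing the valid aliases
    instead of letting a bare KeyError reach the user.
    """
    try:
        return MODEL_NAME_MAP[alias]
    except KeyError:
        options = ", ".join(sorted(MODEL_NAME_MAP))
        raise ValueError(
            f"Unknown model alias {alias!r}; expected one of: {options}"
        ) from None
```

With this helper, `resolve_model_name("claude3")` yields `"claude-3-5-sonnet-latest"`, so callers that keep passing the old alias pick up the new model without any change on their side.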
Lines changed: 37 additions & 6 deletions
@@ -1,13 +1,44 @@
+import logging
+
 import gradio as gr
+from constants import SECRET_NAME
 from utils.llm_utils import process_input_with_retrieval
+from zenml.client import Client
 
+# Set up logging
+logging.basicConfig(level=logging.INFO)
+logger = logging.getLogger(__name__)
 
-def predict(message, history):
-    return process_input_with_retrieval(
-        input=message,
-        n_items_retrieved=20,
-        use_reranking=True,
+# Initialize ZenML client and verify secret access
+try:
+    client = Client()
+    secret = client.get_secret(SECRET_NAME)
+    logger.info(
+        f"Successfully initialized ZenML client and found secret {SECRET_NAME}"
     )
+except Exception as e:
+    logger.error(f"Failed to initialize ZenML client or access secret: {e}")
+    raise RuntimeError(f"Application startup failed: {e}")
+
+
+def predict(message, history):
+    try:
+        return process_input_with_retrieval(
+            input=message,
+            n_items_retrieved=20,
+            use_reranking=True,
+        )
+    except Exception as e:
+        logger.error(f"Error processing message: {e}")
+        return f"Sorry, I encountered an error: {str(e)}"
+
 
+# Launch the Gradio interface
+interface = gr.ChatInterface(
+    predict,
+    title="ZenML Documentation Assistant",
+    description="Ask me anything about ZenML!",
+)
 
-gr.ChatInterface(predict, type="messages").launch()
+if __name__ == "__main__":
+    interface.launch(server_name="0.0.0.0", share=False)
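
The error handling added to `predict` above can be tried in isolation. The sketch below reproduces the same try/except shape without Gradio or ZenML, so the fallback message can be checked directly. `make_safe_predict` and `answer_fn` are hypothetical names introduced for illustration; in the commit the wrapped call is `process_input_with_retrieval`.

```python
import logging
from typing import Callable, List

logger = logging.getLogger(__name__)


def make_safe_predict(
    answer_fn: Callable[[str], str],
) -> Callable[[str, List], str]:
    """Wrap answer_fn in a chat-style callback that never raises.

    The returned function takes (message, history), mirroring the
    signature of `predict` in the diff; `history` is accepted but unused.
    """

    def predict(message: str, history: List) -> str:
        try:
            return answer_fn(message)
        except Exception as e:
            # Log for operators, return a readable message to the chat user.
            logger.error(f"Error processing message: {e}")
            return f"Sorry, I encountered an error: {e}"

    return predict
```

One trade-off of this pattern is that the exception text is shown to the end user, which is convenient for a demo space but may expose internal details in other settings.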

llm-complete-guide/gh_action_rag.py

Lines changed: 6 additions & 8 deletions
@@ -21,12 +21,10 @@
 
 import click
 import yaml
-from zenml.enums import PluginSubType
-
 from pipelines.llm_index_and_evaluate import llm_index_and_evaluate
-from zenml.client import Client
 from zenml import Model
-from zenml.exceptions import ZenKeyError
+from zenml.client import Client
+from zenml.enums import PluginSubType
 
 
 @click.command(
@@ -89,7 +87,7 @@ def main(
     zenml_model_name: Optional[str] = "zenml-docs-qa-rag",
     zenml_model_version: Optional[str] = None,
 ):
-    """
+    """
     Executes the pipeline to train a basic RAG model.
 
     Args:
@@ -108,14 +106,14 @@ def main(
         config = yaml.safe_load(file)
 
     # Read the model version from a file in the root of the repo
-    # called "ZENML_VERSION.txt".
+    # called "ZENML_VERSION.txt".
     if zenml_model_version == "staging":
         postfix = "-rc0"
     elif zenml_model_version == "production":
         postfix = ""
     else:
         postfix = "-dev"
-
+
     if Path("ZENML_VERSION.txt").exists():
         with open("ZENML_VERSION.txt", "r") as file:
             zenml_model_version = file.read().strip()
@@ -177,7 +175,7 @@ def main(
         service_account_id=service_account_id,
         auth_window=0,
         flavor="builtin",
-        action_type=PluginSubType.PIPELINE_RUN
+        action_type=PluginSubType.PIPELINE_RUN,
     ).id
     client.create_trigger(
         name="Production Trigger LLM-Complete",
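
The postfix selection shown in the `ZENML_VERSION.txt` hunk of this file can be read as a small pure function. The sketch below assumes that the postfix is appended to the version string read from the file; the diff does not show that step, although the `0.71.0-dev` constant in `constants.py` is consistent with it. `version_postfix` and `build_model_version` are hypothetical names.

```python
from typing import Optional


def version_postfix(stage: Optional[str]) -> str:
    """Map the requested stage to the postfix used in the diff."""
    if stage == "staging":
        return "-rc0"
    if stage == "production":
        return ""
    return "-dev"


def build_model_version(base_version: str, stage: Optional[str]) -> str:
    """Combine the file's version string with the stage postfix.

    Assumption: the two parts are simply concatenated. The base version is
    stripped because the diff strips the file content after reading it.
    """
    return base_version.strip() + version_postfix(stage)
```

Under that assumption, a file containing `0.71.0` with no stage given produces `0.71.0-dev`, while the `staging` stage produces `0.71.0-rc0`.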

llm-complete-guide/pipelines/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -19,5 +19,5 @@
 from pipelines.generate_chunk_questions import generate_chunk_questions
 from pipelines.llm_basic_rag import llm_basic_rag
 from pipelines.llm_eval import llm_eval
+from pipelines.llm_index_and_evaluate import llm_index_and_evaluate
 from pipelines.rag_deployment import rag_deployment
-from pipelines.llm_index_and_evaluate import llm_index_and_evaluate

llm-complete-guide/pipelines/finetune_embeddings.py

Lines changed: 0 additions & 1 deletion
@@ -12,7 +12,6 @@
 # or implied. See the License for the specific language governing
 # permissions and limitations under the License.
 
-from constants import EMBEDDINGS_MODEL_NAME_ZENML
 from steps.finetune_embeddings import (
     evaluate_base_model,
     evaluate_finetuned_model,

llm-complete-guide/pipelines/llm_basic_rag.py

Lines changed: 0 additions & 1 deletion
@@ -14,7 +14,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-from litellm import config_path
 
 from steps.populate_index import (
     generate_embeddings,

llm-complete-guide/pipelines/llm_index_and_evaluate.py

Lines changed: 2 additions & 1 deletion
@@ -15,9 +15,10 @@
 # limitations under the License.
 #
 
-from pipelines import llm_basic_rag, llm_eval
 from zenml import pipeline
 
+from pipelines import llm_basic_rag, llm_eval
+
 
 @pipeline
 def llm_index_and_evaluate() -> None:

llm-complete-guide/requirements.txt

Lines changed: 2 additions & 1 deletion
@@ -1,4 +1,4 @@
-zenml[server]
+zenml[server]>=0.73.0
 ratelimit
 pgvector
 psycopg2-binary
@@ -21,6 +21,7 @@ torch
 gradio
 huggingface-hub
 elasticsearch
+tenacity
 
 # optional requirements for S3 artifact store
 # s3fs>2022.3.0
