Make `llm-complete-guide` work again #164

strickvl · 2025-01-31T16:14:25Z

Quite a few small fixes.

Also made PostgreSQL the default DB again.

And parallelised the evals.

Introduces a new function `run_llm_judged_tests` to perform end-to-end tests on RAG systems using LLM evaluation. The implementation includes:

- Parallel processing of test cases
- Scoring for toxicity, faithfulness, helpfulness, and relevance
- Retry logic for robust test execution
- Detailed logging of test results
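For readers who want a feel for the shape of this, here is a minimal sketch of parallel, retried LLM-judged scoring. It is not the code in this PR: `run_judged_tests_sketch`, `with_retries`, and the `judge` callable are hypothetical names, the retry loop is a hand-rolled stand-in for Tenacity, and a thread pool is only one plausible way to parallelise the calls.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional

logger = logging.getLogger(__name__)

# The four scores named in the description above.
SCORE_KEYS = ("toxicity", "faithfulness", "helpfulness", "relevance")

Scores = Dict[str, float]


def with_retries(fn: Callable[[], Scores], attempts: int = 3) -> Scores:
    """Call fn until it succeeds or attempts run out (stand-in for Tenacity)."""
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:  # broad on purpose: a judge call can fail in many ways
            last_error = exc
            logger.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
    raise RuntimeError(f"judge failed after {attempts} attempts") from last_error


def run_judged_tests_sketch(
    questions: List[str],
    judge: Callable[[str], Scores],  # hypothetical: returns one value per SCORE_KEYS entry
    max_workers: int = 4,
) -> Scores:
    """Score every question in parallel and return the mean of each score."""
    if not questions:
        raise ValueError("no test questions supplied")

    def score_one(question: str) -> Scores:
        return with_retries(lambda: judge(question))

    # pool.map preserves input order and re-raises the first worker exception.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(score_one, questions))

    for question, scores in zip(questions, results):
        logger.info("scores for %r: %s", question, scores)

    return {key: sum(r[key] for r in results) / len(results) for key in SCORE_KEYS}
```

Threads fit here because each test case spends its time waiting on a network call to the judging model, so the GIL is not the bottleneck.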

Enhance the evaluation visualization step by logging detailed metrics to ZenML, including:

- Retrieval performance metrics
- Generation failure rates
- Quality scores (toxicity, faithfulness, helpfulness, relevance)
- Composite scores for overall quality and retrieval effectiveness
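As an illustration of what a composite score could look like, the sketch below assembles a metadata dictionary from raw metrics. The formulas are assumptions, not taken from this PR: quality scores are assumed to sit on a 1 to 5 scale, toxicity is inverted because lower is better, and retrieval effectiveness is taken as 100 minus the mean failure percentage. `build_eval_metadata` is a hypothetical name.

```python
from statistics import mean
from typing import Dict

# Scores where higher is better; toxicity is handled separately.
POSITIVE_KEYS = ("faithfulness", "helpfulness", "relevance")


def build_eval_metadata(
    retrieval_failure_pct: Dict[str, float],
    quality: Dict[str, float],
    min_score: float = 1.0,  # assumed scale bounds
    max_score: float = 5.0,
) -> Dict[str, Dict[str, float]]:
    """Bundle raw metrics with two composite scores (assumed formulas)."""
    # Flip toxicity so that a higher value means a better answer.
    inverted_toxicity = (min_score + max_score) - quality["toxicity"]
    overall_quality = mean([quality[k] for k in POSITIVE_KEYS] + [inverted_toxicity])
    # Failure rates are percentages, so effectiveness is their complement.
    retrieval_effectiveness = 100.0 - mean(list(retrieval_failure_pct.values()))
    return {
        "retrieval": dict(retrieval_failure_pct),
        "quality": dict(quality),
        "composite": {
            "overall_quality": overall_quality,
            "retrieval_effectiveness": retrieval_effectiveness,
        },
    }
```

Inside a ZenML step, a dictionary like this would presumably be handed to `log_metadata`, which one of the commits in this PR imports.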

Refactor import statements in eval_retrieval.py and eval_visualisation.py to:

- Remove unused imports
- Organize imports consistently
- Simplify import statements

Simplify the dev/rag.yaml configuration by removing the commented "environment configuration" line, keeping the configuration clean and concise.

Modify the default temperature parameter in get_completion_from_messages() from 0.4 to 0, ensuring more deterministic and focused model responses.
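To make the effect of the new default concrete, here is a small sketch of a completion wrapper whose `temperature` defaults to 0 but can still be overridden per call. The `complete_sketch` name and the injectable `backend` callable are hypothetical; the real `get_completion_from_messages()` talks to a model provider directly and its full signature is not shown in this PR.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def complete_sketch(
    messages: List[Message],
    backend: Callable[..., str],  # hypothetical stand-in for the provider call
    temperature: float = 0.0,  # previously 0.4; 0 makes sampling as greedy as the provider allows
) -> str:
    """Forward the messages to the backend with an explicit temperature."""
    return backend(messages=messages, temperature=temperature)
```

Callers that want more varied output, for example when generating synthetic questions, can still pass `temperature=0.4` explicitly.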

Modify Hugging Face space deployment to ensure ZenML store secrets are converted to strings before adding, preventing potential type-related errors during deployment.
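The idea can be sketched as follows, assuming the secrets arrive as a mapping whose values may be non-string objects (URL types, integers, and so on). `stringify_secrets` and `push_secrets` are hypothetical names, and `add_secret` stands in for whatever call uploads a secret to the Space, such as `HfApi.add_space_secret` from `huggingface_hub`. Skipping `None` values is a choice made for this sketch, not something the PR states.

```python
from typing import Any, Callable, Dict


def stringify_secrets(secrets: Dict[str, Any]) -> Dict[str, str]:
    """Coerce every secret value to str so the upload call gets a uniform type."""
    # None is skipped rather than uploaded as the literal string "None".
    return {key: str(value) for key, value in secrets.items() if value is not None}


def push_secrets(secrets: Dict[str, Any], add_secret: Callable[[str, str], None]) -> int:
    """Upload each stringified secret and return how many were sent."""
    cleaned = stringify_secrets(secrets)
    for key, value in cleaned.items():
        add_secret(key, value)
    return len(cleaned)
```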

htahir1

postgres is a bit hard locally - is there a way we can change this to a leaner DB? even something like sqlite?

strickvl · 2025-01-31T16:19:24Z

> postgres is a bit hard locally - is there a way we can change this to a leaner DB? even something like sqlite?

I'll see what we can do. Problem is that sqlite doesn't really support vector search etc like Postgres does. But maybe I can refactor that out in a separate PR?

strickvl · 2025-01-31T16:22:17Z

I guess I could try https://alexgarcia.xyz/sqlite-vec/ but I'd like to do it as a completely self-contained PR, not in this one please. WDYT @htahir1 ?

htahir1 · 2025-01-31T16:33:04Z

Yes @strickvl, let's just merge this one so it's fixed on main

strickvl added 21 commits January 31, 2025 13:50

fix url scraper

49f02a3

update requirements

c51ab4a

fix outdated code

0bec7d4

Update ZenML model version and fix vector store metadata access

a7105fb

Upgrade ZenML requirement to version 0.73.0

22291fd

Change default index type to Postgres in index generator

5b392fe

update constants

7b23505

Add log_metadata import from ZenML in evaluation step

38a5b7c

Suppress FutureWarning and refactor logging in eval and index steps

f66335e

formatting

85fb182

run evals in parallel

4261dc5

Add tenacity to requirements for improved retry handling

90907f3

run tests in parallel

b82f6cf

Clean up imports and remove unused imports in evaluation steps

7100c07

Refactor import statements in eval_retrieval.py and eval_visualisation.py to:

- Remove unused imports
- Organize imports consistently
- Simplify import statements

Remove commented section in RAG configuration file

31403b2

Simplify the dev/rag.yaml configuration by removing the commented "environment configuration" line, keeping the configuration clean and concise.

Adjust default temperature for OpenAI model completion

35f6634

Modify the default temperature parameter in get_completion_from_messages() from 0.4 to 0, ensuring more deterministic and focused model responses.

make query via CLI work again

af301c6

Update deployment command in README for simplified RAG pipeline deployment

da289e2

Add type safety for ZenML secrets in Hugging Face deployment

6759c4a

Modify Hugging Face space deployment to ensure ZenML store secrets are converted to strings before adding, preventing potential type-related errors during deployment.

strickvl added the bug and internal labels Jan 31, 2025

strickvl requested review from htahir1 and wjayesh January 31, 2025 16:14

htahir1 requested changes Jan 31, 2025

Add Elasticsearch and Tenacity to project requirements

ca00f7f

Update project dependencies to include:

- Elasticsearch for potential search and indexing functionality
- Tenacity for improved retry handling in various components

strickvl added 2 commits January 31, 2025 17:54

Update ZenML chatbot model constants and improve vector store retrieval

fff3b36

- Add explicit constants for ZenML chatbot model name and version
- Enhance find_vectorstore_name() function with error handling and fallback mechanism
- Improve logging for vector store metadata retrieval
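A rough sketch of the error-handling-plus-fallback pattern described in this commit, under stated assumptions: the metadata layout (`vector_store` then `name`), the `load_metadata` callable, and the `"postgres"` fallback value are all hypothetical, and the real `find_vectorstore_name()` reads from the ZenML model version instead.

```python
import logging
from typing import Any, Callable, Mapping

logger = logging.getLogger(__name__)


def find_vectorstore_name_sketch(
    load_metadata: Callable[[], Mapping[str, Any]],  # hypothetical metadata loader
    fallback: str = "postgres",  # hypothetical default
) -> str:
    """Return the stored vector store name, or the fallback if lookup fails."""
    try:
        metadata = load_metadata()
        name = metadata["vector_store"]["name"]  # assumed layout
    except Exception as exc:  # missing key, wrong type, or a failed fetch
        logger.warning("could not read vector store metadata (%s); using %r", exc, fallback)
        return fallback
    logger.info("using vector store %r from model metadata", name)
    return str(name)
```

Logging both branches keeps it visible in the pipeline output whether the name came from metadata or from the fallback.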

Fix deployment :)

bf2270d

strickvl merged commit 3268ae9 into main Jan 31, 2025
1 check passed

strickvl deleted the feature/eval-loop branch January 31, 2025 17:24
