Commit 7458687

feat: Add save/load functionality and improved repr for LLM-based metrics (#2320)

## Summary

This PR adds persistence capabilities and better string representations for LLM-based metrics, making them easier to save, share, and debug.

## Changes

### 1. Save/Load Functionality

- Added `save()` and `load()` methods to `SimpleLLMMetric` and its subclasses (`DiscreteMetric`, `NumericMetric`, `RankingMetric`)
- Supports JSON format with optional gzip compression
- Handles all prompt types including `Prompt` and `DynamicFewShotPrompt`
- Smart defaults: `metric.save()` saves to `./metric_name.json`

### 2. Improved `__repr__` Methods

- Clean, informative string representations for both LLM-based and decorator-based metrics
- Removed implementation details (memory addresses, `<locals>`, internal attributes)
- Smart prompt truncation (80 chars max)
- Function signature display for decorator-based metrics

**Before:**

```python
create_metric_decorator.<locals>.decorator_factory.<locals>.decorator.<locals>.CustomMetric(name='summary_accuracy', _func=<function summary_accuracy at 0x151ffdf80>, ...)
```

**After:**

```python
# LLM-based metrics
DiscreteMetric(name='response_quality', allowed_values=['correct', 'incorrect'], prompt='Evaluate if the response...')

# Decorator-based metrics
summary_accuracy(user_input, response) -> DiscreteMetric[['pass', 'fail']]
```

### 3. Response Model Handling

- Added `create_auto_response_model()` factory to mark auto-generated models
- Only warns about custom response models during save, not standard ones

## Usage Examples

```python
# Save metric with default path
metric.save()  # → ./response_quality.json

# Save with custom path
metric.save("custom.json")
metric.save("/path/to/metrics/")  # → /path/to/metrics/response_quality.json
metric.save("compressed.json.gz")  # Compressed

# Load metric
loaded_metric = DiscreteMetric.load("response_quality.json")

# For DynamicFewShotPrompt metrics
loaded_metric = DiscreteMetric.load("metric.json", embedding_model=embeddings)
```

## Testing

- Comprehensive test suite with 8 tests covering all save/load scenarios
- Tests for default paths, directory handling, compression
- Tests for all prompt types and metric subclasses

## Dependencies

**Note:** This PR builds on #2316 (Fix metric inheritance patterns) and requires it to be merged first. The changes here depend on the cleaned-up metric inheritance structure from that PR.

## Checklist

- [x] Tests added
- [x] Documentation in docstrings
- [x] Backwards compatible (new functionality only)
- [x] Follows TDD practices
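## Illustrative Sketch: Path Resolution and Compression

The path behaviour listed under Usage Examples (default file name, directory targets, `.gz` compression) could be implemented roughly as below. This is a minimal standalone sketch using only the standard library, not the code from this PR: `resolve_save_path`, `save_payload`, and `load_payload` are hypothetical helper names, and the payload is a plain dict standing in for whatever the metric actually serializes.

```python
import gzip
import json
import tempfile
from pathlib import Path


def resolve_save_path(metric_name, path=None):
    """Hypothetical helper: turn a user-supplied path into a concrete file path."""
    default_name = f"{metric_name}.json"
    if path is None:
        # No path given: save next to the current working directory.
        return Path(".") / default_name
    text = str(path)
    target = Path(text)
    # A trailing separator or an existing directory means "save inside it".
    if text.endswith(("/", "\\")) or target.is_dir():
        return target / default_name
    return target


def save_payload(payload, path):
    """Write a JSON-serializable dict, gzip-compressed if the path ends in .gz."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "wt", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)
    return path


def load_payload(path):
    """Read back a dict written by save_payload, detecting gzip by suffix."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Assumed example payload; the real on-disk schema is not shown in this PR text.
    payload = {"name": "response_quality", "allowed_values": ["correct", "incorrect"]}
    with tempfile.TemporaryDirectory() as tmp:
        target = resolve_save_path(payload["name"], f"{tmp}/")
        saved = save_payload(payload, target)
        assert load_payload(saved) == payload
```

Choosing the opener from the file suffix keeps the save and load paths symmetric, so a caller only has to change the file name to switch compression on or off.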

1 parent 19caa7a, commit 7458687

7 files changed: +904 -20 lines changed

src/ragas/metrics
tests/unit
- test_simple_llm_metric_persistence.py
