Commit 7458687
feat: Add save/load functionality and improved repr for LLM-based metrics (#2320)
## Summary
This PR adds persistence capabilities and better string representations
for LLM-based metrics, making them easier to save, share, and debug.
## Changes
### 1. Save/Load Functionality
- Added `save()` and `load()` methods to `SimpleLLMMetric` and its
subclasses (`DiscreteMetric`, `NumericMetric`, `RankingMetric`)
- Supports JSON format with optional gzip compression
- Handles all prompt types including `Prompt` and `DynamicFewShotPrompt`
- Sensible defaults: calling `metric.save()` with no arguments writes to `./<metric name>.json`
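The path and compression rules above can be sketched with the standard library alone. This is a minimal illustration, not the code from this PR: `resolve_save_path`, `save_payload`, and `load_payload` are hypothetical helper names, and the rule that a trailing separator or an existing directory means "write `<name>.json` inside it" is an assumption inferred from the usage examples.

```python
import gzip
import json
from pathlib import Path
from typing import Any, Dict, Optional, Union


def resolve_save_path(name: str, path: Optional[Union[str, Path]] = None) -> Path:
    """Hypothetical helper: choose the target file for a metric called `name`."""
    if path is None:
        # Default: <metric name>.json in the current working directory.
        return Path(f"{name}.json")
    target = Path(path)
    # Assumption: an existing directory, or a path written with a trailing
    # separator, means "put <name>.json inside this directory".
    if target.is_dir() or str(path).endswith(("/", "\\")):
        return target / f"{name}.json"
    return target


def save_payload(payload: Dict[str, Any], target: Path) -> None:
    """Write JSON, gzip-compressed when the file name ends in .gz."""
    target.parent.mkdir(parents=True, exist_ok=True)
    text = json.dumps(payload, indent=2)
    if target.suffix == ".gz":
        with gzip.open(target, "wt", encoding="utf-8") as fh:
            fh.write(text)
    else:
        target.write_text(text, encoding="utf-8")


def load_payload(source: Union[str, Path]) -> Dict[str, Any]:
    """Read JSON back, transparently handling .gz files."""
    source = Path(source)
    if source.suffix == ".gz":
        with gzip.open(source, "rt", encoding="utf-8") as fh:
            return json.load(fh)
    return json.loads(source.read_text(encoding="utf-8"))
```

Keying compression off the file extension keeps the call site to a single argument, which matches how the usage examples pass `"compressed.json.gz"`.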
### 2. Improved `__repr__` Methods
- Clean, informative string representations for both LLM-based and
decorator-based metrics
- Removed implementation details (memory addresses, `<locals>`, internal
attributes)
- Prompt text truncated in the repr (80 chars max)
- Function signature display for decorator-based metrics
**Before:**
```python
create_metric_decorator.<locals>.decorator_factory.<locals>.decorator.<locals>.CustomMetric(name='summary_accuracy', _func=<function summary_accuracy at 0x151ffdf80>, ...)
```
**After:**
```python
# LLM-based metrics
DiscreteMetric(name='response_quality', allowed_values=['correct', 'incorrect'], prompt='Evaluate if the response...')
# Decorator-based metrics
summary_accuracy(user_input, response) -> DiscreteMetric[['pass', 'fail']]
```
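A rough sketch of how both representations could be produced follows. It is an illustration under assumptions, not the PR's implementation: `SketchDiscreteMetric`, `truncate`, and `describe_decorated` are hypothetical names, and whether the 80-character limit includes the trailing `...` is a guess.

```python
import inspect
from typing import Callable, List

MAX_PROMPT_CHARS = 80  # limit quoted in the PR description


def truncate(text: str, limit: int = MAX_PROMPT_CHARS) -> str:
    """Shorten `text` to at most `limit` characters, ending in '...' if cut."""
    if len(text) <= limit:
        return text
    return text[: limit - 3] + "..."


class SketchDiscreteMetric:
    """Hypothetical stand-in for an LLM-based metric (not the ragas class)."""

    def __init__(self, name: str, allowed_values: List[str], prompt: str):
        self.name = name
        self.allowed_values = allowed_values
        self.prompt = prompt

    def __repr__(self) -> str:
        # Show only user-facing fields; no memory addresses or internals.
        return (
            f"{type(self).__name__}(name={self.name!r}, "
            f"allowed_values={self.allowed_values!r}, "
            f"prompt={truncate(self.prompt)!r})"
        )


def describe_decorated(func: Callable, allowed_values: List[str]) -> str:
    """Hypothetical repr for a decorator-based metric: name, params, result."""
    params = ", ".join(inspect.signature(func).parameters)
    return f"{func.__name__}({params}) -> DiscreteMetric[{allowed_values!r}]"
```

Reading parameter names through `inspect.signature` avoids leaking the `<locals>` qualified name that appears in the "Before" output.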
### 3. Response Model Handling
- Added `create_auto_response_model()` factory to mark auto-generated
models
- `save()` warns only about custom response models, not auto-generated ones
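One way to tell auto-generated models from user-supplied ones is to tag the generated class with a marker attribute. The sketch below shows that idea with plain Python classes; it is not the actual `create_auto_response_model()` code. The marker name, the helper names, and the warning text are all invented for illustration, and the real factory may build its models differently.

```python
import warnings
from typing import Any, Dict, Type

AUTO_MARKER = "__auto_generated__"  # hypothetical attribute name


def make_auto_model(name: str, fields: Dict[str, type]) -> Type[Any]:
    """Build a simple class and tag it as auto-generated."""
    namespace = {"__annotations__": dict(fields), AUTO_MARKER: True}
    return type(name, (), namespace)


def is_auto_model(model: Type[Any]) -> bool:
    # Look at the class's own namespace so that user subclasses of an
    # auto-generated model are still treated as custom.
    return vars(model).get(AUTO_MARKER, False) is True


def warn_if_custom(model: Type[Any]) -> None:
    """Warn at save time only when the model was supplied by the user."""
    if not is_auto_model(model):
        warnings.warn(
            f"Custom response model detected: {model.__name__!r}",  # invented text
            UserWarning,
            stacklevel=2,
        )
```

Checking `vars(model)` rather than `getattr` is a deliberate choice in this sketch: a marker inherited from a parent class does not count.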
## Usage Examples
```python
# Save metric with default path
metric.save() # → ./response_quality.json
# Save with custom path
metric.save("custom.json")
metric.save("/path/to/metrics/") # → /path/to/metrics/response_quality.json
metric.save("compressed.json.gz") # Compressed
# Load metric
loaded_metric = DiscreteMetric.load("response_quality.json")
# For DynamicFewShotPrompt metrics
loaded_metric = DiscreteMetric.load("metric.json", embedding_model=embeddings)
```
## Testing
- Test suite of 8 tests covering the save/load scenarios
- Tests for default paths, directory handling, compression
- Tests for all prompt types and metric subclasses
## Dependencies
**Note:** This PR builds on #2316 (Fix metric inheritance patterns) and
requires it to be merged first. The changes here depend on the
cleaned-up metric inheritance structure from that PR.
## Checklist
- [x] Tests added
- [x] Documentation in docstrings
- [x] Backwards compatible (new functionality only)
- [x] Follows TDD practices

Parent commit: 19caa7a
7 files changed, +904 -20 lines