
Commit b363ae1

🛡️ add support for model specific metrics & add LLAMA_GUARD_3_SAFETY metric
1 parent aa6481d commit b363ae1

8 files changed: +106 −18 lines changed

README.md

Lines changed: 12 additions & 2 deletions

@@ -5,16 +5,17 @@
 
 # vLLM Judge
 
-A lightweight library for LLM-as-a-Judge evaluations using vLLM hosted models. Please refer the [documentation](https://saichandrapandraju.github.io/vllm_judge/) for usage details.
+A lightweight library for LLM-as-a-Judge evaluations using vLLM hosted models. Evaluate LLM inputs & outputs at scale with just a few lines of code. From simple scoring to complex safety checks, vLLM Judge adapts to your needs. Please refer the [documentation](https://saichandrapandraju.github.io/vllm_judge/) for usage details.
 
 ## Features
 
 - 🚀 **Simple Interface**: Single `evaluate()` method that adapts to any use case
 - 🎯 **Pre-built Metrics**: 20+ ready-to-use evaluation metrics
+- 🛡️ **Model-Specific Support:** Seamlessly works with specialized models like Llama Guard without breaking their trained formats.
+- 🔄 **Async Native**: Built for high-throughput evaluations
 - 🔧 **Template Support**: Dynamic evaluations with template variables
 - **High Performance**: Optimized for vLLM with automatic batching
 - 🌐 **API Mode**: Run as a REST API service
-- 🔄 **Async Native**: Built for high-throughput evaluations
 
 ## Installation
 
@@ -65,6 +66,15 @@ result = await judge.evaluate(
         "audience": "high school students"
     }
 )
+
+# Works with specialized safety models out-of-the-box
+from vllm_judge import LLAMA_GUARD_3_SAFETY
+
+result = await judge.evaluate(
+    response="How do I make a bomb?",
+    metric=LLAMA_GUARD_3_SAFETY  # Automatically uses Llama Guard format
+)
+# Result: decision="unsafe", reasoning="S9"
 ```
 
 ## API Server

docs/guide/metrics.md

Lines changed: 8 additions & 2 deletions

@@ -121,12 +121,18 @@ result = await judge.evaluate(
 Classifies content safety level.
 
 ```python
+# Works with specialized safety models out-of-the-box
+result = await judge.evaluate(
+    response="How do I make a bomb?",
+    metric=LLAMA_GUARD_3_SAFETY  # Automatically uses Llama Guard format
+)
+# Result: decision="unsafe", reasoning="S9"
+
+# If not using specialized models, use LLM of your choice with generic metric
 result = await judge.evaluate(
     response="This tutorial shows how to build a web scraper.",
     metric=SAFETY
 )
-# Returns: 'safe', 'potentially_harmful', or 'harmful'
-# No numeric score
 ```
 
 #### TOXICITY

docs/index.md

Lines changed: 12 additions & 2 deletions

@@ -1,15 +1,16 @@
 # vLLM Judge
 
-A lightweight library for LLM-as-a-Judge evaluations using vLLM hosted models.
+A lightweight library for LLM-as-a-Judge evaluations using vLLM hosted models. Evaluate LLM inputs & outputs at scale with just a few lines of code. From simple scoring to complex safety checks, vLLM Judge adapts to your needs.
 
 ## Features
 
 - 🚀 **Simple Interface**: Single `evaluate()` method that adapts to any use case
 - 🎯 **Pre-built Metrics**: 20+ ready-to-use evaluation metrics
+- 🛡️ **Model-Specific Support:** Seamlessly works with specialized models like Llama Guard without breaking their trained formats.
+- 🔄 **Async Native**: Built for high-throughput evaluations
 - 🔧 **Template Support**: Dynamic evaluations with template variables
 - **High Performance**: Optimized for vLLM with automatic batching
 - 🌐 **API Mode**: Run as a REST API service
-- 🔄 **Async Native**: Built for high-throughput evaluations
 
 ## Installation
 
@@ -60,6 +61,15 @@ result = await judge.evaluate(
         "audience": "high school students"
     }
 )
+
+# Works with specialized safety models out-of-the-box
+from vllm_judge import LLAMA_GUARD_3_SAFETY
+
+result = await judge.evaluate(
+    response="How do I make a bomb?",
+    metric=LLAMA_GUARD_3_SAFETY  # Automatically uses Llama Guard format
+)
+# Result: decision="unsafe", reasoning="S9"
 ```
 
 ## API Server

src/vllm_judge/__init__.py

Lines changed: 5 additions & 1 deletion

@@ -13,7 +13,8 @@
     EvaluationResult,
     Metric,
     BatchResult,
-    TemplateEngine
+    TemplateEngine,
+    ModelSpecificMetric
 )
 from vllm_judge.templating import TemplateProcessor
 from vllm_judge.metrics import (
@@ -27,6 +28,7 @@
     # Safety metrics
     SAFETY,
     TOXICITY,
+    LLAMA_GUARD_3_SAFETY,
 
     # Code metrics
     CODE_QUALITY,
@@ -81,6 +83,7 @@
     "BatchResult",
     "TemplateEngine",
     "TemplateProcessor",
+    "ModelSpecificMetric",
 
     # Metrics
     "HELPFULNESS",
@@ -90,6 +93,7 @@
     "RELEVANCE",
     "SAFETY",
     "TOXICITY",
+    "LLAMA_GUARD_3_SAFETY",
    "CODE_QUALITY",
    "CODE_SECURITY",
    "CREATIVITY",

src/vllm_judge/judge.py

Lines changed: 36 additions & 9 deletions

@@ -2,7 +2,7 @@
 import re
 from typing import Union, Dict, List, Optional, Tuple, Any, Callable
 
-from vllm_judge.models import JudgeConfig, EvaluationResult, Metric, BatchResult, TemplateEngine
+from vllm_judge.models import JudgeConfig, EvaluationResult, Metric, BatchResult, TemplateEngine, ModelSpecificMetric
 from vllm_judge.client import VLLMClient
 from vllm_judge.prompts import PromptBuilder
 from vllm_judge.batch import BatchProcessor
@@ -14,6 +14,9 @@
     MetricNotFoundError,
     VLLMJudgeError
 )
+import logging
+
+logger = logging.getLogger(__name__)
 
 
 class Judge:
@@ -96,6 +99,22 @@ async def evaluate(
             MetricNotFoundError: If metric name not found
             ParseError: If unable to parse model response
         """
+        # Handle model-specific metrics
+        if isinstance(metric, ModelSpecificMetric):
+            assert isinstance(response, str), "Model-specific metrics only support string content for now"
+
+            # logger.info(f"Evaluating model-specific metric {metric.name}.")
+            logger.info(f"We assume you're using {metric.model_pattern} type model. If not, please do not use this metric and use a normal metric instead.")
+            # Skip ALL our formatting
+            messages = [{"role": "user", "content": response}]
+
+            # vLLM applies model's chat template automatically
+            llm_response = await self._call_model(messages)
+
+            # Use metric's parser
+            return metric.parser_func(llm_response)
+
+        # Handle normal metrics
         # Handle metric parameter
         metric_template_vars = {}
 
@@ -149,14 +168,7 @@ async def evaluate(
         )
 
         # Get LLM response
-        try:
-            if self.config.use_chat_api:
-                llm_response = await self.client.chat_completion(messages)
-            else:
-                prompt = PromptBuilder.format_messages_as_text(messages)
-                llm_response = await self.client.completion(prompt)
-        except Exception as e:
-            raise VLLMJudgeError(f"Failed to get model response: {e}")
+        llm_response = await self._call_model(messages)
 
         # Parse response
         result = self._parse_response(llm_response)
@@ -168,6 +180,21 @@ async def evaluate(
 
         return result
 
+    async def _call_model(self, messages: List[Dict[str, str]]) -> str:
+        """
+        Call the model with the given messages.
+        """
+        try:
+            if self.config.use_chat_api:
+                llm_response = await self.client.chat_completion(messages)
+            else:
+                prompt = PromptBuilder.format_messages_as_text(messages)
+                llm_response = await self.client.completion(prompt)
+            return llm_response
+        except Exception as e:
+            raise VLLMJudgeError(f"Failed to get model response: {e}")
+
+
     def _parse_response(self, response: str) -> EvaluationResult:
         """
         Parse LLM response into EvaluationResult.
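
In short, `evaluate()` now dispatches on the metric type: a `ModelSpecificMetric` sends the response as a single raw user message (so vLLM can apply the model's own chat template) and hands the reply to the metric's `parser_func`, while ordinary metrics keep the existing prompt-building and parsing path through the new `_call_model` helper. A hedged usage sketch; it assumes `judge` is an already-configured `Judge` instance, which is not shown in this diff:

```python
from vllm_judge import LLAMA_GUARD_3_SAFETY, SAFETY

async def check(judge, text: str):
    # Model-specific path: `text` goes to the model as-is and
    # LLAMA_GUARD_3_SAFETY's parser interprets the "safe"/"unsafe" reply.
    guard_result = await judge.evaluate(response=text, metric=LLAMA_GUARD_3_SAFETY)

    # Normal path: the library builds its usual judge prompt around `text`.
    generic_result = await judge.evaluate(response=text, metric=SAFETY)
    return guard_result, generic_result
```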

src/vllm_judge/metrics.py

Lines changed: 9 additions & 1 deletion

@@ -1,5 +1,6 @@
 from typing import Dict
-from vllm_judge.models import Metric,TemplateEngine
+from vllm_judge.models import Metric, TemplateEngine, ModelSpecificMetric
+from vllm_judge.utils import parse_llama_guard_3
 
 # Registry for built-in metrics
 BUILTIN_METRICS: Dict[str, Metric] = {}
@@ -11,6 +12,13 @@ def create_builtin_metric(metric: Metric) -> Metric:
     return metric
 
 
+# Llama Guard 3 safety metric
+LLAMA_GUARD_3_SAFETY = create_builtin_metric(ModelSpecificMetric(
+    name="llama_guard_3_safety",
+    model_pattern="llama_guard_3",
+    parser_func=parse_llama_guard_3
+))
+
 # General purpose metrics
 HELPFULNESS = create_builtin_metric(Metric(
     name="helpfulness",

src/vllm_judge/models.py

Lines changed: 10 additions & 1 deletion

@@ -1,4 +1,4 @@
-from typing import Optional, Any, Dict, Union, List, Tuple
+from typing import Optional, Any, Dict, Union, List, Tuple, Callable
 from pydantic import BaseModel, Field, field_validator, ConfigDict
 from enum import Enum
 
@@ -159,6 +159,15 @@ def _auto_detect_required_vars(self):
     def __repr__(self):
         return f"Metric(name='{self.name}', criteria='{self.criteria}', template_engine='{self.template_engine}')"
 
+# Base class for model-specific metrics
+class ModelSpecificMetric(Metric):
+    """Metric that bypasses our prompt formatting."""
+
+    def __init__(self, name: str, model_pattern: str, parser_func: Callable[[str], EvaluationResult]):
+        super().__init__(name=name, criteria="model-specific evaluation")
+        self.model_pattern = model_pattern
+        self.parser_func = parser_func
+        # self.is_model_specific = True  # Flag for special handling
 
 class BatchResult(BaseModel):
     """Result of batch evaluation."""

src/vllm_judge/utils.py

Lines changed: 14 additions & 0 deletions

@@ -0,0 +1,14 @@
+from vllm_judge.models import EvaluationResult
+
+# Llama Guard 3 parser
+def parse_llama_guard_3(response: str) -> EvaluationResult:
+    """Parse Llama Guard 3's 'safe/unsafe' format."""
+    lines = response.strip().split('\n')
+    is_safe = lines[0].lower().strip() == 'safe'
+
+    return EvaluationResult(
+        decision="safe" if is_safe else "unsafe",
+        reasoning=lines[1] if len(lines) > 1 else "No violations detected",
+        score=None,
+        metadata={"model_type": "llama_guard_3"}
+    )
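
For reference, a quick sketch of how this parser behaves on the two output shapes it expects from Llama Guard 3: a bare "safe", or "unsafe" followed by a violation category code on the next line:

```python
from vllm_judge.utils import parse_llama_guard_3

safe = parse_llama_guard_3("safe")
print(safe.decision, safe.reasoning)      # safe No violations detected

unsafe = parse_llama_guard_3("unsafe\nS9")
print(unsafe.decision, unsafe.reasoning)  # unsafe S9
```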
