
Commit e960908

fix typos
1 parent 2050a28 commit e960908

2 files changed: +5 -5 lines changed

detectors/llm_judge/README.md

Lines changed: 1 addition & 1 deletion

@@ -1,6 +1,6 @@
 # LLM Judge Detector
 
-The LLM Judge detector integrates the [vLLM Judge](https://github.com/trustyai-explainability/vllm_judge) into the Guardrails Detector ecosystem. Please refer [llm_judge_examples](docs/llm_judge_examples.md) for usage details.
+The LLM Judge detector integrates the [vLLM Judge](https://github.com/trustyai-explainability/vllm_judge) into the Guardrails Detector ecosystem. Please refer [llm_judge_examples](../../docs/llm_judge_examples.md) for usage details.
 
 ```
 oc apply -f deploy/servingruntime.yaml

docs/llm_judge_examples.md

Lines changed: 4 additions & 4 deletions

@@ -27,7 +27,7 @@ podman run --rm -p 8000:8000 -e VLLM_BASE_URL=http://host.docker.internal:9090 $
 
 Instead of writing complex prompts let us use pre-defined prompts for common scenarios. vLLM-Judge library provides various builtin metrics (which are pre-defined prompts) including moderation (`safety`, `toxicity`, `bias`), general purpose (`helpfulness`, `clarity`, `conciseness`, `relevance`, `coherence`) and others (like code quality, content quality). Let's use one of the pre-defined metrics from vLLM-Judge, eg., `safety`.
 
-The `safety` metric evaluates the content safety across multiple dimensions like physical harm, psycological harm, social harm and vulnerable groups.
+The `safety` metric evaluates the content safety across multiple dimensions like physical harm, psychological harm, social harm and vulnerable groups.
 
 **Request:**
 ```bash
@@ -134,9 +134,9 @@ curl -s -X POST \
 ]
 ```
 
-We get pretty ok results where model uses positive label (like 'True') and higher scores (like 1.0) for positive instances i.e, that satisfy the criteria and similarly negative label ('FAIL') and lower score (0.2) for negative instances i.e, that does not satisfy the criteria.
+We get pretty ok results where model uses positive label (like 'True') and higher scores (like 1.0) for positive instances i.e., those that satisfy the criteria and similarly negative label ('FAIL') and lower score (0.2) for negative instances i.e., those that do not satisfy the criteria.
 
-But how to specifically say which labels to use and how to assign scores? This is where the `rubric` parameter comes in.
+But how do you specify which labels to use and how to assign scores? This is where the `rubric` parameter comes in.
 
 #### Example 3: Custom Labels and Scoring with Rubrics
 
@@ -265,7 +265,7 @@ Below is the full list of parameters that can be passed to `detector_params` to
 - `template_vars`: Variable mapping to substitute in templates
 - `template_engine`: Template engine to use ('format' or 'jinja2'), default is 'format'
 - `system_prompt`: Custom system message to take full control of the evaluator LLM persona
-- `examples`: Few-shot examples. List of JSON objects, each JSON represents an example and must contain `content`, `score`, and `reasoning` fields and `reasoning` fields
+- `examples`: Few-shot examples. List of JSON objects, each JSON represents an example and must contain `content`, `score`, and `reasoning` fields
 
 ### Get list of pre-defined metric names:
 
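As context for the `examples` wording fixed above: a minimal sketch of how few-shot examples might be passed through `detector_params`, assuming the detector is reachable on the 8000 port mapping shown in the first hunk header. The endpoint path, the `contents` and `criteria` fields, and the example values are illustrative assumptions, not taken from this commit; only the `content`, `score`, and `reasoning` fields come from the corrected line.

```bash
# Illustrative sketch only: endpoint and payload shape around detector_params
# are assumed, not part of this commit. Each object under "examples" carries
# the three fields noted in the diff: content, score, and reasoning.
curl -s -X POST \
  -H "Content-Type: application/json" \
  -d '{
        "contents": ["I was charged twice and nobody answers my emails."],
        "detector_params": {
          "criteria": "the text is a customer complaint",
          "examples": [
            {"content": "Great service, thank you!", "score": 0.1, "reasoning": "Positive feedback, not a complaint"},
            {"content": "My order arrived broken and support ignored me.", "score": 0.9, "reasoning": "Reports a problem and expresses dissatisfaction"}
          ]
        }
      }' \
  http://localhost:8000/api/v1/text/contents
```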