Commit cdba06e

Add NSFW docs
1 parent d934a65 commit cdba06e

3 files changed, with 8 additions and 5 deletions

docs/ref/checks/nsfw.md

Lines changed: 6 additions & 4 deletions
@@ -82,10 +82,12 @@ This benchmark evaluates model performance on a balanced set of social media pos

 | Model | ROC AUC | Prec@R=0.80 | Prec@R=0.90 | Prec@R=0.95 | Recall@FPR=0.01 |
 |--------------|---------|-------------|-------------|-------------|-----------------|
-| gpt-4.1 | 0.989 | 0.976 | 0.962 | 0.962 | 0.717 |
-| gpt-4.1-mini (default) | 0.984 | 0.977 | 0.977 | 0.943 | 0.653 |
-| gpt-4.1-nano | 0.952 | 0.972 | 0.823 | 0.823 | 0.429 |
-| gpt-4o-mini | 0.965 | 0.977 | 0.955 | 0.945 | 0.842 |
+| gpt-5 | 0.9532 | 0.9195 | 0.9096 | 0.9068 | 0.0339 |
+| gpt-5-mini | 0.9629 | 0.9321 | 0.9168 | 0.9149 | 0.0998 |
+| gpt-5-nano | 0.9600 | 0.9297 | 0.9216 | 0.9175 | 0.1078 |
+| gpt-4.1 | 0.9603 | 0.9312 | 0.9249 | 0.9192 | 0.0439 |
+| gpt-4.1-mini (default) | 0.9520 | 0.9180 | 0.9130 | 0.9049 | 0.0459 |
+| gpt-4.1-nano | 0.9502 | 0.9262 | 0.9094 | 0.9043 | 0.0379 |

 **Notes:**

mkdocs.yml

Lines changed: 2 additions & 1 deletion
@@ -38,13 +38,14 @@ nav:
 - "Streaming vs Blocking": streaming_output.md
 - Tripwires: tripwires.md
 - Checks:
-- Prompt Injection Detection: ref/checks/prompt_injection_detection.md
 - Contains PII: ref/checks/pii.md
 - Custom Prompt Check: ref/checks/custom_prompt_check.md
 - Hallucination Detection: ref/checks/hallucination_detection.md
 - Jailbreak Detection: ref/checks/jailbreak.md
 - Moderation: ref/checks/moderation.md
+- NSFW: ref/checks/nsfw.md
 - Off Topic Prompts: ref/checks/off_topic_prompts.md
+- Prompt Injection Detection: ref/checks/prompt_injection_detection.md
 - URL Filter: ref/checks/urls.md
 - Evaluation Tool: evals.md
 - API Reference:
