Skip to content

Commit a23d512

Browse files
authored
Fix links to the blogpost (README.md)
Thanks for sharing the benchmark The PR fixes the links to the blogpost ;) Signed-off-by: Josselin Feist <[email protected]>
1 parent 2ab5cd4 commit a23d512

File tree

1 file changed

+4
-4
lines changed

1 file changed

+4
-4
lines changed

README.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -22,7 +22,7 @@
2222

2323
[![arXiv](https://img.shields.io/badge/arXiv-AIRTBench-b31b1b.svg)](https://arxiv.org/abs/2506.14682)
2424
[![HuggingFace](https://img.shields.io/badge/🤗%20HuggingFace-Dataset-ffca28.svg)](https://huggingface.co/datasets/dreadnode/AIRTBench/blob/main/README.md)
25-
[![Dreadnode](https://img.shields.io/badge/Dreadnode-Blog-5714928f.svg)](https://dreadnode.io/blog/ai-red-team-benchmark)
25+
[![Dreadnode](https://img.shields.io/badge/Dreadnode-Blog-5714928f.svg)](https://dreadnode.io/blog/ai-red-team-benchmark-)
2626
[![Agent Harness](https://img.shields.io/badge/📚_Agent_Harness-Documentation-5714928f.svg)](https://docs.dreadnode.io/strikes/how-to/airtbench-agent)
2727

2828
[![GitHub stars](https://img.shields.io/github/stars/dreadnode/AIRTBench-Code?style=social)](https://github.com/dreadnode/AIRTBench-Code/stargazers)
@@ -33,7 +33,7 @@
3333

3434
---
3535

36-
This repository contains the implementation of the AIRTBench autonomous AI red teaming agent, complementing our research paper [AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models](https://arxiv.org/abs/2506.14682) and accompanying blog post, "[Do LLM Agents Have AI Red Team Capabilities? We Built a Benchmark to Find Out](https://dreadnode.io/blog/ai-red-team-benchmark)".
36+
This repository contains the implementation of the AIRTBench autonomous AI red teaming agent, complementing our research paper [AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models](https://arxiv.org/abs/2506.14682) and accompanying blog post, "[Do LLM Agents Have AI Red Team Capabilities? We Built a Benchmark to Find Out](https://dreadnode.io/blog/ai-red-team-benchmark-)".
3737

3838
The AIRTBench agent is designed to evaluate the autonomous red teaming capabilities of large language models (LLMs) through AI/ML Capture The Flag (CTF) challenges. Our agent systematically exploits LLM-based targets by solving challenges on the Dreadnode Strikes platform, providing a standardized benchmark for measuring adversarial AI capabilities.
3939

@@ -110,7 +110,7 @@ Check out [the challenge manifest](./airtbench/challenges/.challenges.yaml) to s
110110
## Resources
111111

112112
- [📄 Paper on arXiv](https://arxiv.org/abs/2506.14682)
113-
- [📝 Blog post](https://dreadnode.io/blog/ai-red-team-benchmark)
113+
- [📝 Blog post](https://dreadnode.io/blog/ai-red-team-benchmark-)
114114

115115
## Dataset
116116

@@ -135,4 +135,4 @@ If you find our work helpful, please use the following citations.
135135

136136
## Model requests
137137

138-
If you know of a model that may be interesting to analyze, but do not have the resources to run it yourself, feel free to open a feature request via a GitHub issue.
138+
If you know of a model that may be interesting to analyze, but do not have the resources to run it yourself, feel free to open a feature request via a GitHub issue.

0 commit comments

Comments
 (0)