-This repository contains the implementation of the AIRTBench autonomous AI red teaming agent, complementing our research paper [AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models](https://arxiv.org/abs/TODO) and accompanying blog post, "[Do LLM Agents Have AI Red Team Capabilities? We Built a Benchmark to Find Out](https://dreadnode.io/blog/ai-red-team-benchmark)".
+This repository contains the implementation of the AIRTBench autonomous AI red teaming agent, complementing our research paper [AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models](https://arxiv.org/abs/2506.14682) and accompanying blog post, "[Do LLM Agents Have AI Red Team Capabilities? We Built a Benchmark to Find Out](https://dreadnode.io/blog/ai-red-team-benchmark)".

 The AIRTBench agent is designed to evaluate the autonomous red teaming capabilities of large language models (LLMs) through AI/ML Capture The Flag (CTF) challenges. Our agent systematically exploits LLM-based targets by solving challenges on the Dreadnode Strikes platform, providing a standardized benchmark for measuring adversarial AI capabilities.
@@ -109,7 +109,7 @@ Check out [the challenge manifest](./airtbench/challenges/.challenges.yaml) to s

 ## Resources

-- [📄 Paper on arXiv](https://arxiv.org/abs/TODO)
+- [📄 Paper on arXiv](https://arxiv.org/abs/2506.14682)
 - [📝 Blog post](https://dreadnode.io/blog/ai-red-team-benchmark)

 ## Dataset
@@ -122,12 +122,14 @@ Check out [the challenge manifest](./airtbench/challenges/.challenges.yaml) to s
 If you find our work helpful, please use the following citations.

 ```bibtex
-@misc{TODO,
-title = {AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models},
-author = {TODO},
-year = {2025},
-eprint = {arXiv:TODO},
-url = {https://arxiv.org/abs/TODO}
+@misc{dawson2025airtbenchmeasuringautonomousai,
+title={AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models},
+author={Ads Dawson and Rob Mulla and Nick Landers and Shane Caldwell},