Commit c51513d

docs: add live arxiv links (#28)
1 parent 6127b7c commit c51513d

1 file changed: +11 -9 lines changed

README.md

Lines changed: 11 additions & 9 deletions
@@ -20,7 +20,7 @@
 [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
 [![GitHub release (latest by date)](https://img.shields.io/github/v/release/dreadnode/AIRTBench-Code)](https://github.com/dreadnode/AIRTBench-Code/releases)
 
-[![arXiv](https://img.shields.io/badge/arXiv-TODO-b31b1b.svg)](https://arxiv.org/abs/TODO)
+[![arXiv](https://img.shields.io/badge/arXiv-AIRTBench-b31b1b.svg)](https://arxiv.org/abs/2506.14682)
 [![HuggingFace](https://img.shields.io/badge/🤗%20HuggingFace-Dataset-ffca28.svg)](https://huggingface.co/datasets/dreadnode/AIRTBench/blob/main/README.md)
 [![Dreadnode](https://img.shields.io/badge/Dreadnode-Blog-5714928f.svg)](https://dreadnode.io/blog/ai-red-team-benchmark)
 [![Agent Harness](https://img.shields.io/badge/📚_Agent_Harness-Documentation-5714928f.svg)](https://docs.dreadnode.io/strikes/how-to/airtbench-agent)
@@ -33,7 +33,7 @@
 
 ---
 
-This repository contains the implementation of the AIRTBench autonomous AI red teaming agent, complementing our research paper [AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models](https://arxiv.org/abs/TODO) and accompanying blog post, "[Do LLM Agents Have AI Red Team Capabilities? We Built a Benchmark to Find Out](https://dreadnode.io/blog/ai-red-team-benchmark)".
+This repository contains the implementation of the AIRTBench autonomous AI red teaming agent, complementing our research paper [AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models](https://arxiv.org/abs/2506.14682) and accompanying blog post, "[Do LLM Agents Have AI Red Team Capabilities? We Built a Benchmark to Find Out](https://dreadnode.io/blog/ai-red-team-benchmark)".
 
 The AIRTBench agent is designed to evaluate the autonomous red teaming capabilities of large language models (LLMs) through AI/ML Capture The Flag (CTF) challenges. Our agent systematically exploits LLM-based targets by solving challenges on the Dreadnode Strikes platform, providing a standardized benchmark for measuring adversarial AI capabilities.
 
@@ -109,7 +109,7 @@ Check out [the challenge manifest](./airtbench/challenges/.challenges.yaml) to s
 
 ## Resources
 
-- [📄 Paper on arXiv](https://arxiv.org/abs/TODO)
+- [📄 Paper on arXiv](https://arxiv.org/abs/2506.14682)
 - [📝 Blog post](https://dreadnode.io/blog/ai-red-team-benchmark)
 
 ## Dataset
@@ -122,12 +122,14 @@ Check out [the challenge manifest](./airtbench/challenges/.challenges.yaml) to s
 If you find our work helpful, please use the following citations.
 
 ```bibtex
-@misc{TODO,
-title = {AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models},
-author = {TODO},
-year = {2025},
-eprint = {arXiv:TODO},
-url = {https://arxiv.org/abs/TODO}
+@misc{dawson2025airtbenchmeasuringautonomousai,
+title={AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models},
+author={Ads Dawson and Rob Mulla and Nick Landers and Shane Caldwell},
+year={2025},
+eprint={2506.14682},
+archivePrefix={arXiv},
+primaryClass={cs.CR},
+url={https://arxiv.org/abs/2506.14682},
 }
 ```
 