README.md: 15 additions & 19 deletions
@@ -34,9 +34,9 @@ The paper is available on [arXiV](TODO) and [ACL Anthology](TODO).
 - [Code for the "AIRTBench" AI Red Teaming Agent](#code-for-the-airtbench-ai-red-teaming-agent)
 - [Setup](#setup)
 - [Run the Evaluation](#run-the-evaluation)
+- [Basic Usage](#basic-usage)
+- [Challenge Filtering](#challenge-filtering)
 - [Model requests](#model-requests)
-- [Support the Project and Contributing](#support-the-project-and-contributing)
-- [Star History](#star-history)
 
 ## Setup
 
@@ -50,17 +50,25 @@ uv sync
 
 <mark>In order to run the code, you will need access to the Dreadnode strikes platform, see the [docs](https://docs.Dreadnode.io/strikes/overview) or submit for the Strikes waitlist [here](https://platform.dreadnode.io/waitlist/strikes)</mark>.
 
-This [rigging](https://docs.dreadnode.io/open-source/rigging/intro)-based agent works to solve a variety of AI ML CTF challenges from the dreadnode [Crucible](https://platform.dreadnode.io/crucible) platform and given access to execute python commands on a network-local container with custom [Dockerfile](./ai_ctf/container/Dockerfile). This example-agent is also a compliment to our research paper [AIRTBench: Can Language Models Autonomously Exploit
+This [rigging](https://docs.dreadnode.io/open-source/rigging/intro)-based agent works to solve a variety of AI ML CTF challenges from the dreadnode [Crucible](https://platform.dreadnode.io/crucible) platform and given access to execute python commands on a network-local container with custom [Dockerfile](./airtbench/container/Dockerfile). This example-agent is also a compliment to our research paper [AIRTBench: Can Language Models Autonomously Exploit
 Language Models?](https://arxiv.org/abs/TODO). # TODO: Add link to paper once published.
 To run the agent against challenges that match the `is_llm:true` criteria, which are LLM-based challenges, you can use the following command:
 
 ```bash
-uv run -m ai_ctf --model <model> --llm-challenges-only
+uv run -m airtbench --model <model> --llm-challenges-only
 ```
 
 The harness will automatically build the defined number of containers with the supplied flag, and load them
@@ -73,21 +81,9 @@ as needed to ensure they are network-isolated from each other. The process is ge
 5. If the CTF challenge is solved and flag is observed, the agent must submit the flag
 6. Otherwise run until an error, give up, or max-steps is reached
 
-Check out [the challenge manifest](./ai_ctf/challenges/.challenges.yaml) to see current challenges in scope.
+Check out [the challenge manifest](./airtbench/challenges/.challenges.yaml) to see current challenges in scope.
 
 
 ## Model requests
 
-If you know of a model that may be interesting to analyze, but do not have the resources to run it yourself, feel free to open a feature request via a GitHub issue.
-
-## Support the Project and Contributing
-
-We welcome any issues or contributions to the project, share the treasure! If you like our project, please feel free to drop us some love <3
+If you know of a model that may be interesting to analyze, but do not have the resources to run it yourself, feel free to open a feature request via a GitHub issue.