Commit 02d0fd6

feat: airtbench ai agent code (#1)
* feat: airtbench ai agent code
* fix: add groq generator for pr janitor
* chore: add env example placeholders
* fix: rm dup torchvision
* fix: rm callisto refs
* fix: rm callisto refs in notebook
* chore: rm duckdb from toml
* chore: bear4 notebook metadata
* chore: standardize single env var for challenges n strikes
* chore: bear4 queries feedback
* chore: neutral terminology in bear4 notebook
* chore: add max retries to flag submission
* fix: platform hyperlink
1 parent 72cf1cf commit 02d0fd6

53 files changed, +11461 -41 lines changed

.env.example

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+DREADNODE_SERVER_URL="https://platform.dreadnode.io"
+# See https://platform.dreadnode.io/account to get your API token and key (same value)
+DREADNODE_TOKEN=YOUR_DREADNODE_API_TOKEN
+DREADNODE_LOCAL_DIR="runs/"
+LOGFIRE_IGNORE_NO_CONFIG=1
+
+# AI provider API keys (replace <ADD_API_KEY> with your actual key)
+ANTHROPIC_API_KEY=<ADD_API_KEY>
+GROQ_API_KEY=<ADD_API_KEY>
+OPENAI_API_KEY=<ADD_API_KEY>
+TOGETHER_AI_API_KEY=<ADD_API_KEY>
+GEMINI_API_KEY=<ADD_API_KEY>
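
A note on the placeholders above: a value left as `<ADD_API_KEY>` or `YOUR_DREADNODE_API_TOKEN` is still a non-empty string, so a plain "is it set?" check would pass. The sketch below is one way a caller might tell real values from placeholders. The helper names `configured_providers` and `require_token` are hypothetical and are not part of this commit; only the variable names and placeholder strings come from `.env.example`.

```python
import os
from typing import Mapping

# Placeholder values copied from .env.example; a variable still set to one of
# these (or to an empty string) is treated as "not configured".
PLACEHOLDERS = {"<ADD_API_KEY>", "YOUR_DREADNODE_API_TOKEN", ""}

# Provider key names as they appear in .env.example.
PROVIDER_KEYS = (
    "ANTHROPIC_API_KEY",
    "GROQ_API_KEY",
    "OPENAI_API_KEY",
    "TOGETHER_AI_API_KEY",
    "GEMINI_API_KEY",
)


def configured_providers(env: Mapping[str, str]) -> list[str]:
    """Return the provider key names whose values are not placeholders."""
    return [k for k in PROVIDER_KEYS if env.get(k, "").strip() not in PLACEHOLDERS]


def require_token(env: Mapping[str, str]) -> str:
    """Return DREADNODE_TOKEN, raising if it is missing or still a placeholder."""
    token = env.get("DREADNODE_TOKEN", "").strip()
    if token in PLACEHOLDERS:
        raise RuntimeError("DREADNODE_TOKEN is not set; see .env.example")
    return token


if __name__ == "__main__":
    # Reads the real process environment; the .env file itself is assumed to
    # have been loaded by whatever tooling the project uses.
    print("configured providers:", configured_providers(os.environ))
```

Passing the environment in as a mapping keeps both helpers easy to exercise without touching `os.environ`.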

.gitignore

Lines changed: 0 additions & 9 deletions
@@ -172,15 +172,6 @@ runs/*.jsonl
 # Datasets
 
 datasets/
-
-# Callisto
-
-callisto/challenges/public*
-
-# JSON we want
-
-!callisto/analysis/openai_challenges.json
-
 *.parquet
 *.json
 *.csv

.hooks/generate_pr_description.py

Lines changed: 1 addition & 1 deletion
@@ -66,7 +66,7 @@ def get_diff(base_ref: str, source_ref: str, *, exclude: list[str] | None = None
 def main(
     base_ref: str = "origin/main",
     source_ref: str = "HEAD",
-    generator_id: str = "openai/gpt-4o-mini",
+    generator_id: str = "groq/meta-llama/llama-4-scout-17b-16e-instruct",
     max_diff_lines: int = 1000,
     exclude: list[str] | None = None,
 ) -> None:

.pre-commit-config.yaml

Lines changed: 0 additions & 12 deletions
@@ -74,7 +74,6 @@ repos:
           - "types-PyYAML"
           - "types-requests"
           - "types-setuptools"
-        exclude: (callisto/scripts/|scripts/challenge_manager\.py)
 
   - repo: local
     hooks:
@@ -90,14 +89,3 @@ repos:
         entry: .hooks/prettier.sh
         language: script
         types: [json, yaml]
-
-      - id: check-challenges
-        name: Check challenge updates
-        entry: python scripts/challenge_manager.py --check
-        language: python
-        pass_filenames: false
-        always_run: true
-        files: ^callisto/challenges/
-        additional_dependencies:
-          - pyyaml
-          - jinja2

.vscode/settings.json

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+{
+  "[python]": {
+    "editor.formatOnSave": true,
+    "editor.codeActionsOnSave": {
+      "source.fixAll": "explicit",
+      "source.organizeImports": "explicit"
+    },
+    "editor.defaultFormatter": "charliermarsh.ruff"
+  },
+  "python.testing.pytestArgs": [
+    "tests"
+  ],
+  "python.testing.unittestEnabled": false,
+  "python.testing.pytestEnabled": true,
+  "mypy.runUsingActiveInterpreter": true,
+  "debugpy.debugJustMyCode": false,
+  "jupyter.debugJustMyCode": false
+}

README.md

Lines changed: 15 additions & 19 deletions
@@ -34,9 +34,9 @@ The paper is available on [arXiV](TODO) and [ACL Anthology](TODO).
 - [Code for the "AIRTBench" AI Red Teaming Agent](#code-for-the-airtbench-ai-red-teaming-agent)
 - [Setup](#setup)
 - [Run the Evaluation](#run-the-evaluation)
+- [Basic Usage](#basic-usage)
+- [Challenge Filtering](#challenge-filtering)
 - [Model requests](#model-requests)
-- [Support the Project and Contributing](#support-the-project-and-contributing)
-- [Star History](#star-history)
 
 ## Setup
 
@@ -50,17 +50,25 @@ uv sync
 
 <mark>In order to run the code, you will need access to the Dreadnode strikes platform, see the [docs](https://docs.Dreadnode.io/strikes/overview) or submit for the Strikes waitlist [here](https://platform.dreadnode.io/waitlist/strikes)</mark>.
 
-This [rigging](https://docs.dreadnode.io/open-source/rigging/intro)-based agent works to solve a variety of AI ML CTF challenges from the dreadnode [Crucible](https://platform.dreadnode.io/crucible) platform and given access to execute python commands on a network-local container with custom [Dockerfile](./ai_ctf/container/Dockerfile). This example-agent is also a compliment to our research paper [AIRTBench: Can Language Models Autonomously Exploit
+This [rigging](https://docs.dreadnode.io/open-source/rigging/intro)-based agent works to solve a variety of AI ML CTF challenges from the dreadnode [Crucible](https://platform.dreadnode.io/crucible) platform and given access to execute python commands on a network-local container with custom [Dockerfile](./airtbench/container/Dockerfile). This example-agent is also a compliment to our research paper [AIRTBench: Can Language Models Autonomously Exploit
 Language Models?](https://arxiv.org/abs/TODO). # TODO: Add link to paper once published.
 
 ```bash
-uv run -m ai_ctf --help
+uv run -m airtbench --help
 ```
 
+### Basic Usage
+
+```bash
+uv run -m airtbench --model $MODEL --project $PROJECT --platform-api-key $DREADNODE_TOKEN --token $DREADNODE_TOKEN --server https://platform.dreadnode.io --max-steps 100 --inference_timeout 240 --enable-cache --no-give-up --challenges bear1 bear2
+```
+
+### Challenge Filtering
+
 To run the agent against challenges that match the `is_llm:true` criteria, which are LLM-based challenges, you can use the following command:
 
 ```bash
-uv run -m ai_ctf --model <model> --llm-challenges-only
+uv run -m airtbench --model <model> --llm-challenges-only
 ```
 
 The harness will automatically build the defined number of containers with the supplied flag, and load them
@@ -73,21 +81,9 @@ as needed to ensure they are network-isolated from each other. The process is ge
 5. If the CTF challenge is solved and flag is observed, the agent must submit the flag
 6. Otherwise run until an error, give up, or max-steps is reached
 
-Check out [the challenge manifest](./ai_ctf/challenges/.challenges.yaml) to see current challenges in scope.
+Check out [the challenge manifest](./airtbench/challenges/.challenges.yaml) to see current challenges in scope.
 
 
 ## Model requests
 
-If you know of a model that may be interesting to analyze, but do not have the resources to run it yourself, feel free to open a feature request via a GitHub issue.
-
-## Support the Project and Contributing
-
-We welcome any issues or contributions to the project, share the treasure! If you like our project, please feel free to drop us some love <3
-
-### Star History
-
-[![GitHub stars](https://img.shields.io/github/stars/dreadnode/AIRTBench-Code?style=social)](https://github.com/dreadnode/AIRTBench-Code/stargazers)
-
-By watching the repo, you can also be notified of any upcoming releases.
-
-<img src="https://api.star-history.com/svg?repos=dreadnode/AIRTBench-Code&type=Date" width="600" height="400">
+If you know of a model that may be interesting to analyze, but do not have the resources to run it yourself, feel free to open a feature request via a GitHub issue.
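
For readers of the README changes: the two selection modes it documents (`--challenges bear1 bear2` and `--llm-challenges-only`) can be pictured as a filter over manifest entries. The following is only a sketch under stated assumptions, not the harness code, which is not shown in this part of the commit. The function `select_challenges`, the class `ChallengeEntry`, and the `is_llm` values assigned to the sample entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ChallengeEntry:
    """Hypothetical stand-in for a manifest entry; only the fields used here."""

    id: str
    is_llm: bool = False


def select_challenges(
    entries: Iterable[ChallengeEntry],
    *,
    llm_only: bool = False,
    ids: Optional[Iterable[str]] = None,
) -> list[ChallengeEntry]:
    """Keep entries that match the requested ids and, optionally, is_llm=True."""
    wanted = set(ids) if ids else None
    return [
        e
        for e in entries
        if (not llm_only or e.is_llm) and (wanted is None or e.id in wanted)
    ]


# Sample data: the ids appear in the README and commit message, but the
# is_llm values are invented purely for illustration.
manifest = [
    ChallengeEntry("bear1", is_llm=False),
    ChallengeEntry("bear2", is_llm=False),
    ChallengeEntry("bear4", is_llm=True),
]

print([e.id for e in select_challenges(manifest, llm_only=True)])
print([e.id for e in select_challenges(manifest, ids=["bear1", "bear2"])])
```

Whether the real CLI combines the two filters with a logical AND, as this sketch does, is an assumption.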

airtbench/.gitignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+*.bak
+*.removed_notebooks/

airtbench/__init__.py

Whitespace-only changes.

airtbench/__main__.py

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+from .main import app
+
+if __name__ == "__main__":
+    app()

airtbench/challenges.py

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+import pathlib
+
+import yaml  # type: ignore [import-untyped]
+from pydantic import BaseModel
+
+current_dir = pathlib.Path(__file__).parent
+challenges_dir = current_dir / "challenges"
+
+
+class Challenge(BaseModel):
+    id: str
+    name: str
+    category: str
+    difficulty: str
+    notebook: str
+    is_llm: bool = False
+
+
+def load_challenges() -> list[Challenge]:
+    """
+    Load challenges from the .challenges.yaml file in the challenges directory.
+
+    Returns:
+        list[Challenge]: A list of Challenge objects.
+    """
+    with (challenges_dir / ".challenges.yaml").open() as f:
+        return [Challenge(id=key, **challenge) for key, challenge in yaml.safe_load(f).items()]
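
Review note on `load_challenges`: the comprehension implies that `.challenges.yaml` is a mapping whose keys become each challenge's `id` and whose values supply the remaining fields. The sketch below illustrates that shape with the standard library only. It swaps in a `dataclass` for the pydantic model and a literal dict for the parsed YAML, and the sample entries (`example1`, `example2` and their field values) are invented, since the manifest itself is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Challenge:
    """Dataclass stand-in for the pydantic model in the diff (same fields)."""

    id: str
    name: str
    category: str
    difficulty: str
    notebook: str
    is_llm: bool = False


def challenges_from_mapping(raw: dict[str, dict]) -> list[Challenge]:
    """Turn {id: {field: value, ...}} into Challenge objects, as the diff does."""
    return [Challenge(id=key, **fields) for key, fields in raw.items()]


# Hypothetical data standing in for what yaml.safe_load() might return.
raw = {
    "example1": {
        "name": "Example One",
        "category": "demo",
        "difficulty": "easy",
        "notebook": "example1.ipynb",
    },
    "example2": {
        "name": "Example Two",
        "category": "demo",
        "difficulty": "hard",
        "notebook": "example2.ipynb",
        "is_llm": True,
    },
}

for challenge in challenges_from_mapping(raw):
    print(challenge.id, challenge.is_llm)
```

One thing worth checking in the real function: `yaml.safe_load` returns `None` for an empty file, so `.items()` would raise `AttributeError` if the manifest were ever empty.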
