feat: airtbench ai agent code #1
Merged: GangGreenTemperTatum merged 14 commits into main from ads/eng-2127-feat-create-airtbench-code-repo-and-merge-in-example-agents on Jun 5, 2025
Changes from 4 commits
Commits
9a5ae64 feat: airtbench ai agent code (GangGreenTemperTatum)
04c9dc9 fix: add groq generator for pr janitor (GangGreenTemperTatum)
70a6402 Merge branch 'main' into ads/eng-2127-feat-create-airtbench-code-repo… (GangGreenTemperTatum)
2d0fbcb chore: add env example placeholders (GangGreenTemperTatum)
38ef6c1 fix: rm dup torchvision (GangGreenTemperTatum)
cfaf123 fix: rm callisto refs (GangGreenTemperTatum)
7902f9d fix: rm callisto refs in notebook (GangGreenTemperTatum)
f41a7cd chore: rm duckdb from toml (GangGreenTemperTatum)
53a1e77 chore: bear4 notebook metadata (GangGreenTemperTatum)
7e1a771 chore: standardize single env var for challenges n strikes (GangGreenTemperTatum)
51e77b6 chore: bear4 queries feedback (GangGreenTemperTatum)
e245e3a chore: neutral terminology in bear4 notebook (GangGreenTemperTatum)
5e919bd chore: add max retries to flag submission (GangGreenTemperTatum)
e68954f fix: platform hyperlink (GangGreenTemperTatum)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| DREADNODE_SERVER_URL="https://platform.dreadnode.io" | ||
| # See https://platform.dreadnode.io/account to get your API token and key (same value) | ||
| DREADNODE_API_TOKEN=YOUR_DREADNODE_API_TOKEN | ||
| DREADNODE_API_KEY=YOUR_DREADNODE_API_KEY | ||
| DREADNODE_LOCAL_DIR="runs/" | ||
| LOGFIRE_IGNORE_NO_CONFIG=1 | ||
| | ||
| # AI provider API keys (replace <ADD_API_KEY> with your actual key) | ||
| ANTHROPIC_API_KEY=<ADD_API_KEY> | ||
| GROQ_API_KEY=<ADD_API_KEY> | ||
| OPENAI_API_KEY=<ADD_API_KEY> | ||
| TOGETHER_AI_API_KEY=<ADD_API_KEY> | ||
| GEMINI_API_KEY=<ADD_API_KEY> | ||
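
A quick way to tell whether the example values above have been replaced is to scan the environment for the placeholder strings. The following is a minimal sketch, not code from this PR: the variable names and placeholder values are copied from the env example, while `is_placeholder`, `unconfigured`, and the split into `REQUIRED` and `PROVIDER_KEYS` are hypothetical names introduced for illustration.

```python
from __future__ import annotations

import os

# Variable names copied from the env example above. Treating the first group
# as "required" is an assumption made for this sketch.
REQUIRED = ("DREADNODE_SERVER_URL", "DREADNODE_API_TOKEN", "DREADNODE_API_KEY")
PROVIDER_KEYS = (
    "ANTHROPIC_API_KEY",
    "GROQ_API_KEY",
    "OPENAI_API_KEY",
    "TOGETHER_AI_API_KEY",
    "GEMINI_API_KEY",
)


def is_placeholder(value: str | None) -> bool:
    """Return True when a value is unset, empty, or still an example placeholder."""
    if not value:
        return True
    # The example file uses "<ADD_API_KEY>" and "YOUR_..." as placeholder values.
    return value == "<ADD_API_KEY>" or value.startswith("YOUR_")


def unconfigured(env: dict[str, str], names: tuple[str, ...]) -> list[str]:
    """Hypothetical helper: list the names that still need a real value."""
    return [name for name in names if is_placeholder(env.get(name))]


if __name__ == "__main__":
    env = dict(os.environ)
    print("Dreadnode settings still unset:", unconfigured(env, REQUIRED))
    print("Provider keys still unset:", unconfigured(env, PROVIDER_KEYS))
```

Since the comment in the env example says the API token and key share the same value, a check like this will typically report both or neither.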
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| *.bak | ||
| *.removed_notebooks/ | ||
Empty file.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| from .main import app | ||
| | ||
| if __name__ == "__main__": | ||
| app() | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| import pathlib | ||
| | ||
| import yaml # type: ignore [import-untyped] | ||
| from pydantic import BaseModel | ||
| | ||
| current_dir = pathlib.Path(__file__).parent | ||
| challenges_dir = current_dir / "challenges" | ||
| | ||
| | ||
| class Challenge(BaseModel): | ||
| id: str | ||
| name: str | ||
| category: str | ||
| difficulty: str | ||
| notebook: str | ||
| is_llm: bool = False | ||
| | ||
| | ||
| def load_challenges() -> list[Challenge]: | ||
| """ | ||
| Load challenges from the .challenges.yaml file in the challenges directory. | ||
| | ||
| Returns: | ||
| list[Challenge]: A list of Challenge objects. | ||
| """ | ||
| with (challenges_dir / ".challenges.yaml").open() as f: | ||
| return [Challenge(id=key, **challenge) for key, challenge in yaml.safe_load(f).items()] | ||
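
The comprehension in `load_challenges` implies that `.challenges.yaml` parses to a mapping from challenge id to a dictionary of the remaining fields. The sketch below illustrates that shape under stated assumptions: it uses a standard-library dataclass as a stand-in for the pydantic model so that it runs without third-party packages (so it does not validate field types the way `BaseModel` does), and the entries in `raw` are made-up examples, not contents of the real file. `challenges_from_mapping` is a hypothetical name.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Challenge:
    """Stand-in for the pydantic model in the diff (no type validation here)."""

    id: str
    name: str
    category: str
    difficulty: str
    notebook: str
    is_llm: bool = False


def challenges_from_mapping(raw: dict[str, dict]) -> list[Challenge]:
    """Hypothetical helper: the top-level key becomes `id`, the rest are fields."""
    return [Challenge(id=key, **fields) for key, fields in raw.items()]


# Assumed shape of the parsed YAML; these two entries are invented examples.
raw = {
    "example_one": {
        "name": "Example One",
        "category": "demo",
        "difficulty": "easy",
        "notebook": "example_one.ipynb",
    },
    "example_two": {
        "name": "Example Two",
        "category": "demo",
        "difficulty": "hard",
        "notebook": "example_two.ipynb",
        "is_llm": True,
    },
}

challenges = challenges_from_mapping(raw)
for challenge in challenges:
    print(challenge.id, challenge.difficulty, challenge.is_llm)
```

Because `is_llm` has a default, an entry may omit it; any other missing field, or an unexpected extra key, raises a `TypeError` in this dataclass version.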