Conversation

Collaborator

@GangGreenTemperTatum GangGreenTemperTatum commented Jun 14, 2025

Add failed flag submissions CSV as part of the dataset

Key Changes:

  • Add the failed flag submissions CSV as part of the dataset

Generated Summary:

  • Changed the .gitignore entry from “datasets/” to “dataset/” and added an exception for “failed_flag_submissions.csv”, which will now be tracked.
  • Added a new file “dataset/failed_flag_submissions.csv” containing numerous flag submission records (including model display names, challenges, failed attempt counts, and last flag submissions).
  • The update ensures that the flag submission data is no longer excluded by git, so any automated flag processing or tracking that reads from “dataset/” may now pick up this file.
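
As a rough illustration of how the tracked CSV might be consumed, here is a minimal sketch that totals failed attempts per model. The column names (`model_display_name`, `challenge`, `failed_attempts`, `last_flag_submission`) are assumptions inferred from the summary above, not the file's confirmed header, and the sample rows are placeholders rather than real records.

```python
import csv
import io
from collections import defaultdict

# Placeholder rows for illustration only. The header below is hypothetical;
# the real dataset/failed_flag_submissions.csv may use different column names.
SAMPLE = """model_display_name,challenge,failed_attempts,last_flag_submission
model-a,challenge-1,3,placeholder-flag-1
model-a,challenge-2,1,placeholder-flag-2
model-b,challenge-1,5,placeholder-flag-3
"""


def failed_attempts_by_model(csv_file):
    """Sum the failed attempt counts for each model across all challenges."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row["model_display_name"]] += int(row["failed_attempts"])
    return dict(totals)


totals = failed_attempts_by_model(io.StringIO(SAMPLE))
print(totals)
```

To run this against the real file, one would pass an open file handle (for example `open("dataset/failed_flag_submissions.csv", newline="")`) instead of the in-memory sample, after adjusting the column names to match the actual header.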

This summary was generated with ❤️ by rigging

linear bot commented Jun 14, 2025

ENG-2239 Add failed_flags.csv to AIRTBench-Code repo and update notes

ads said:

FYI, I also want to include failed_flags.csv, so I'm adding it to the AIRTBench-Code repo. I also added a few notes to the paper under Spurious Elaboration, as I think it would be good to promote additional transparency into reasoning patterns and failure modes.

@GangGreenTemperTatum GangGreenTemperTatum merged commit eca3841 into main Jun 14, 2025
5 checks passed
@GangGreenTemperTatum GangGreenTemperTatum deleted the ads/eng-2239-add-failed_flagscsv-to-airtbench-code-repo-and-update-notes branch June 14, 2025 02:29
2 participants