Hi, thanks for releasing this benchmark! I have two questions:
- The paper mentions that the Diamond subset contains 237 IC-SWE instances, but the CSV in the repo at https://github.com/openai/preparedness/blob/main/project/swelancer/all_swelancer_tasks.csv has only 198. Where are the remaining 39 instances?
- How can I get access to the full dataset?