Change the attack interface to work on the entire dataset #88

Draft
evtimovi wants to merge 3 commits into main from full_dataset_attacks

Conversation

@evtimovi
Contributor

For many attack types, the attack itself should have control over the entire dataset and the rollouts. This PR updates the attack interface accordingly and introduces a rollout executor to manage the rollouts. The existing simple attacks are wrapped so that they work under the new interface.
Note: Right now, I am putting this up as a draft for architectural discussions and I have not yet checked all implementation details or tests, so the pull request may evolve.
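
To make the architectural discussion concrete, here is a minimal sketch of the shape described above. Every name in it (`Rollout`, `RolloutExecutor`, `DatasetAttack`, `PerSampleAttackWrapper`, `demo`) is hypothetical and chosen for illustration only; it is not the code in this PR, and the real interface may differ.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Protocol, Sequence


@dataclass
class Rollout:
    """One agent run on one (possibly attacked) prompt. Hypothetical type."""
    task_id: str
    prompt: str
    output: str


class RolloutExecutor:
    """Hypothetical helper: runs the agent and records every rollout."""

    def __init__(self, agent: Callable[[str], str]) -> None:
        self._agent = agent
        self.history: list[Rollout] = []

    def run(self, task_id: str, prompt: str) -> Rollout:
        rollout = Rollout(task_id, prompt, self._agent(prompt))
        self.history.append(rollout)
        return rollout


class DatasetAttack(Protocol):
    """Assumed dataset-level interface: the attack sees all tasks at once."""

    def run(
        self, dataset: Sequence[tuple[str, str]], executor: RolloutExecutor
    ) -> list[Rollout]: ...


class PerSampleAttackWrapper:
    """Adapts a simple prompt -> prompt attack to the dataset-level interface."""

    def __init__(self, attack_fn: Callable[[str], str]) -> None:
        self._attack_fn = attack_fn

    def run(
        self, dataset: Sequence[tuple[str, str]], executor: RolloutExecutor
    ) -> list[Rollout]:
        # A simple attack needs no cross-task state: one rollout per task.
        return [
            executor.run(task_id, self._attack_fn(prompt))
            for task_id, prompt in dataset
        ]


def demo() -> list[Rollout]:
    # Toy agent and toy attack, used only to exercise the sketch.
    attack = PerSampleAttackWrapper(lambda p: p + " [injected]")
    dataset = [("t1", "hello"), ("t2", "world")]
    return attack.run(dataset, RolloutExecutor(lambda p: p.upper()))
```

In this sketch, an attack that needs the whole dataset (for example, one that optimizes across tasks or requests extra rollouts) would implement `run` directly, while existing per-sample attacks go through the wrapper unchanged.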

@meta-cla meta-cla bot added the CLA Signed label Jan 20, 2026
@evtimovi evtimovi marked this pull request as draft January 20, 2026 22:10
@evtimovi evtimovi requested review from arman-z and dedeswim January 20, 2026 22:10
@meta-codesync

meta-codesync bot commented Feb 4, 2026

@evtimovi has imported this pull request. If you are a Meta employee, you can view this in D92270921.

facebook-github-bot pushed a commit that referenced this pull request Feb 6, 2026
Summary:
For many attack types, the attack itself should have control over the entire dataset and rollouts. So here we update the interface accordingly and introduce a rollout executor to manage that. We also wrap the simple attacks under this new interface.
**Note**: Right now, I am putting this up as a draft for architectural discussions and I have not yet checked all implementation details or tests, so the pull request may evolve.


Differential Revision: D92270921

Pulled By: evtimovi
@meta-codesync

meta-codesync bot commented Feb 6, 2026

@evtimovi has exported this pull request. If you are a Meta employee, you can view the originating Diff in D92270921.

facebook-github-bot pushed a commit that referenced this pull request Feb 6, 2026
facebook-github-bot pushed a commit that referenced this pull request Feb 6, 2026
Ivan Evtimov added 2 commits February 6, 2026 10:43
#113)

Summary:
When computing pass@k metrics with k > 1, tasks that have fewer than k samples are now gracefully skipped with a warning log message instead of raising a ValueError that would terminate the entire results processing.

Changes:
- Replace ValueError with warning log when n_samples < k for a task
- Add logging module import and logger instance
- Collect skipped groups and log them with full context (dataset, agent, attack, task_id, and sample count)
- Add check for empty DataFrame after filtering in aggregate_results
- Update docstrings to reflect new behavior (Note instead of Raises)
- Also includes refactoring: remove job_name from group_cols to allow aggregating across multiple runs of the same experiment
- Add generic variant_name support alongside legacy template_short_name


Differential Revision: D92393526

Pulled By: evtimovi
Summary:
For many attack types, the attack itself should have control over the entire dataset and rollouts. So here we update the interface accordingly and introduce a rollout executor to manage that. We also wrap the simple attacks under this new interface.
**Note**: Right now, I am putting this up as a draft for architectural discussions and I have not yet checked all implementation details or tests, so the pull request may evolve.


Differential Revision: D92270921

Pulled By: evtimovi
Labels

CLA Signed, fb-exported, meta-exported
