Add OnePixelShortcutAttack poisoning attack and its unit tests #2720
Conversation
Force-pushed from fbed197 to 17257dc (compare)
Codecov Report ❌ Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##          dev_1.21.0    #2720   +/-  ##
==============================================
+ Coverage      83.30%   85.22%   +1.92%
==============================================
  Files            330      331       +1
  Lines          29539    29894     +355
  Branches        5007     5023      +16
==============================================
+ Hits           24607    25477     +870
+ Misses          3516     2981     -535
- Partials        1416     1436      +20
Pull Request Overview
This PR adds a new data poisoning attack implementation called OnePixelShortcutAttack to the Adversarial Robustness Toolbox (ART). The attack perturbs a single pixel in each training image at a consistent location per class to create "unlearnable" examples that degrade model accuracy on clean data.
- Implementation of the One Pixel Shortcut (OPS) attack as a new poisoning attack class
- Comprehensive unit test suite validating attack behavior and integration with ART estimators
- Updates to package dependencies and CI configurations
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| art/attacks/poisoning/one_pixel_shortcut_attack.py | Core implementation of the OnePixelShortcutAttack class with pixel perturbation logic |
| tests/attacks/poison/test_one_pixel_shortcut_attack.py | Comprehensive unit tests covering various scenarios and edge cases |
| art/attacks/poisoning/__init__.py | Adds import for the new attack class |
| requirements_test.txt | Updates dependency versions for testing infrastructure |
| .github/workflows/dockerhub.yml | Updates Docker action versions |
| .github/workflows/ci-huggingface.yml | Adds safetensors dependency and updates filtering logic |
Update: Kindly re-run CI and re-review; I'm happy to adjust further if needed. Thank you @beat-buesser
Hi @beat-buesser, may I know why there is still one pending check for PyTorch 2.6.0 (Python 3.10) ("Expected — Waiting for status to be reported")? From the last run, the only failure was the CI Style Checks, which now pass. Codecov also passed this time with a higher patch coverage percentage. Please let me know if there is anything I need to adjust further. Thank you!
It got replaced by PyTorch 2.8.0; I have now updated the settings for this target branch.
I'll add a review in the coming days. |
Hi @nicholasadriel Thank you for your pull request. I have added a few review comments, could you please take a look?
Are you able to reproduce the results of the paper by Wu et al. with this code?
Force-pushed from 050689d to 4461f32 (compare)
Hi @beat-buesser, I have reviewed the comments, performed a quick test, and everything's fine. Regarding reproducing the results of the paper by Wu et al.: I wasn't able to reproduce the exact numbers, but the core OPS algorithm here mirrors the reference implementation (a single per-class pixel chosen by a stability/deviation criterion; the same coordinate and color applied to all images of that class; labels unchanged; model-free). Reproduction snapshot (CIFAR-10, ResNet-18, 200 epochs): while the absolute accuracies differ, the effect size of the attack is very close, which supports that the implementation captures the intended behavior. The gap in absolute numbers is likely due to training-pipeline differences (data preprocessing/augmentation, normalization, optimizer/schedule, weight decay, and seeds).
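For context on the selection criterion described above, here is a simplified, standalone NumPy sketch of a per-class pixel search. The exact scoring and candidate colors in the PR's implementation may differ, so treat the criterion below (mean deviation from a candidate color minus in-class variance) as an illustrative assumption.

```python
import numpy as np

def select_class_pixel(images: np.ndarray) -> tuple[int, int, np.ndarray]:
    """Pick one (row, col) position and a target color for a single class.

    images: array of shape (N, H, W, C) with values in [0, 1].
    Illustrative criterion: prefer positions where a candidate color lies far
    from the original pixel values on average (large deviation) while the
    original values are stable across the class (small variance).
    """
    n, h, w, c = images.shape
    candidate_colors = [np.zeros(c), np.ones(c)]  # assumption: only "corner" colors searched
    best_score, best = -np.inf, None
    for i in range(h):
        for j in range(w):
            pixels = images[:, i, j, :]                    # (N, C) original values at this position
            for color in candidate_colors:
                deviation = np.abs(color - pixels).mean()  # distance of the target color from the class
                variance = pixels.var(axis=0).mean()       # stability of this position within the class
                score = deviation - variance
                if score > best_score:
                    best_score, best = score, (i, j, color)
    return best

def apply_one_pixel_shortcut(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Write the chosen per-class pixel into every image of that class; labels stay unchanged."""
    x_poisoned = x.copy()
    for label in np.unique(y):                 # y assumed to be a 1-D array of integer labels
        idx = np.where(y == label)[0]
        i, j, color = select_class_pixel(x[idx])
        x_poisoned[idx, i, j, :] = color       # same coordinate and color for the whole class
    return x_poisoned
```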
@nicholasadriel I forgot, we still need |
Force-pushed from 4461f32 to 00ecd86 (compare)
Force-pushed from 00ecd86 to 718e818 (compare)
Hi @beat-buesser, thank you for the review. I have added the import statement and corrected a few typing issues based on the last review results. Please kindly re-run CI and re-review; I'm happy to adjust further if needed. Thank you.
Hi @beat-buesser, sorry, can you please re-run CI and re-review again? I had misplaced the `from __future__ import annotations` import, which caused some tests to fail, but I have corrected it just now. Thanks!
Hi @beat-buesser, since the last checks have already passed, what will be the next step for this pull request? Thanks!
Hi @nicholasadriel I think from a review point of view it is now ready for merging. |
Hi @nicholasadriel Thank you very much for your pull request and contributing to ART!
Description
This PR adds a new data poisoning attack class, OnePixelShortcutAttack, to the Adversarial Robustness Toolbox. The class is implemented under the art.attacks.poisoning module, and it introduces support for the One Pixel Shortcut (OPS) attack in ART. A corresponding unit test suite (test_one_pixel_shortcut_attack.py) is also included to validate the correct behavior of the attack implementation.
Motivation
The One Pixel Shortcut attack is a recently proposed poisoning technique that perturbs a single pixel in each training image (in a consistent location per class) to create "unlearnable" examples. This can dramatically degrade a model’s accuracy on clean data without altering the labels. By integrating OPS into IBM ART, we enable standardized evaluation of this attack using ART’s framework and estimators. The implementation has been tested on and extends support to popular image classification datasets such as CIFAR-10, CIFAR-100, UTKFace, and CelebA, ensuring the attack’s broad applicability. Incorporating OPS aligns with ART’s benchmarking and reproducibility goals, expanding the library’s coverage of state-of-the-art poisoning attacks.
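As an illustration of the intended workflow, a minimal usage sketch follows. It assumes the class is importable from art.attacks.poisoning (as this PR sets up), that the constructor requires no arguments, and that the attack exposes the poison(x, y) method common to ART poisoning attacks, returning poisoned data and unchanged labels; actual parameter names and defaults may differ.

```python
import numpy as np
from art.attacks.poisoning import OnePixelShortcutAttack

# Toy stand-in for a training set: 100 RGB images in [0, 1] with 10 classes.
x_train = np.random.rand(100, 32, 32, 3).astype(np.float32)
y_train = np.random.randint(0, 10, size=100)

# Assumed interface: no required constructor arguments and a poison(x, y) method,
# matching the common pattern of ART poisoning attacks.
attack = OnePixelShortcutAttack()
x_poisoned, y_poisoned = attack.poison(x_train, y_train)

# OPS leaves labels untouched and changes only one pixel per image.
assert x_poisoned.shape == x_train.shape
assert np.array_equal(y_poisoned, y_train)
```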
Fixes
No open issue is associated with this PR (new feature contribution).
Type of change
New feature (non-breaking change which adds functionality)
Testing
Unit tests have been added in test_one_pixel_shortcut_attack.py to verify the implementation's correctness.
All tests pass, confirming that the attack behaves as expected and integrates correctly with ART’s data and estimator APIs.
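For illustration, a minimal test of the kind this suite might contain is sketched below; it relies on the same assumed poison(x, y) interface mentioned earlier and is not a copy of the tests added in this PR.

```python
import numpy as np
from art.attacks.poisoning import OnePixelShortcutAttack

def test_one_pixel_shortcut_changes_at_most_one_pixel():
    rng = np.random.default_rng(0)
    x = rng.random((20, 8, 8, 3)).astype(np.float32)
    y = rng.integers(0, 2, size=20)

    # Assumed interface: poison(x, y) returns poisoned images and labels.
    x_p, y_p = OnePixelShortcutAttack().poison(x, y)

    # Shape and labels are preserved; at most a single pixel per image changes.
    assert x_p.shape == x.shape
    assert np.array_equal(y_p, y)
    changed = np.any(x_p != x, axis=-1)  # (N, H, W) mask of changed pixel locations
    assert changed.reshape(len(x), -1).sum(axis=1).max() <= 1
```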
Test Configuration
No additional configuration or dependencies are required for this feature. The OnePixelShortcutAttack can be used out-of-the-box with ART’s existing classifiers and datasets, similar to other poisoning attacks in the library.
Checklist
Reference
Shutong Wu, Sizhe Chen, Cihang Xie, and Xiaolin Huang. One-Pixel Shortcut: On the Learning Preference of Deep Neural Networks. In Proceedings of ICLR 2023.