@nicholasadriel nicholasadriel commented Aug 9, 2025

Description

This PR adds a new data poisoning attack class, OnePixelShortcutAttack, to the Adversarial Robustness Toolbox. The class is implemented under the art.attacks.poisoning module, and it introduces support for the One Pixel Shortcut (OPS) attack in ART. A corresponding unit test suite (test_one_pixel_shortcut_attack.py) is also included to validate the correct behavior of the attack implementation.

Motivation

The One Pixel Shortcut attack is a recently proposed poisoning technique that perturbs a single pixel in each training image (in a consistent location per class) to create "unlearnable" examples. This can dramatically degrade a model’s accuracy on clean data without altering the labels. By integrating OPS into IBM ART, we enable standardized evaluation of this attack using ART’s framework and estimators. The implementation has been tested on and extends support to popular image classification datasets such as CIFAR-10, CIFAR-100, UTKFace, and CelebA, ensuring the attack’s broad applicability. Incorporating OPS aligns with ART’s benchmarking and reproducibility goals, expanding the library’s coverage of state-of-the-art poisoning attacks.
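As a rough illustration of the idea described above (not the code added in this PR), the following sketch writes one fixed pixel per class into a copy of a toy dataset. The helper name `apply_class_pixels` and the `coords`/`colors` lookups are hypothetical, and the images are assumed to be NHWC floats in [0, 1].

```python
import numpy as np


def apply_class_pixels(x, y, coords, colors):
    """Write one fixed pixel per class into a copy of x.

    x: float images in NHWC layout, y: integer labels of shape (N,).
    coords and colors are hypothetical per-class lookups for this sketch.
    """
    x_poisoned = x.copy()
    for cls, (row, col) in coords.items():
        # Same position and same color for every image of this class.
        x_poisoned[y == cls, row, col, :] = colors[cls]
    return x_poisoned


rng = np.random.default_rng(0)
x = rng.random((6, 4, 4, 3))            # six toy 4x4 RGB images
y = np.array([0, 0, 1, 1, 2, 2])        # labels are never modified
coords = {0: (0, 0), 1: (1, 2), 2: (3, 3)}
colors = {
    0: np.array([1.0, 0.0, 0.0]),
    1: np.array([0.0, 1.0, 0.0]),
    2: np.array([0.0, 0.0, 1.0]),
}

x_poisoned = apply_class_pixels(x, y, coords, colors)
# Count how many pixel positions differ from the clean data, per image.
changed_per_image = np.any(x_poisoned != x, axis=-1).reshape(len(x), -1).sum(axis=1)
```

Each image differs from its clean version at a single position, and that position depends only on the image's class, which is the shortcut a model can latch onto.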

Fixes

No open issue is associated with this PR (new feature contribution).

Type of change

New feature (non-breaking change which adds functionality)

Testing

Unit tests have been added in test_one_pixel_shortcut_attack.py to verify the implementation’s correctness:

  • Output shape: Ensures the poisoned data produced by OnePixelShortcutAttack has the same shape as the original input data (no unintended dimensionality changes).
  • Label preservation: Confirms that the attack does not alter the class labels of the dataset (the labels remain unchanged after poisoning).
  • Per-class pixel perturbation: Verifies that exactly one pixel per class is consistently perturbed across all images of that class, validating the intended one-pixel shortcut behavior.

All tests pass, confirming that the attack behaves as expected and integrates correctly with ART’s data and estimator APIs.
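The three properties listed above can be expressed as plain NumPy checks. The sketch below is only an illustration of what such assertions might look like; `toy_poison` is a hypothetical stand-in for the attack, not the class added in this PR, and the actual test file may be structured differently.

```python
import numpy as np


def toy_poison(x, y):
    """Hypothetical stand-in for the attack: class k gets pixel (k, k) set to 1.0."""
    x_p = x.copy()
    for k in np.unique(y):
        x_p[y == k, k, k, :] = 1.0
    return x_p, y.copy()


def changed_position_sets(x, x_p):
    """For each image, return the set of (row, col) positions that differ."""
    diff = np.any(x != x_p, axis=-1)  # shape (N, H, W)
    return [frozenset(map(tuple, np.argwhere(d).tolist())) for d in diff]


rng = np.random.default_rng(1)
x = rng.random((8, 5, 5, 3)) * 0.5      # values stay below 1.0, so every write is visible
y = np.array([0, 1, 2, 0, 1, 2, 0, 1])
x_p, y_p = toy_poison(x, y)

# 1. Output shape is unchanged.
same_shape = x_p.shape == x.shape
# 2. Labels are preserved.
labels_kept = np.array_equal(y_p, y)
# 3. Within each class, every image changes at exactly one shared position.
positions = changed_position_sets(x, x_p)
one_pixel_per_class = True
for k in np.unique(y):
    class_positions = {positions[i] for i in np.flatnonzero(y == k)}
    if len(class_positions) != 1 or len(next(iter(class_positions))) != 1:
        one_pixel_per_class = False
```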

Test Configuration

No additional configuration or dependencies are required for this feature. The OnePixelShortcutAttack can be used out-of-the-box with ART’s existing classifiers and datasets, similar to other poisoning attacks in the library.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • My changes have been tested using both CPU and GPU devices

Reference

Shutong Wu, Sizhe Chen, Cihang Xie, and Xiaolin Huang. One-Pixel Shortcut: On the Learning Preference of Deep Neural Networks. In Proc. of ICLR, 2023.

@nicholasadriel nicholasadriel force-pushed the one-pixel-shortcut-attack branch from fbed197 to 17257dc Compare August 9, 2025 04:08
@beat-buesser beat-buesser changed the base branch from main to dev_1.21.0 August 11, 2025 12:22
@beat-buesser beat-buesser changed the base branch from dev_1.21.0 to main August 11, 2025 12:24
@beat-buesser beat-buesser changed the base branch from main to dev_1.21.0 August 11, 2025 12:25

codecov bot commented Aug 11, 2025

Codecov Report

❌ Patch coverage is 98.64865% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 85.22%. Comparing base (293bd22) to head (8e7421b).
⚠️ Report is 31 commits behind head on dev_1.21.0.

Files with missing lines | Patch % | Lines
art/attacks/poisoning/one_pixel_shortcut_attack.py | 98.63% | 0 Missing and 1 partial ⚠️
Additional details and impacted files

@@              Coverage Diff               @@
##           dev_1.21.0    #2720      +/-   ##
==============================================
+ Coverage       83.30%   85.22%   +1.92%     
==============================================
  Files             330      331       +1     
  Lines           29539    29894     +355     
  Branches         5007     5023      +16     
==============================================
+ Hits            24607    25477     +870     
+ Misses           3516     2981     -535     
- Partials         1416     1436      +20     
Files with missing lines | Coverage Δ
art/attacks/poisoning/__init__.py | 100.00% <100.00%> (ø)
art/attacks/poisoning/one_pixel_shortcut_attack.py | 98.63% <98.63%> (ø)

... and 276 files with indirect coverage changes


@beat-buesser beat-buesser self-requested a review August 11, 2025 14:48
@Copilot Copilot AI review requested due to automatic review settings August 12, 2025 00:34

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds a new data poisoning attack implementation called OnePixelShortcutAttack to the Adversarial Robustness Toolbox (ART). The attack perturbs a single pixel in each training image at a consistent location per class to create "unlearnable" examples that degrade model accuracy on clean data.

  • Implementation of the One Pixel Shortcut (OPS) attack as a new poisoning attack class
  • Comprehensive unit test suite validating attack behavior and integration with ART estimators
  • Updates to package dependencies and CI configurations

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

File | Description
art/attacks/poisoning/one_pixel_shortcut_attack.py | Core implementation of the OnePixelShortcutAttack class with pixel perturbation logic
tests/attacks/poison/test_one_pixel_shortcut_attack.py | Comprehensive unit tests covering various scenarios and edge cases
art/attacks/poisoning/__init__.py | Adds import for the new attack class
requirements_test.txt | Updates dependency versions for testing infrastructure
.github/workflows/dockerhub.yml | Updates Docker action versions
.github/workflows/ci-huggingface.yml | Adds safetensors dependency and updates filtering logic
@nicholasadriel
Author

Update:

  1. Worked on the Codecov patch coverage issue (previously 80.82192% with 14 lines missing) by extending unit tests to hit the remaining branches (shape routing for NHWC/NCHW, one‑hot label handling, empty‑class skip, and best‑coord guard).
  2. Ran Black and fixed pycodestyle/mypy findings to address the Style Check.
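For readers unfamiliar with the branches named in item 1, here is a minimal sketch of what layout routing, one-hot label handling, and empty-class skipping can look like. The function names and the channel-count heuristic are assumptions made for illustration; the PR's actual logic is not shown in this thread and may differ.

```python
import numpy as np


def to_nhwc(x):
    """Return (batch in NHWC layout, was_channels_first).

    Heuristic assumed for this sketch: an axis-1 size of 1, 3 or 4 with a
    larger last axis is treated as a channels-first (NCHW) batch.
    """
    if x.ndim != 4:
        raise ValueError("expected a 4-D image batch")
    channels_first = x.shape[1] in (1, 3, 4) and x.shape[-1] not in (1, 3, 4)
    if channels_first:
        return np.transpose(x, (0, 2, 3, 1)), True
    return x, False


def to_class_indices(y):
    """Accept integer labels of shape (N,) or one-hot labels of shape (N, K)."""
    y = np.asarray(y)
    return np.argmax(y, axis=1) if y.ndim == 2 else y.astype(int)


def per_class_indices(y_idx, num_classes):
    """Yield (class, sample indices), skipping classes that have no samples."""
    for k in range(num_classes):
        idx = np.flatnonzero(y_idx == k)
        if idx.size == 0:
            continue  # empty-class skip
        yield k, idx


x_nhwc, was_channels_first = to_nhwc(np.zeros((2, 3, 8, 8)))
labels = to_class_indices(np.eye(4)[[0, 2, 2]])   # one-hot rows for classes 0, 2, 2
groups = {k: idx.tolist() for k, idx in per_class_indices(labels, 4)}
```

A unit test that exercises each branch would pass both layouts, both label encodings, and a label set with at least one unused class.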

Kindly re-run CI and re-review; I'm happy to adjust further if needed. Thank you, @beat-buesser.

@beat-buesser beat-buesser self-assigned this Aug 12, 2025
@nicholasadriel
Author

Hi @beat-buesser, may I know why there is still one pending check for PyTorch 2.6.0 (Python 3.10) (Expected - Waiting for status to be reported)?

As of the last run, the only failing check was the CI Style Check, which now passes. The Codecov check also passed this time, with a higher patch coverage percentage.

Please let me know if there is something I need to further adjust, thank you!

@beat-buesser
Collaborator

> Hi @beat-buesser, may I know why there is still one pending check for PyTorch 2.6.0 (Python 3.10) (Expected - Waiting for status to be reported)?

It was replaced by PyTorch 2.8.0; I have now updated the settings for this target branch.

@beat-buesser
Collaborator

I'll add a review in the coming days.

@beat-buesser beat-buesser added this to the ART 1.21.0 milestone Aug 13, 2025
@beat-buesser beat-buesser added the enhancement New feature or request label Aug 13, 2025
@beat-buesser beat-buesser moved this to In Progress in ART 1.21.0 Aug 13, 2025
Collaborator

@beat-buesser beat-buesser left a comment

Hi @nicholasadriel, thank you for your pull request. I have added a few review comments; could you please take a look?

Are you able to reproduce the results of the paper by Wu et al. with this code?

@nicholasadriel
Author

nicholasadriel commented Aug 18, 2025

> Hi @nicholasadriel, thank you for your pull request. I have added a few review comments; could you please take a look?
>
> Are you able to reproduce the results of the paper by Wu et al. with this code?

Hi @beat-buesser I have reviewed the comments, performed a quick test, and everything's fine.

Regarding reproducing the results of the paper by Wu et al.: I wasn't able to reproduce the exact numbers, but the core OPS algorithm here mirrors the reference implementation (a single per-class pixel chosen by a stability/deviation criterion; the same coordinate and color applied to all images of that class; labels unchanged; model-free).
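To make the "stability/deviation criterion" concrete, here is one plausible reading sketched in NumPy. It is an assumption for illustration, not the scoring function used in this PR or a verified copy of the reference implementation: for each candidate color at the corners of the color cube, score every position by the mean distance of the class's pixels to that color divided by the variance of that distance, and keep the best (position, color) pair. The name `search_class_pixel` and the `eps` term are hypothetical.

```python
import itertools

import numpy as np


def search_class_pixel(x_cls, eps=1e-8):
    """Pick one (row, col) and one target color for a single class.

    x_cls: images of one class, NHWC floats in [0, 1].
    Score (assumed for this sketch): mean distance to the target color
    divided by the variance of that distance, so a pixel that is far from
    the target and behaves the same in every image scores highest.
    """
    channels = x_cls.shape[-1]
    best_score, best_coord, best_color = -np.inf, None, None
    for corner in itertools.product((0.0, 1.0), repeat=channels):
        target = np.asarray(corner, dtype=x_cls.dtype)
        dist = np.abs(x_cls - target).sum(axis=-1)           # (N, H, W)
        score = dist.mean(axis=0) / (dist.var(axis=0) + eps)  # (H, W)
        row, col = np.unravel_index(np.argmax(score), score.shape)
        if score[row, col] > best_score:
            best_score = float(score[row, col])
            best_coord, best_color = (int(row), int(col)), target
    return best_coord, best_color


rng = np.random.default_rng(0)
x_cls = rng.random((16, 6, 6, 3))
x_cls[:, 2, 1, :] = 0.0   # toy class in which pixel (2, 1) is always black
coord, color = search_class_pixel(x_cls)
```

In this toy class the always-black pixel is selected and assigned white, because flipping it changes every image by the same, maximal amount. The search uses only the data, which matches the "model-free" property mentioned above.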

Reproduction snapshot (CIFAR-10, ResNet-18, 200 epochs):

  • Wu et al. (paper): clean accuracy 94.01%, OPS-poisoned accuracy 15.56%, drop 78.45 percentage points
  • This ART implementation: clean accuracy 89.22%, OPS-poisoned accuracy 7.64%, drop 81.58 percentage points

While the absolute accuracies differ, the effect size of the attack is very close, which supports that the implementation captures the intended behavior. The gap in absolute numbers is likely due to training-pipeline differences (data preprocessing/augmentation, normalization, optimizer/schedule, weight decay, and seeds).
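The drops quoted in the snapshot are absolute differences in percentage points. The short calculation below reproduces them from the four accuracies reported above and also derives the drop relative to each clean baseline, a figure that is not stated in the thread and is shown only as an additional way to compare the two runs.

```python
# Accuracies (in %) as reported in the reproduction snapshot above.
results = {
    "Wu et al. (paper)": (94.01, 15.56),
    "This ART implementation": (89.22, 7.64),
}

summary = {}
for name, (clean, poisoned) in results.items():
    drop_pp = round(clean - poisoned, 2)                        # absolute drop, percentage points
    relative_drop = round(100 * (clean - poisoned) / clean, 1)  # drop as % of clean accuracy
    summary[name] = (drop_pp, relative_drop)
    print(f"{name}: drop {drop_pp} pp, relative drop {relative_drop}%")
```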

@beat-buesser
Collaborator

@nicholasadriel I forgot, we still need from __future__ import annotations for the new typing in some of the test runs.
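For context, `from __future__ import annotations` makes Python store annotations as strings instead of evaluating them, so newer typing syntax such as `int | None` or `list[int]` does not fail on interpreters that cannot evaluate it at runtime. Which CI targets needed it here is not stated in the thread. The sketch below compiles a tiny hypothetical module (the function `pick_pixel` is made up) and inspects its annotations; the import must be the first statement in a module, which is why the source is kept in a string.

```python
# Source of a tiny hypothetical module; the __future__ import has to come
# before any other statement in the module.
MODULE_SOURCE = '''
from __future__ import annotations


def pick_pixel(shape: tuple[int, int], seed: int | None = None) -> list[int]:
    return [0, 0]
'''

namespace = {}
exec(compile(MODULE_SOURCE, "<sketch>", "exec"), namespace)

# With the __future__ import, annotations are kept as plain strings
# and are never evaluated when the function is defined.
annotations = namespace["pick_pixel"].__annotations__
```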

@nicholasadriel nicholasadriel force-pushed the one-pixel-shortcut-attack branch from 4461f32 to 00ecd86 Compare August 20, 2025 10:34
nicholasadriel and others added 19 commits August 20, 2025 11:48
Co-authored-by: Beat Buesser <[email protected]>
Signed-off-by: Nicholas Audric Adriel <[email protected]>
@nicholasadriel nicholasadriel force-pushed the one-pixel-shortcut-attack branch from 00ecd86 to 718e818 Compare August 20, 2025 10:49
@nicholasadriel
Author

> @nicholasadriel I forgot, we still need from __future__ import annotations for the new typing in some of the test runs.

Hi @beat-buesser, thank you for the review. I have added the import statement and corrected a few typing issues based on the last review results. Please re-run CI and re-review; I'm happy to adjust further if needed. Thank you.

Signed-off-by: Nicholas Audric Adriel <[email protected]>
@nicholasadriel
Author

> @nicholasadriel I forgot, we still need from __future__ import annotations for the new typing in some of the test runs.

Hi @beat-buesser, sorry, could you please re-run CI and re-review again? I had misplaced the from __future__ import annotations statement, which caused some tests to fail, but I have corrected it just now. Thanks!

@nicholasadriel
Author

Hi @beat-buesser since the last checks have already passed, what will be the next step regarding this pull request? Thanks!

@beat-buesser
Collaborator

Hi @nicholasadriel I think from a review point of view it is now ready for merging.

Collaborator

@beat-buesser beat-buesser left a comment

Hi @nicholasadriel Thank you very much for your pull request and contributing to ART!

@beat-buesser beat-buesser merged commit b3bad7f into Trusted-AI:dev_1.21.0 Aug 22, 2025
25 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in ART 1.21.0 Aug 22, 2025
Labels
enhancement New feature or request
Projects
Status: Done
2 participants