Fix (brevitas_examples/imagenet/ptq): DataLoader fix #1420

Merged
nickfraser merged 3 commits into Xilinx:dev from xkucerak:example_imagenet_perf
Nov 21, 2025

Conversation

@xkucerak
Reason for this PR

ptq_evaluate.py was running too slowly on the GPU because the dataset was repeatedly loaded into memory. The slowdown was most noticeable with GPTQ, GPFQ... enabled.

Changes Made in this PR

Sped up calibration by adding persistent_workers=True to the DataLoader, which preserves the dataset in memory and avoids unnecessary reloading.
Also added code to free the calibration dataset from memory before validation.
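
As a rough illustration of the DataLoader change described above (a minimal sketch, not the code from this PR): `calib_loader_kwargs` and `make_calib_loader` are hypothetical helper names, and the batch size and worker count are assumed values. The only facts taken from PyTorch are that `torch.utils.data.DataLoader` accepts `persistent_workers`, and that it raises a `ValueError` if `persistent_workers=True` is combined with `num_workers=0`.

```python
from typing import Any, Dict


def calib_loader_kwargs(batch_size: int, num_workers: int) -> Dict[str, Any]:
    """Build DataLoader keyword arguments (hypothetical helper, not from the PR)."""
    kwargs: Dict[str, Any] = {
        "batch_size": batch_size,
        "num_workers": num_workers,
        "shuffle": False,
    }
    # persistent_workers=True is rejected by DataLoader when num_workers == 0,
    # so only enable it when worker processes are actually used.
    if num_workers > 0:
        kwargs["persistent_workers"] = True
    return kwargs


def make_calib_loader(dataset, batch_size: int = 16, num_workers: int = 2):
    """Create a calibration DataLoader whose workers survive across passes.

    Hypothetical helper; torch is imported lazily so that the kwargs helper
    above can be used and tested without PyTorch installed.
    """
    from torch.utils.data import DataLoader

    return DataLoader(dataset, **calib_loader_kwargs(batch_size, num_workers))
```

With persistent workers, each new pass over the loader reuses the existing worker processes instead of starting fresh ones, which is presumably where the savings come from when an algorithm such as GPTQ iterates over the calibration data many times. To release the calibration data before validation, one option (again an assumption, not necessarily what the PR does) is to drop the last reference to the loader, for example `del calib_loader` followed by `gc.collect()`, so that its workers can shut down.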

Testing Summary

Pre:

Starting activation calibration:
100%|█████| 16/16 [00:41<00:00, 2.59s/it]

Performing GPTQ:
100%|█████| 21/21 [11:50<00:00, 33.82s/it]

Applying bias correction:
100%|█████| 16/16 [00:34<00:00, 2.15s/it]

Starting validation:
100%|█████| 196/196 [00:56<00:00, 3.49it/s]
Total:Avg acc@1 63.724

Post:

Starting activation calibration:
100%|█████| 16/16 [00:10<00:00, 1.58it/s]

Performing GPTQ:
100%|█████| 21/21 [00:45<00:00, 2.16s/it]

Applying bias correction:
100%|█████| 16/16 [00:01<00:00, 9.43it/s]

Starting validation:
100%|█████| 196/196 [00:53<00:00, 3.67it/s]
Total:Avg acc@1 63.724

(Tested on Windows, default_template.yaml + GPTQ + GPU)
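
For a quick read of the numbers above, the per-stage speedup can be computed from the elapsed times in the two logs. This is a small sketch; `to_seconds` and `speedup` are helper names introduced here for illustration. The progress bars report whole seconds only, so the ratio for very short stages (bias correction at 00:01) is coarse.

```python
def to_seconds(elapsed: str) -> int:
    """Convert a tqdm-style 'MM:SS' (or 'HH:MM:SS') elapsed string to seconds."""
    total = 0
    for part in elapsed.split(":"):
        total = total * 60 + int(part)
    return total


def speedup(before: str, after: str) -> float:
    """Ratio of elapsed time before the change to elapsed time after it."""
    return to_seconds(before) / to_seconds(after)


# (stage, elapsed before, elapsed after), copied from the logs above.
STAGES = [
    ("activation calibration", "00:41", "00:10"),
    ("GPTQ", "11:50", "00:45"),
    ("bias correction", "00:34", "00:01"),
    ("validation", "00:56", "00:53"),
]

for stage, before, after in STAGES:
    print(
        f"{stage}: {to_seconds(before)}s -> {to_seconds(after)}s "
        f"(~{speedup(before, after):.1f}x)"
    )
```

Validation time is essentially unchanged, which is consistent with the change targeting the calibration loader only.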

Risk Highlight

  • This PR includes code from another work (please detail).
  • This PR contains API-breaking changes.
  • This PR depends on work in another PR (please provide links/details).
  • This PR introduces new dependencies (please detail).
  • There are coverage gaps not covered by tests.
  • Documentation updates required in subsequent PR.

Checklist

  • Code comments added to any hard-to-understand areas, if applicable.
  • Changes generate no new warnings.
  • Updated any relevant tests, if applicable.
  • No conflicts with destination dev branch.
  • I reviewed my own code changes.
  • Initial CI/CD passing.
  • 1+ reviews given, and any review issues addressed and approved.
  • Post-review full CI/CD passing.

@nickfraser
Collaborator

Thanks for the contribution, this looks good. I ran our linting locally to fix the style (pre-commit run -a; please run it yourself for your next PR), but otherwise it looks good.

Occasionally, we've had problems using pin_memory=True on certain systems, but the speed improvements you are seeing are big enough that we should make this the default. Just writing it here in case we see problems with this in future.

Otherwise, I'll merge once the tests finish - thanks!

@nickfraser nickfraser self-requested a review November 21, 2025 14:42
Collaborator

@nickfraser nickfraser left a comment

LGTM (pending tests)

@nickfraser
Collaborator

FYI, failing LLM tests are being addressed in #1415.

@nickfraser nickfraser merged commit e522fbb into Xilinx:dev Nov 21, 2025
24 of 25 checks passed