Conversation

Contributor

@jingyu-ml jingyu-ml commented Sep 9, 2025

What does this PR do?

Type of change: Minor code change

Overview: Fixed the CI/CD for diffusers

Usage

python quantize.py --model flux-schnell --override-model-path hf-internal-testing/tiny-flux-pipe --model-dtype BFloat16 --calib-size 8 --percentile 1.0 --alpha 0.8 --n-steps 20 --batch-size 2 --format int8 --collect-method min-mean --quant-algo smoothquant --trt-high-precision-dtype BFloat16 --quantized-torch-ckpt-save-path ./flux-schnell_int8.pt --onnx-dir ./flux-schnell_int8_onnx

python quantize.py --model flux-schnell --override-model-path hf-internal-testing/tiny-flux-pipe --model-dtype BFloat16 --calib-size 8 --percentile 1.0 --alpha 0.8 --n-steps 20 --batch-size 2 --format int8 --collect-method min-mean --quant-algo smoothquant --trt-high-precision-dtype BFloat16 --restore-from ./flux-schnell_int8.pt --onnx-dir ./flux-schnell_int8_onnx

python diffusion_trt.py --model flux-schnell --override-model-path hf-internal-testing/tiny-flux-pipe --model-dtype BFloat16 --onnx-load-path ./flux-schnell_int8_onnx/model.onnx --dq-only
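
The two quantize.py runs above share every flag except the checkpoint one: the first saves the quantized checkpoint and the second restores it. A minimal Python sketch that assembles both argument lists makes the difference explicit. The helper name `build_quantize_cmd` is hypothetical and not part of the repository; the flags and values are copied from the commands above.

```python
import shlex

# Flags shared by both quantize.py runs, copied from the usage commands.
COMMON_ARGS = [
    "--model", "flux-schnell",
    "--override-model-path", "hf-internal-testing/tiny-flux-pipe",
    "--model-dtype", "BFloat16",
    "--calib-size", "8",
    "--percentile", "1.0",
    "--alpha", "0.8",
    "--n-steps", "20",
    "--batch-size", "2",
    "--format", "int8",
    "--collect-method", "min-mean",
    "--quant-algo", "smoothquant",
    "--trt-high-precision-dtype", "BFloat16",
]


def build_quantize_cmd(checkpoint_flag: str, checkpoint_path: str) -> list[str]:
    """Hypothetical helper: assemble one quantize.py command line."""
    return [
        "python", "quantize.py", *COMMON_ARGS,
        checkpoint_flag, checkpoint_path,
        "--onnx-dir", "./flux-schnell_int8_onnx",
    ]


# First run: quantize and save the checkpoint.
save_cmd = build_quantize_cmd("--quantized-torch-ckpt-save-path", "./flux-schnell_int8.pt")
# Second run: restore the saved checkpoint instead of re-quantizing.
restore_cmd = build_quantize_cmd("--restore-from", "./flux-schnell_int8.pt")

if __name__ == "__main__":
    print(shlex.join(save_cmd))
    print(shlex.join(restore_cmd))
```

Either list could be passed to `subprocess.run(..., check=True)` in a CI job; that wiring is left out here because the actual CI configuration is not shown in this PR.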

Testing

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes/No
  • Did you write any new necessary tests?: Yes/No
  • Did you add or update any necessary documentation?: Yes/No
  • Did you update Changelog?: Yes/No

Additional Information

Summary by CodeRabbit

  • New Features

    • Added ability to override the model path when creating pipelines in the diffusion quantization examples (supports the existing --override-model-path CLI option), allowing users to load custom or local models.
  • Chores

    • Updated example dependencies to include diffusers 0.34.0 for improved compatibility and stability.

@jingyu-ml jingyu-ml self-assigned this Sep 9, 2025
@jingyu-ml jingyu-ml requested a review from a team as a code owner September 9, 2025 21:40
copy-pr-bot bot commented Sep 9, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai bot commented Sep 9, 2025

Warning

Rate limit exceeded

@jingyu-ml has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 16 minutes and 38 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

📥 Commits

Reviewing files that changed from the base of the PR and between 0df43a5 and 43ed09f.

📒 Files selected for processing (3)
  • examples/diffusers/quantization/diffusion_trt.py (1 hunks)
  • examples/diffusers/quantization/quantize.py (2 hunks)
  • examples/diffusers/quantization/requirements.txt (1 hunks)

Walkthrough

Adds an optional override_model_path parameter to PipelineManager.create_pipeline_from and threads it from diffusion_trt.py; model_id resolution prefers the override when present. Also adds diffusers==0.34.0 to the quantization example requirements. No public API removals.
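
As a rough illustration of the walkthrough, the sketch below threads a `--override-model-path` CLI value into a resolver that prefers the override over the registry entry. It is a simplified stand-in, not the code from quantize.py: `resolve_model_id`, `parse_args`, and the registry value are hypothetical, and the real `create_pipeline_from` goes on to instantiate a pipeline.

```python
import argparse
from typing import Optional

# Stand-in registry; the real MODEL_REGISTRY in quantize.py may differ.
MODEL_REGISTRY = {"flux-schnell": "registry/default-flux-schnell"}  # placeholder ID


def resolve_model_id(model_type: str, override_model_path: Optional[str] = None) -> str:
    """Prefer the override when one is given, else fall back to the registry."""
    return MODEL_REGISTRY[model_type] if override_model_path is None else override_model_path


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Hypothetical, trimmed-down CLI with only the two relevant options."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", default="flux-schnell")
    parser.add_argument("--override-model-path", default=None)
    return parser.parse_args(argv)


# With the override, the registry entry is ignored.
args = parse_args(["--override-model-path", "hf-internal-testing/tiny-flux-pipe"])
overridden_id = resolve_model_id(args.model, args.override_model_path)

# Without it, the registry entry is used.
default_id = resolve_model_id(parse_args([]).model)
```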

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Pipeline manager & quantization CLI: examples/diffusers/quantization/quantize.py | Add optional `override_model_path: str…` |
| TRT example CLI: examples/diffusers/quantization/diffusion_trt.py | Thread args.override_model_path into the call to PipelineManager.create_pipeline_from (pass override_model_path=...). |
| Example requirements: examples/diffusers/quantization/requirements.txt | Add dependency diffusers==0.34.0. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant CLI as diffusion_trt.py
  participant PM as PipelineManager.create_pipeline_from
  participant DP as Diffusers Pipeline
  Note over CLI,PM: CLI may receive --override-model-path

  User->>CLI: Run script (with optional --override-model-path)
  CLI->>PM: create_pipeline_from(model_type, dtype, override_model_path)
  alt override_model_path provided
    PM->>PM: model_id = override_model_path
  else no override
    PM->>PM: model_id = MODEL_REGISTRY[model_type]
  end
  PM->>DP: Instantiate pipeline with model_id and dtype
  DP-->>PM: Pipeline instance
  PM-->>CLI: Return pipeline

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Pre-merge checks (1 passed, 2 warnings)

❌ Failed checks (2 warnings)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title Check | ⚠️ Warning | The current title “Fixed the CICD for diffusers” does not reflect the actual changes in this pull request, which primarily add an override_model_path parameter to the example scripts and update dependencies rather than addressing CI/CD workflows. Therefore, it is misleading and fails to summarize the main content of the diff. | Please rename the pull request to succinctly describe the primary change, for example “Add override_model_path support to diffusion_trt and quantize examples”, so that the title clearly matches the modifications made. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

Poem

A rabbit nudges paths at night,
Swaps registry for chosen light.
Pipelines hum where overrides play,
Diffusers pinned to guide the way.
Hops of joy — a nimble byte. 🥕



@jingyu-ml jingyu-ml force-pushed the jingyux/fixed-trtexec-cicd branch from 7b291bf to 5d4befc Compare September 9, 2025 21:45
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
examples/diffusers/quantization/quantize.py (1)

938-943: Bug: always passing class attribute for quantize_mha (ignores CLI flag).

QuantizationConfig.quantize_mha is a class attribute default; this will always be False and won’t reflect the user’s choice. Use the instance value from quant_config.

Apply this diff:

         export_manager.export_onnx(
             pipe,
             backbone,
             model_config.model_type,
             quant_config.format,
-            quantize_mha=QuantizationConfig.quantize_mha,
+            quantize_mha=quant_config.quantize_mha,
         )
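
To see why the class attribute and the instance value can differ, here is a minimal sketch. It assumes `QuantizationConfig` behaves like a dataclass with a `False` default for `quantize_mha`, as the comment above describes; the real class in quantize.py has more fields and may be defined differently.

```python
from dataclasses import dataclass


@dataclass
class QuantizationConfig:
    """Simplified stand-in; only the field relevant to the bug is modelled."""

    quantize_mha: bool = False  # default, also readable on the class itself


# An instance configured by the user, e.g. from a CLI flag.
quant_config = QuantizationConfig(quantize_mha=True)

class_level = QuantizationConfig.quantize_mha  # always the default
instance_level = quant_config.quantize_mha     # the user's actual choice

print(class_level, instance_level)
```

Reading the attribute through the class always yields the default, so the export call needs the instance (`quant_config`) to pick up the user's setting.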
🧹 Nitpick comments (3)
examples/diffusers/quantization/quantize.py (2)

326-328: Guard against empty-string override_model_path.

Treating "" as “no override” avoids surprising failures when an empty value gets wired in.

Apply this diff:

-            model_id = (
-                MODEL_REGISTRY[model_type] if override_model_path is None else override_model_path
-            )
+            model_id = (
+                MODEL_REGISTRY[model_type] if not override_model_path else override_model_path
+            )
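
The difference between the two conditions only shows up for falsy non-`None` values such as `""`. A small sketch, using a placeholder registry and hypothetical function names, compares them:

```python
from typing import Optional

# Placeholder registry; the real MODEL_REGISTRY in quantize.py may differ.
MODEL_REGISTRY = {"flux-schnell": "registry/default-flux-schnell"}


def resolve_strict(model_type: str, override_model_path: Optional[str]) -> str:
    """Original condition: only None means 'no override'."""
    return MODEL_REGISTRY[model_type] if override_model_path is None else override_model_path


def resolve_lenient(model_type: str, override_model_path: Optional[str]) -> str:
    """Suggested condition: None and the empty string both mean 'no override'."""
    return MODEL_REGISTRY[model_type] if not override_model_path else override_model_path


for value in (None, "", "hf-internal-testing/tiny-flux-pipe"):
    print(repr(value), "->", repr(resolve_strict("flux-schnell", value)),
          "|", repr(resolve_lenient("flux-schnell", value)))
```

With the strict check, an empty string is passed on as the model ID; with the lenient check, it falls back to the registry entry.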

342-343: Redundant try/except that just re-raises.

Catching and re-raising loses context in logs and adds noise.

Apply this diff to simplify:

-        try:
+        try:
             model_id = (
                 MODEL_REGISTRY[model_type] if override_model_path is None else override_model_path
             )
             ...
             return pipe
-        except Exception as e:
-            raise e
+        except Exception:
+            raise
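
Note that the diff above still keeps a handler that only re-raises. As a sketch of the underlying point, the two hypothetical lookups below (not code from quantize.py) surface the same exception to the caller, so the wrapper could arguably be dropped altogether:

```python
# Placeholder registry used only for this illustration.
REGISTRY = {"flux-schnell": "registry/default-flux-schnell"}


def lookup_wrapped(key: str) -> str:
    """Catches the exception only to re-raise it unchanged."""
    try:
        return REGISTRY[key]
    except Exception:
        raise


def lookup_plain(key: str) -> str:
    """No handler: the exception propagates on its own."""
    return REGISTRY[key]


def error_of(fn, key: str):
    """Return (exception type, args) raised by fn(key), or None on success."""
    try:
        fn(key)
    except Exception as exc:
        return type(exc), exc.args
    return None


print(error_of(lookup_wrapped, "missing") == error_of(lookup_plain, "missing"))
```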
examples/diffusers/quantization/diffusion_trt.py (1)

108-112: Minor readability: avoid mixing positional and keyword args.

Using keywords for all args improves clarity and future-proofing.

Apply this diff:

-    pipe = PipelineManager.create_pipeline_from(
-        MODEL_ID[args.model],
-        dtype_map[args.model_dtype],
-        override_model_path=args.override_model_path,
-    )
+    pipe = PipelineManager.create_pipeline_from(
+        model_type=MODEL_ID[args.model],
+        torch_dtype=dtype_map[args.model_dtype],
+        override_model_path=args.override_model_path,
+    )
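
One way to enforce the all-keyword style is a keyword-only signature. The sketch below is an assumption for illustration: the parameter names come from the suggested diff, but the real `create_pipeline_from` is not shown to be keyword-only, and this stand-in returns its arguments instead of building a pipeline.

```python
from typing import Optional


def create_pipeline_from(
    *,  # everything after the bare * must be passed by keyword
    model_type: str,
    torch_dtype: str,
    override_model_path: Optional[str] = None,
) -> dict:
    """Stand-in that echoes its arguments rather than creating a pipeline."""
    return {
        "model_type": model_type,
        "torch_dtype": torch_dtype,
        "override_model_path": override_model_path,
    }


pipe_args = create_pipeline_from(
    model_type="flux-schnell",
    torch_dtype="BFloat16",
    override_model_path="hf-internal-testing/tiny-flux-pipe",
)

# Positional use is rejected at call time.
try:
    create_pipeline_from("flux-schnell", "BFloat16")
    positional_rejected = False
except TypeError:
    positional_rejected = True
```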
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d6d2e75 and 7b291bf.

📒 Files selected for processing (3)
  • examples/diffusers/quantization/diffusion_trt.py (1 hunks)
  • examples/diffusers/quantization/quantize.py (2 hunks)
  • examples/diffusers/quantization/requirements.txt (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
examples/diffusers/quantization/diffusion_trt.py (1)
examples/diffusers/quantization/quantize.py (2)
  • PipelineManager (294-433)
  • create_pipeline_from (311-343)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-docs
  • GitHub Check: code-quality
🔇 Additional comments (3)
examples/diffusers/quantization/requirements.txt (1)

2-2: Confirm CI installs transformers & accelerate alongside diffusers
diffusers 0.34.0 includes FluxPipeline and StableDiffusion3Pipeline and does not strictly pin transformers/accelerate versions. Ensure your CI setup installs or upgrades both packages per the official installation guide to avoid runtime errors.
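
A quick way to confirm what a CI environment actually installed is to query package metadata. This is a minimal sketch using the standard-library `importlib.metadata`; the package list mirrors the comment above, and `installed_versions` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional

PACKAGES = ("diffusers", "transformers", "accelerate")


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name}: {ver or 'not installed'}")
```

A CI step could compare the reported diffusers version against the `0.34.0` pin and fail early on a mismatch.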

examples/diffusers/quantization/quantize.py (1)

312-315: New override_model_path parameter — LGTM.

Signature and typing are appropriate; matches downstream usage.

examples/diffusers/quantization/diffusion_trt.py (1)

108-112: Plumbing override_model_path through — LGTM.

The call matches the updated signature; types align.

codecov bot commented Sep 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.87%. Comparing base (4716131) to head (43ed09f).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #312   +/-   ##
=======================================
  Coverage   73.87%   73.87%           
=======================================
  Files         172      172           
  Lines       17439    17439           
=======================================
  Hits        12883    12883           
  Misses       4556     4556           


@jingyu-ml jingyu-ml enabled auto-merge (squash) September 10, 2025 00:25
@@ -1,4 +1,5 @@
cuda-python
diffusers==0.34.0
Collaborator

why do we need this specific version only?

Contributor Author

diffusers changed its layer definition. Until we find a solution, let’s downgrade to the previous version.
#262

@kevalmorabia97
Collaborator

@jingyu-ml commits are not signed, hence merging will be blocked

@jingyu-ml jingyu-ml force-pushed the jingyux/fixed-trtexec-cicd branch 3 times, most recently from 31c84a8 to 4bc7ed7 Compare September 10, 2025 20:54
@jingyu-ml jingyu-ml requested review from a team as code owners September 10, 2025 20:54
@jingyu-ml jingyu-ml marked this pull request as draft September 10, 2025 20:54
copy-pr-bot bot commented Sep 10, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@jingyu-ml jingyu-ml force-pushed the jingyux/fixed-trtexec-cicd branch from 4bc7ed7 to b353787 Compare September 10, 2025 20:57
@jingyu-ml jingyu-ml removed request for a team, gcunhase, i-riyad and ynankani September 10, 2025 20:57
@jingyu-ml jingyu-ml marked this pull request as ready for review September 10, 2025 20:58
jingyu-ml and others added 3 commits September 10, 2025 21:03
@jingyu-ml jingyu-ml force-pushed the jingyux/fixed-trtexec-cicd branch from 2b03e6c to 43ed09f Compare September 10, 2025 21:03
@jingyu-ml jingyu-ml merged commit 358b0c6 into main Sep 10, 2025
22 checks passed
@jingyu-ml jingyu-ml deleted the jingyux/fixed-trtexec-cicd branch September 10, 2025 22:03
benchislett pushed a commit that referenced this pull request Sep 15, 2025
Signed-off-by: jingyu <[email protected]>
Signed-off-by: Jingyu Xin <[email protected]>
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants