
Conversation

Edwardf0t1
Contributor

@Edwardf0t1 Edwardf0t1 commented Sep 4, 2025

What does this PR do?

Type of change: Minor code change

Overview: Make VLM detection more robust in the PTQ workflow.

Usage

python hf_ptq.py --pyt_ckpt_path Qwen/Qwen2.5-VL-7B-Instruct --qformat nvfp4 --export_fmt hf --export_path <exported_Qwen2.5-VL-7B-Instruct-NVFP4>

Testing

Additional Information

Summary by CodeRabbit

  • New Features

    • Automatic multimodal (vision-language) model detection during export; when detected, the original model config is saved and processor/preprocessor config save is attempted with clear success or warning logs.
  • Refactor

    • More robust multimodal detection logic and clearer export flow; non-multimodal export/quantization behavior remains unchanged.
  • Chores

    • Detection utility added to the public API for reuse.

@Edwardf0t1 Edwardf0t1 requested a review from a team as a code owner September 4, 2025 18:41
@Edwardf0t1 Edwardf0t1 requested a review from realAsma September 4, 2025 18:41

copy-pr-bot bot commented Sep 4, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


coderabbitai bot commented Sep 4, 2025

Walkthrough

Adds a multimodal-model detection utility and updates the PTQ example to use it: when a model is detected as VLM, the script saves the original model config and attempts to save an AutoProcessor config (guarded); non-VLM export/quantization flow unchanged.

Changes

Cohort / File(s): VLM-aware export in PTQ example (examples/llm_ptq/hf_ptq.py)
- Added imports: AutoConfig and AutoProcessor from transformers, and is_multimodal_model from modelopt.torch.export.model_utils.
- Replaced the previous single-condition VLM check with is_vlm = is_multimodal_model(full_model).
- When is_vlm is true, saves the original model config via AutoConfig.from_pretrained(...).save_pretrained(export_path) and prints a dedicated message.
- Attempts to save the processor via AutoProcessor.from_pretrained(...).save_pretrained(export_path) inside a try/except; on exception, logs a warning and continues.
- Non-VLM export/quantization flow remains unchanged.

Cohort / File(s): Multimodal detection utility (modelopt/torch/export/model_utils.py)
- Added a new public function is_multimodal_model(model) implementing heuristics to detect multimodal/VLM models: it checks config.vision_config, the model.language_model attribute, config.model_type == "phi4mm", config.vision_lora, config.audio_processor, and config.embd_layer.image_embd_layer (see the sketch after this list).
- Updated the public API: __all__ changed from ["get_model_type"] to ["get_model_type", "is_multimodal_model"].
- Existing get_model_type left unchanged.
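
For orientation, here is a minimal Python sketch of that detection heuristic, based only on the summary above; the merged implementation in modelopt/torch/export/model_utils.py may differ in details such as the signature and guards.

def is_multimodal_model(model) -> bool:
    """Heuristically detect multimodal/VLM models from model and config attributes."""
    config = getattr(model, "config", None)
    if config is None:  # defensive guard, not part of the summary above
        return False
    return (
        hasattr(config, "vision_config")  # standard vision config (e.g., Qwen2.5-VL)
        or hasattr(model, "language_model")  # language_model attribute (e.g., LLaVA-style wrappers)
        or getattr(config, "model_type", "") == "phi4mm"  # Phi-4 multimodal
        or hasattr(config, "vision_lora")  # vision LoRA configurations
        or hasattr(config, "audio_processor")  # audio processing capabilities
        or (
            hasattr(config, "embd_layer") and hasattr(config.embd_layer, "image_embd_layer")
        )  # image embedding layers
    )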

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor U as User
  participant S as hf_ptq.py
  participant M as model_utils.is_multimodal_model
  participant T as Transformers (AutoConfig/AutoProcessor)
  participant FS as Filesystem

  U->>S: Run PTQ export script
  S->>S: Load model (full_model) and config
  S->>M: is_multimodal_model(full_model)
  M-->>S: True / False
  alt is_vlm == true
    S->>T: AutoConfig.from_pretrained(model_id)
    T-->>S: AutoConfig
    S->>FS: Save model config (save_pretrained)
    S->>T: AutoProcessor.from_pretrained(model_id) (try)
    alt processor load success
      T-->>S: AutoProcessor
      S->>FS: Save processor config (save_pretrained)
    else processor load fails
      T-->>S: Exception
      S->>S: Log warning and continue
    end
  else is_vlm == false
    Note over S: Proceed with existing non-VLM export/quantization flow
  end
  S-->>U: Complete export
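
For readers who prefer code to diagrams, the VLM branch above corresponds roughly to the following sketch. It assumes the script's existing full_model, args, and export_path variables and mirrors the flow described in the walkthrough rather than reproducing the merged hf_ptq.py lines verbatim (those appear in the review diffs below).

from transformers import AutoConfig, AutoProcessor
from modelopt.torch.export.model_utils import is_multimodal_model

is_vlm = is_multimodal_model(full_model)
if is_vlm:
    # Save the original model config next to the quantized export.
    print(f"Saving original model config to {export_path}")
    AutoConfig.from_pretrained(
        args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
    ).save_pretrained(export_path)

    # Best-effort processor save: some VLM architectures do not provide an AutoProcessor.
    try:
        print(f"Saving processor config to {export_path}")
        AutoProcessor.from_pretrained(
            args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
        ).save_pretrained(export_path)
    except Exception as e:
        print(f"Warning: Could not save processor config: {e}")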

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I twitch my nose at configs found,
I sniff for sights and sounds around.
I save the map, I try the pack,
If processors hide, I hop right back.
A carrot cheer for careful tracks 🥕🐇


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
examples/llm_ptq/hf_ptq.py (2)

572-581: VLM detection is better; broaden heuristics slightly and avoid broad try/except.

Some VLMs expose vision_tower rather than vision_config. You can avoid exceptions via safe getattr and include a couple more cheap signals (without making it brittle).

-# Check for VLMs by looking for vision_config in model config or language_model attribute
-is_vlm = False
-try:
-    is_vlm = hasattr(full_model.config, "vision_config") or hasattr(
-        full_model, "language_model"
-    )
-except Exception:
-    # Fallback to the original check if config access fails
-    is_vlm = hasattr(full_model, "language_model")
+# Check for VLMs via a few robust signals
+cfg = getattr(full_model, "config", None)
+is_vlm = any(
+    [
+        hasattr(full_model, "language_model"),
+        hasattr(full_model, "vision_tower"),
+        hasattr(cfg, "vision_config") if cfg is not None else False,
+        hasattr(cfg, "vision_tower") if cfg is not None else False,
+        ("vl" in (model_type or "")) or (model_type in {"mllama", "phi4mm"}),
+    ]
+)

582-595: Saving configs for VLMs: confirm intent for TensorRT-LLM exports; optionally gate by export_fmt and reuse existing processor.

Mixing HF artifacts into a TensorRT-LLM export directory can be fine, but double-check it won’t confuse downstream tooling. Also, if a processor is already loaded (e.g., mllama/whisper), reuse it to avoid reloading.

  • Confirm that placing config.json/preprocessor_config.json in a TRT-LLM export dir doesn’t interfere with trtllm tooling.
  • If undesired, gate the save under args.export_fmt == "hf".
-if is_vlm:
+if is_vlm:
     # Save original model config and the preprocessor config to the export path for VLMs.
-
-    print(f"Saving original model and processor configs to {export_path}")
-
-    AutoConfig.from_pretrained(
-        args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
-    ).save_pretrained(export_path)
-
-    AutoProcessor.from_pretrained(
-        args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
-    ).save_pretrained(export_path)
+    if args.export_fmt == "hf":
+        print(f"Saving original model and processor configs to {export_path}")
+        # Lazy import to avoid hard dependency at module import time
+        from transformers import AutoConfig, AutoProcessor
+        AutoConfig.from_pretrained(
+            args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
+        ).save_pretrained(export_path)
+        # Prefer the already-loaded processor if available
+        proc = processor
+        if proc is None:
+            proc = AutoProcessor.from_pretrained(
+                args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
+            )
+        proc.save_pretrained(export_path)

Note that saving both an AutoProcessor and a tokenizer later may duplicate tokenizer artifacts; it’s harmless but a tiny bit noisy. If desired, skip tokenizer.save_pretrained when is_vlm and a processor was saved.

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between d0372f4 and 0b2ec22.

📒 Files selected for processing (1)
  • examples/llm_ptq/hf_ptq.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: build-docs
  • GitHub Check: code-quality
🔇 Additional comments (1)
examples/llm_ptq/hf_ptq.py (1)

27-34: AutoProcessor import safe under pinned transformers>=4.48
setup.py enforces "transformers>=4.48,<5.0" and AutoProcessor was introduced in v4.13.0, so the top-level import won't fail; a lazy import is not required. Ensure example docs/install scripts install transformers>=4.48.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
examples/llm_ptq/hf_ptq.py (1)

594-602: Narrow the exception scope and prefer saving only the image processor when present.

Aligns with unified export behavior and answers the earlier “why try/except” concern: some models don’t provide AutoProcessor. Catch only expected exceptions.

-            try:
-                print(f"Saving processor config to {export_path}")
-                AutoProcessor.from_pretrained(
-                    args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
-                ).save_pretrained(export_path)
-            except Exception as e:
-                print(f"Warning: Could not save processor config: {e}")
-                print("This is normal for some VLM architectures that don't use AutoProcessor")
+            try:
+                processor = AutoProcessor.from_pretrained(
+                    args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
+                )
+                if hasattr(processor, "image_processor"):
+                    print(f"Saving image processor config to {export_path}")
+                    processor.image_processor.save_pretrained(export_path)
+                else:
+                    print(f"Saving processor config to {export_path}")
+                    processor.save_pretrained(export_path)
+            except (OSError, ValueError, ImportError) as e:
+                warnings.warn(
+                    f"Could not save processor config: {e}. Some multimodal architectures don't provide AutoProcessor."
+                )
🧹 Nitpick comments (2)
examples/llm_ptq/hf_ptq.py (2)

572-583: Tighten the VLM detection heuristic to avoid false positives and cover common patterns.

Suggestion:

  • Add a check for hasattr(full_model, "vision_model") (common in many VLM wrappers).
  • Drop hasattr(config, "audio_processor") — processors aren’t typically stored on config and may yield false positives.

Apply:

-        is_vlm = (
-            hasattr(config, "vision_config")  # Standard vision config (e.g., Qwen2.5-VL)
-            or hasattr(full_model, "language_model")  # Language model attribute (e.g., LLaVA)
-            or getattr(config, "model_type", "") == "phi4mm"  # Phi-4 multimodal
-            or hasattr(config, "vision_lora")  # Vision LoRA configurations
-            or hasattr(config, "audio_processor")  # Audio processing capabilities
-            or (
-                hasattr(config, "embd_layer") and hasattr(config.embd_layer, "image_embd_layer")
-            )  # Image embedding layers
-        )
+        is_vlm = (
+            hasattr(config, "vision_config")  # Standard vision config (e.g., Qwen2.5-VL)
+            or hasattr(full_model, "language_model")  # Language model attribute (e.g., LLaVA)
+            or hasattr(full_model, "vision_model")  # Explicit vision backbone on the model
+            or getattr(config, "model_type", "") == "phi4mm"  # Phi-4 multimodal
+            or hasattr(config, "vision_lora")  # Vision LoRA configurations
+            or (
+                hasattr(config, "embd_layer") and hasattr(config.embd_layer, "image_embd_layer")
+            )  # Image embedding layers
+        )

589-591: Prefer saving the in-memory config to avoid an extra network call and preserve local tweaks.

AutoConfig.from_pretrained may re-download; saving full_model.config is faster and captures any in-process edits.

-            AutoConfig.from_pretrained(
-                args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
-            ).save_pretrained(export_path)
+            full_model.config.save_pretrained(export_path)
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 0b2ec22 and d36dca7.

📒 Files selected for processing (1)
  • examples/llm_ptq/hf_ptq.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
examples/llm_ptq/hf_ptq.py (1)
modelopt/torch/export/unified_export_megatron.py (1)
  • save_pretrained (298-462)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: build-docs
  • GitHub Check: code-quality
🔇 Additional comments (2)
examples/llm_ptq/hf_ptq.py (2)

28-31: Add AutoConfig/AutoProcessor imports — OK.

These are required for saving configs/processors later. No issues.


577-577: Phi-4 MM special-casing looks good.

Explicitly checking model_type == "phi4mm" addresses the earlier concern that not all VLMs expose vision_config.

@Edwardf0t1 Edwardf0t1 self-assigned this Sep 4, 2025
@Edwardf0t1 Edwardf0t1 requested a review from a team as a code owner September 5, 2025 00:22
@Edwardf0t1 Edwardf0t1 requested a review from cjluo-nv September 5, 2025 00:22

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
modelopt/torch/export/model_utils.py (1)

74-111: Simplify model_type check and guard against missing configs

  • Change signature to def is_multimodal_model(model) -> bool and retrieve config with config = getattr(model, "config", None); return False if config is None.
  • Normalize model_type with mt = str(getattr(config, "model_type", "")).lower() and compare directly to "phi4mm" (the canonical value) instead of a broader alias set.
  • Handle embd_layer being either an object or dict:
    embd_layer = getattr(config, "embd_layer", None)
    has_image_emb = (
        hasattr(embd_layer, "image_embd_layer")
        or (isinstance(embd_layer, dict) and "image_embd_layer" in embd_layer)
    )
  • Ensure attribute-based heuristics verify that attributes aren’t just present but non-None (e.g., getattr(config, "vision_config", None) is not None).
  • Retain checks for vision_tower, vision_model, multi_modal_projector/mm_projector, language_model, vision_lora, and audio_processor to cover other multimodal cases.
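
An illustrative sketch combining these suggestions follows; the attribute names are the ones listed above, and this is not the merged implementation.

def is_multimodal_model(model) -> bool:
    config = getattr(model, "config", None)
    if config is None:
        return False

    # Normalize model_type and compare against the canonical Phi-4 MM value.
    mt = str(getattr(config, "model_type", "")).lower()

    # embd_layer may be an object or a plain dict depending on the model family.
    embd_layer = getattr(config, "embd_layer", None)
    has_image_emb = hasattr(embd_layer, "image_embd_layer") or (
        isinstance(embd_layer, dict) and "image_embd_layer" in embd_layer
    )

    # Prefer "is not None" over bare hasattr so placeholder None attributes don't count.
    return (
        getattr(config, "vision_config", None) is not None
        or getattr(model, "language_model", None) is not None
        or getattr(model, "vision_tower", None) is not None
        or getattr(model, "vision_model", None) is not None
        or getattr(model, "multi_modal_projector", None) is not None
        or getattr(model, "mm_projector", None) is not None
        or mt == "phi4mm"
        or getattr(config, "vision_lora", None) is not None
        or getattr(config, "audio_processor", None) is not None
        or has_image_emb
    )
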
examples/llm_ptq/hf_ptq.py (1)

573-593: Make config/processor export resilient and align logging.

Avoid hard failures when offline/missing processor and keep messages consistent with warnings.warn. Also rename local flag for clarity.

Apply:

-        # Check if the model is a multimodal/VLM model
-        is_vlm = is_multimodal_model(full_model)
+        # Check if the model is a multimodal (VLM or audio-text) model
+        is_multimodal = is_multimodal_model(full_model)
 
-        if is_vlm:
+        if is_multimodal:
             # Save original model config and the processor config to the export path for VLMs.
-            print(f"Saving original model config to {export_path}")
-
-            AutoConfig.from_pretrained(
-                args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
-            ).save_pretrained(export_path)
+            try:
+                print(f"Saving original model config to {export_path}")
+                AutoConfig.from_pretrained(
+                    args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
+                ).save_pretrained(export_path)
+            except (OSError, ValueError) as e:
+                warnings.warn(f"Could not save original model config: {e}")
 
             # Try to save processor config if available
-            try:
-                print(f"Saving processor config to {export_path}")
-                AutoProcessor.from_pretrained(
-                    args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
-                ).save_pretrained(export_path)
-            except Exception as e:
-                print(f"Warning: Could not save processor config: {e}")
-                print("This is normal for some VLM architectures that don't use AutoProcessor")
+            try:
+                proc = AutoProcessor.from_pretrained(
+                    args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
+                )
+                print(f"Saving processor config to {export_path}")
+                proc.save_pretrained(export_path)
+            except (OSError, ValueError, ImportError) as e:
+                warnings.warn(
+                    f"Could not save processor config: {e}. Some multimodal architectures don't use AutoProcessor."
+                )

If you prefer to keep the is_vlm name, consider updating user-facing prints to say “multimodal” for audio-text models (e.g., Whisper).

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between d36dca7 and befce1e.

📒 Files selected for processing (2)
  • examples/llm_ptq/hf_ptq.py (3 hunks)
  • modelopt/torch/export/model_utils.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
examples/llm_ptq/hf_ptq.py (2)
modelopt/torch/export/model_utils.py (1)
  • is_multimodal_model (74-111)
modelopt/torch/export/unified_export_megatron.py (1)
  • save_pretrained (298-462)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: linux
  • GitHub Check: build-docs
  • GitHub Check: code-quality
🔇 Additional comments (3)
modelopt/torch/export/model_utils.py (1)

63-63: Public API export updated correctly.

Adding is_multimodal_model to __all__ is appropriate.

examples/llm_ptq/hf_ptq.py (2)

28-31: Imports for HF config/processor look good.

Needed for exporting original configs for multimodal models.


44-45: Centralized detection via util is a good move.

Using is_multimodal_model avoids scattering heuristics.

@Edwardf0t1 Edwardf0t1 force-pushed the zhiyu/update-export-config-vlm branch from befce1e to 1fd1154 on September 5, 2025 00:34

codecov bot commented Sep 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.95%. Comparing base (29b4cf2) to head (2e75ee3).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #286   +/-   ##
=======================================
  Coverage   73.94%   73.95%           
=======================================
  Files         172      172           
  Lines       17405    17405           
=======================================
+ Hits        12870    12871    +1     
+ Misses       4535     4534    -1     

☔ View full report in Codecov by Sentry.

@Edwardf0t1 Edwardf0t1 force-pushed the zhiyu/update-export-config-vlm branch from 1fd1154 to 2e75ee3 on September 5, 2025 01:15

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
examples/llm_ptq/hf_ptq.py (2)

576-583: Avoid clobbering HF export config; write original config to a subdir (or gate by export_fmt).

When export_fmt == "hf", export_hf_checkpoint will likely write its own config.json, overwriting the “original” one just saved here. Save originals under a subdirectory to preserve both, or only save originals for the TRT-LLM path.

Apply:

-        if is_vlm:
-            # Save original model config and the processor config to the export path for VLMs.
-            print(f"Saving original model config to {export_path}")
+        if is_vlm:
+            # Preserve the original HF configs without clobbering HF export outputs.
+            cfg_dir = export_path if args.export_fmt == "tensorrt_llm" else f"{export_path}/original_hf_config"
+            print(f"Saving original model config to {cfg_dir}")
 
-            AutoConfig.from_pretrained(
-                args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
-            ).save_pretrained(export_path)
+            AutoConfig.from_pretrained(
+                args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
+            ).save_pretrained(cfg_dir)

584-593: Narrow the exception scope and prefer warnings for non-fatal save failures.

Catching Exception is broad. Align with other exporters and only catch expected errors; emit a warning.

Apply:

-            # Try to save processor config if available
-            try:
-                print(f"Saving processor config to {export_path}")
-                AutoProcessor.from_pretrained(
-                    args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
-                ).save_pretrained(export_path)
-            except Exception as e:
-                print(f"Warning: Could not save processor config: {e}")
-                print("This is normal for some VLM architectures that don't use AutoProcessor")
+            # Try to save processor config if available
+            try:
+                print(f"Saving processor config to {cfg_dir}")
+                AutoProcessor.from_pretrained(
+                    args.pyt_ckpt_path, trust_remote_code=args.trust_remote_code
+                ).save_pretrained(cfg_dir)
+            except (OSError, ValueError, ImportError) as e:
+                warnings.warn(
+                    f"Could not save processor config: {e}. "
+                    "Some VLM architectures don't use AutoProcessor."
+                )
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 1fd1154 and 2e75ee3.

📒 Files selected for processing (2)
  • examples/llm_ptq/hf_ptq.py (3 hunks)
  • modelopt/torch/export/model_utils.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • modelopt/torch/export/model_utils.py
🧰 Additional context used
🧬 Code graph analysis (1)
examples/llm_ptq/hf_ptq.py (2)
modelopt/torch/export/model_utils.py (1)
  • is_multimodal_model (74-111)
modelopt/torch/export/unified_export_megatron.py (1)
  • save_pretrained (298-462)
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: linux
  • GitHub Check: wait-checks / wait
  • GitHub Check: build-docs
  • GitHub Check: code-quality
🔇 Additional comments (3)
examples/llm_ptq/hf_ptq.py (3)

27-34: Imports for AutoConfig/AutoProcessor look good.

These are used in the new VLM handling block; no issues.


44-45: Good move: centralize VLM detection via util.

Using is_multimodal_model aligns with earlier feedback to factor this out and should reduce false negatives across VLM families.


573-575: Correct detection scope.

Checking full_model (not the possibly narrowed model) is the right choice for robust VLM detection.

@Edwardf0t1 Edwardf0t1 merged commit 78008ac into main Sep 5, 2025
19 of 20 checks passed
@Edwardf0t1 Edwardf0t1 deleted the zhiyu/update-export-config-vlm branch September 5, 2025 02:53