Preserve original rope scaling type in export due to transformers library AutoConfig issue #452
base: main
Conversation
Walkthrough: Adds a unified model path resolver and config fetch utilities, uses them in examples, exposes model_path_utils via torch utils, stores the original checkpoint path in the low-memory PTQ load, and restores the original rope_scaling configuration during export.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Example as Example CLI
    participant Resolver as ModelPathResolver / resolve_model_path
    participant Local as Local FS / HF Hub
    Example->>Resolver: _resolve_model_path(model_ref, download_files=True,...)
    alt model_ref is local dir
        Resolver->>Local: return local path
    else model_ref looks like HF ID
        Resolver->>Local: attempt AutoConfig / hf_hub_download (allow_patterns)
        alt download success
            Local-->>Resolver: files/config path
        else fallback
            Resolver-->>Example: warn and return input or raise (depending on flags)
        end
    end
```
```mermaid
sequenceDiagram
    participant PTQ as PTQ Loader
    participant Model as Model
    participant Export as HF Exporter
    participant Fetch as fetch_model_config / Orig FS
    participant Config as Export Config
    PTQ->>Model: load quantized fallback (low_memory_mode)
    PTQ->>Model: set _original_model_path
    Export->>Model: check _original_model_path
    alt original path present
        Export->>Fetch: fetch original config.json (local or HF)
        Fetch-->>Export: original config
        Export->>Config: restore_original_rope_scaling(config_data, path)
        Config->>Config: replace rope_scaling / text_config.rope_scaling if conditions met
    else no original path
        Note over Export: skip restoration
    end
    Export->>Config: write exported config
```
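For orientation, the export-side hook shown in the second diagram might look roughly like this (a minimal sketch, not the actual `unified_export_hf.py` code; the `write_exported_config` helper and its arguments are assumptions, and the import path is assumed from the files listed in this PR):

```python
import json
import os

# Import path assumed from the files changed in this PR.
from modelopt.torch.export.model_config_utils import restore_original_rope_scaling


def write_exported_config(model, config_data: dict, export_dir: str) -> None:
    """Restore rope_scaling from the original checkpoint, then write config.json."""
    # The PTQ loader stores the original checkpoint path on the model object.
    original_model_path = getattr(model, "_original_model_path", None)
    if original_model_path is not None:
        # Fetches the original config.json (local dir or HF Hub) and swaps a
        # transformers-rewritten "default" rope_scaling back to the original "mrope".
        config_data = restore_original_rope_scaling(config_data, original_model_path)
    # If no original path was recorded, restoration is skipped and the config is written as-is.
    with open(os.path.join(export_dir, "config.json"), "w") as f:
        json.dump(config_data, f, indent=2)
```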
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
…rary AutoConfig issue Signed-off-by: Zhiyu Cheng <[email protected]>
489e6db to 0c4194e
Actionable comments posted: 1
🧹 Nitpick comments (6)
examples/llm_ptq/hf_ptq.py (2)
252-253: Consider extracting duplicated logic into a helper function. The same logic for storing `_original_model_path` appears in both the regular and low-memory code paths. Consider extracting this into a helper function to reduce duplication and ensure consistency.

```diff
+def _store_original_model_path(model, pyt_ckpt_path):
+    """Store original model path for config restoration during export."""
+    model._original_model_path = pyt_ckpt_path
+
 def main(args):
     ...
     if not args.low_memory_mode:
         model = get_model(...)
-        # Store original model path for config restoration
-        model._original_model_path = args.pyt_ckpt_path
+        _store_original_model_path(model, args.pyt_ckpt_path)
     else:
         ...
         model = AutoModelForCausalLM.from_pretrained(...)
-        # Store original model path for config restoration
-        model._original_model_path = args.pyt_ckpt_path
+        _store_original_model_path(model, args.pyt_ckpt_path)
```

Also applies to: 275-276
252-253: Consider using `_modelopt_original_path` instead of `_original_model_path` to avoid convention conflicts. HuggingFace's PretrainedConfig explicitly avoids storing attributes with underscore-prefixed keys (private attributes), reflecting their design convention. While the current implementation works (the attribute is only stored on the model object, not persisted to config, and is read defensively with `getattr()`), adopting a namespaced prefix like `_modelopt_original_path` would align better with HuggingFace conventions and reduce fragility. Update both assignment locations (lines 253 and 276 in `examples/llm_ptq/hf_ptq.py`) and the read location (`modelopt/torch/export/unified_export_hf.py:546`).

modelopt/torch/export/model_config_utils.py (4)
230-275: Clarify the in-place modification + return value pattern. The function modifies `config_data` in-place (lines 263, 271) but also returns it (line 275). The docstring states "modified in-place" on line 242, which makes the return value redundant. This pattern can be confusing for callers. Consider either:

- Modifying in-place only and returning `None`
- Creating and returning a modified copy without mutating the input (see the sketch below)
- Clarifying in the docstring that both behaviors are intentional
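For comparison, the copy-returning alternative could look like this (purely illustrative; the function name and simplified condition are hypothetical, not part of the PR):

```python
import copy


def restore_rope_scaling_pure(config_data: dict, original_rope_scaling: dict) -> dict:
    """Return a restored copy without mutating the caller's dict."""
    restored = copy.deepcopy(config_data)
    # Only replace a rope_scaling entry that transformers rewrote to "default".
    if restored.get("rope_scaling", {}).get("type") == "default":
        restored["rope_scaling"] = original_rope_scaling
    return restored
```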
244-246: Consider moving imports to module level. While importing inside the function works, it's unconventional and may cause slight performance overhead on repeated calls. Since these are standard library modules (json, warnings, pathlib), consider moving them to the module-level imports.

Apply this diff:

```diff
 """Common utils for the ModelConfig."""

 import dataclasses
+import json
 import math
+import warnings
 from types import UnionType
 from typing import Union, get_args, get_origin
+from pathlib import Path
```

Then remove the imports from inside the function:

```diff
 def restore_original_rope_scaling(config_data: dict, original_model_path: str) -> dict:
     """..."""
-    import json
-    import warnings
-    from pathlib import Path
-
     try:
```
254-261: Document the rope_scaling modification detection heuristic. The condition on line 260 checks for `"rope_type" in config_data["rope_scaling"]` as part of detecting the transformers library's modification. This heuristic could benefit from a comment explaining why the presence of `rope_type` indicates that transformers has modified the config.

```diff
             # Check if rope_scaling was modified from mrope to default
             if (
                 "rope_scaling" in raw_original_config
                 and "rope_scaling" in config_data
                 and raw_original_config["rope_scaling"].get("type") == "mrope"
                 and config_data["rope_scaling"].get("type") == "default"
+                # rope_type field is added by transformers when converting mrope to default
                 and "rope_type" in config_data["rope_scaling"]
             ):
```
248-271: Consider adding debug logging for restoration decisions. The function silently does nothing if the rope_scaling modification conditions aren't met. Adding optional debug logging would help users understand when and why restoration is skipped, aiding troubleshooting.

```diff
         if original_config_file.exists():
             with open(original_config_file) as f:
                 raw_original_config = json.load(f)

             # Check if rope_scaling was modified from mrope to default
             if (
                 "rope_scaling" in raw_original_config
                 and "rope_scaling" in config_data
                 and raw_original_config["rope_scaling"].get("type") == "mrope"
                 and config_data["rope_scaling"].get("type") == "default"
                 and "rope_type" in config_data["rope_scaling"]
             ):
                 print(f"Restoring original rope_scaling configuration from {original_model_path}")
                 config_data["rope_scaling"] = raw_original_config["rope_scaling"]

                 # Also restore rope_scaling in text_config if it exists
                 if (
                     "text_config" in config_data
                     and "rope_scaling" in config_data["text_config"]
                     and config_data["text_config"]["rope_scaling"].get("type") == "default"
                 ):
                     config_data["text_config"]["rope_scaling"] = raw_original_config["rope_scaling"]
+            else:
+                # Optionally log when conditions aren't met for debugging
+                pass
+        else:
+            warnings.warn(f"Original config file not found at {original_config_file}")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- examples/llm_ptq/hf_ptq.py (2 hunks)
- modelopt/torch/export/model_config_utils.py (1 hunks)
- modelopt/torch/export/unified_export_hf.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
modelopt/torch/export/unified_export_hf.py (1)
modelopt/torch/export/model_config_utils.py (1)
restore_original_rope_scaling(230-275)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: wait-checks / wait
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
🔇 Additional comments (2)
modelopt/torch/export/unified_export_hf.py (2)
57-57: LGTM - Clean import of restoration utility. The import statement properly brings in the rope_scaling restoration function from the utils module.
545-548: LGTM - Proper defensive coding with attribute check. The code correctly uses `getattr` with a default of `None` to safely check for the `_original_model_path` attribute before attempting restoration. The placement before writing the config is appropriate.
Codecov Report ❌

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #452      +/-   ##
==========================================
- Coverage   73.37%   73.07%   -0.31%
==========================================
  Files         180      181       +1
  Lines       17937    18096     +159
==========================================
+ Hits        13162    13223      +61
- Misses       4775     4873      +98
```

☔ View full report in Codecov by Sentry.
```python
    import json
    import warnings
    from pathlib import Path
```
we can move these imports to the top
done
examples/llm_ptq/hf_ptq.py (Outdated)
```python
        # Store original model path for config restoration
        model._original_model_path = args.pyt_ckpt_path
```
We can move this above if-else instead of having it in both if and else conditions
Good catch, moved it below if-else.
```python
    import warnings
    from pathlib import Path

    try:
```
can we just do a copy of the config.json and keep it the same as before?
I prefer the current solution as we can't keep it the same because we need to add the quantization config into config.json.
Signed-off-by: Zhiyu Cheng <[email protected]>
Actionable comments posted: 1
🧹 Nitpick comments (4)
examples/llm_ptq/example_utils.py (2)
35-35: Import of unified resolver is good; drop now-unused snapshot_download. Given `_resolve_model_path` delegates to `resolve_model_path`, the earlier `snapshot_download` import block (lines 29-33) appears unused; remove it to avoid dead code.
262-276: Wrapper delegation LGTM; phasing out is minimal effort. Only one call site exists at line 295, so inlining `resolve_model_path` directly and removing the wrapper would require minimal refactoring. Keep the wrapper for now for backward compatibility, but consider this a straightforward cleanup task when convenient.

modelopt/torch/utils/model_path_utils.py (2)
172-209: Snapshot selection by lexicographic order can pick the wrong revision. Use directory mtime to select the most recent snapshot instead of `max()` on names.

Apply this diff:

```diff
-        if snapshot_dirs:
-            latest_snapshot = max(snapshot_dirs)  # Use lexicographically latest
-            snapshot_path = os.path.join(snapshots_dir, latest_snapshot)
+        if snapshot_dirs:
+            latest_snapshot = max(
+                snapshot_dirs,
+                key=lambda d: os.path.getmtime(os.path.join(snapshots_dir, d))
+            )
+            snapshot_path = os.path.join(snapshots_dir, latest_snapshot)
             return snapshot_path
```
211-265: fetch_model_config is a good building block; minor nit. `trust_remote_code` isn't used here, which is fine for `config.json`. Consider removing the param or documenting it as ignored.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- examples/llm_ptq/example_utils.py (3 hunks)
- modelopt/torch/export/model_config_utils.py (2 hunks)
- modelopt/torch/utils/__init__.py (1 hunks)
- modelopt/torch/utils/model_path_utils.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
examples/llm_ptq/example_utils.py (1)
modelopt/torch/utils/model_path_utils.py (1)
resolve_model_path(98-208)
modelopt/torch/export/model_config_utils.py (1)
modelopt/torch/utils/model_path_utils.py (3)
fetch_model_config (211-264), is_huggingface_model_id (51-95), is_huggingface_model_id (313-317)
modelopt/torch/utils/model_path_utils.py (1)
examples/diffusers/quantization/quantize.py (1)
model_path(222-226)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
🔇 Additional comments (3)
modelopt/torch/utils/model_path_utils.py (2)
98-171: Resolver flow is solid; guardrails and fallbacks are appropriate. Nice sequence: local → AutoConfig (cache trigger) → snapshot_download → cache scan → warn. No action.
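For readers skimming the PR, that resolution order can be sketched roughly as follows (simplified; the real `resolve_model_path` also probes `AutoConfig` to populate the cache and scans the local HF cache before giving up, and the function name and patterns below are illustrative):

```python
import os
import warnings


def resolve_model_path_sketch(model_ref: str, download_files: bool = True) -> str:
    """Simplified sketch of the resolution order: local dir -> download -> warn and fall back."""
    # 1. Local directory: use it directly.
    if os.path.isdir(model_ref):
        return model_ref
    if download_files:
        try:
            # 2./3. Pull config/weight files into the HF cache and return the snapshot path.
            from huggingface_hub import snapshot_download

            return snapshot_download(model_ref, allow_patterns=["*.json", "*.safetensors"])
        except Exception as e:  # network failure, gated repo, typo in the ID, ...
            warnings.warn(f"Could not download {model_ref}: {e}")
    # 4./5. Fall back to returning the input unchanged; downstream loading decides what to do.
    warnings.warn(f"Returning {model_ref} unresolved; downstream loading may fail.")
    return model_ref
```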
51-96: Review comment verified as accurate; no changes required. The function logic has been confirmed by tracing through the code:

- "org/model" → True (matches org/model format with exactly one slash)
- "/abs/path" → False (multiple slashes, treated as local path)
- "./rel", "../rel" → False (matches local path indicators)
- "C:/win/path" → False (matches C: indicator)
- "org/model/sub" → False (multiple slashes rejected)

The review correctly approves the heuristic. The function properly distinguishes HuggingFace model IDs (org/model format) from local paths, and correctly rejects multi-slash identifiers like `org/model/sub`, since HuggingFace's subfolder handling is a separate parameter, not part of the model ID itself.

modelopt/torch/utils/__init__.py (1)
25-25: No API collisions detected; re-export is safe. Verification confirms the 4 exported names (`ModelPathResolver`, `fetch_model_config`, `is_huggingface_model_id`, `resolve_model_path`) do not collide with any names in preceding utils modules. The module explicitly defines `__all__`, ensuring only intended public names are exported. The wildcard import pattern aligns with the existing 11 other wildcard imports in the file.
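The convention being approved, for reference (a sketch of the pattern, not the exact file contents):

```python
# model_path_utils.py -- an explicit __all__ limits what a wildcard import re-exports
__all__ = [
    "ModelPathResolver",
    "fetch_model_config",
    "is_huggingface_model_id",
    "resolve_model_path",
]

# In modelopt/torch/utils/__init__.py the module is then pulled in with
# `from .model_path_utils import *`, so only the names above become public.
```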
```python
def restore_original_rope_scaling(config_data: dict, original_model_path: str) -> dict:
    """Restore original rope_scaling configuration if it was modified by transformers.

    Some VLM models like Qwen2.5-VL have their rope_scaling configuration modified
    by the transformers library during loading (e.g., from "mrope" to "default" with
    additional fields). This function restores the original configuration.

    Args:
        config_data: The model configuration dictionary to restore
        original_model_path: Path to the original model directory or HuggingFace Hub model ID
            (e.g., "microsoft/DialoGPT-medium" or "/path/to/local/model")

    Returns:
        The config_data dictionary with restored rope_scaling (modified in-place)

    Note:
        This function automatically detects whether original_model_path is a local filesystem
        path or a HuggingFace Hub model ID. For Hub model IDs, it will fetch the config.json
        directly from the Hub. Requires huggingface_hub package for Hub model ID support.
    """
    try:
        raw_original_config = None

        # Check if original_model_path is a HuggingFace Hub model ID or local path
        if is_huggingface_model_id(original_model_path):
            # Try to fetch config from HuggingFace Hub
            raw_original_config = fetch_model_config(original_model_path)
        else:
            # Handle as local filesystem path
            original_config_file = Path(original_model_path) / "config.json"
            if original_config_file.exists():
                with open(original_config_file) as f:
                    raw_original_config = json.load(f)

        # If we successfully got the original config, proceed with restoration
        if raw_original_config is not None:
            # Check if rope_scaling was modified from mrope to default
            if (
                "rope_scaling" in raw_original_config
                and "rope_scaling" in config_data
                and raw_original_config["rope_scaling"].get("type") == "mrope"
                and config_data["rope_scaling"].get("type") == "default"
                and "rope_type" in config_data["rope_scaling"]
            ):
                print(f"Restoring original rope_scaling configuration from {original_model_path}")
                config_data["rope_scaling"] = raw_original_config["rope_scaling"]

                # Also restore rope_scaling in text_config if it exists
                if (
                    "text_config" in config_data
                    and "rope_scaling" in config_data["text_config"]
                    and config_data["text_config"]["rope_scaling"].get("type") == "default"
                ):
                    config_data["text_config"]["rope_scaling"] = raw_original_config["rope_scaling"]
        elif is_huggingface_model_id(original_model_path):
            # Log that we couldn't find the original config
            warnings.warn(
                f"Could not fetch original config from HuggingFace Hub: {original_model_path}"
            )
        else:
            # Only warn if the local path was expected to exist
            original_config_file = Path(original_model_path) / "config.json"
            if not original_config_file.exists():
                warnings.warn(f"Original config file not found: {original_config_file}")
    except Exception as e:
        warnings.warn(f"Could not restore original rope_scaling configuration: {e}")

    return config_data
```
🧩 Analysis chain
Restore logic misses text_config-only rope_scaling; unify fetch and avoid print.
If the original config has rope_scaling only under text_config (common in VLMs), current logic won’t restore it because it only checks raw_original_config["rope_scaling"]. Also, prefer using fetch_model_config for both Hub/local and avoid print in library code.
Apply this diff to make restoration robust and logging consistent:
```diff
@@
-def restore_original_rope_scaling(config_data: dict, original_model_path: str) -> dict:
+def restore_original_rope_scaling(config_data: dict, original_model_path: str) -> dict:
@@
- try:
- raw_original_config = None
-
- # Check if original_model_path is a HuggingFace Hub model ID or local path
- if is_huggingface_model_id(original_model_path):
- # Try to fetch config from HuggingFace Hub
- raw_original_config = fetch_model_config(original_model_path)
- else:
- # Handle as local filesystem path
- original_config_file = Path(original_model_path) / "config.json"
- if original_config_file.exists():
- with open(original_config_file) as f:
- raw_original_config = json.load(f)
+ try:
+ # Always use unified fetcher (handles both Hub and local)
+ raw_original_config = fetch_model_config(original_model_path)
@@
- if raw_original_config is not None:
- # Check if rope_scaling was modified from mrope to default
- if (
- "rope_scaling" in raw_original_config
- and "rope_scaling" in config_data
- and raw_original_config["rope_scaling"].get("type") == "mrope"
- and config_data["rope_scaling"].get("type") == "default"
- and "rope_type" in config_data["rope_scaling"]
- ):
- print(f"Restoring original rope_scaling configuration from {original_model_path}")
- config_data["rope_scaling"] = raw_original_config["rope_scaling"]
-
- # Also restore rope_scaling in text_config if it exists
- if (
- "text_config" in config_data
- and "rope_scaling" in config_data["text_config"]
- and config_data["text_config"]["rope_scaling"].get("type") == "default"
- ):
- config_data["text_config"]["rope_scaling"] = raw_original_config["rope_scaling"]
- elif is_huggingface_model_id(original_model_path):
- # Log that we couldn't find the original config
- warnings.warn(
- f"Could not fetch original config from HuggingFace Hub: {original_model_path}"
- )
- else:
- # Only warn if the local path was expected to exist
- original_config_file = Path(original_model_path) / "config.json"
- if not original_config_file.exists():
- warnings.warn(f"Original config file not found: {original_config_file}")
+ if raw_original_config is not None:
+ # Original rope may live at top-level or under text_config
+ orig_rope = (
+ raw_original_config.get("rope_scaling")
+ or raw_original_config.get("text_config", {}).get("rope_scaling")
+ )
+ if isinstance(orig_rope, dict) and orig_rope.get("type") == "mrope":
+ def _should_restore(cur: dict | None) -> bool:
+ return isinstance(cur, dict) and cur.get("type") == "default"
+
+ restored = False
+ # Top-level
+ if _should_restore(config_data.get("rope_scaling")):
+ config_data["rope_scaling"] = orig_rope
+ restored = True
+ # Nested text_config
+ if "text_config" in config_data and _should_restore(
+ config_data["text_config"].get("rope_scaling")
+ ):
+ config_data["text_config"]["rope_scaling"] = orig_rope
+ restored = True
+ if restored:
+ warnings.warn(
+ f"Restored original rope_scaling (mrope) from {original_model_path}"
+ )
+ else:
+ warnings.warn(f"Could not load original config for: {original_model_path}")
@@
- except Exception as e:
+ except Exception as e:
         warnings.warn(f"Could not restore original rope_scaling configuration: {e}")
```

🏁 Script executed:
```bash
#!/bin/bash
# 1) Ensure there is only one definition of restore_original_rope_scaling
rg -nP --type=py '^\s*def\s+restore_original_rope_scaling\s*\('
# 2) Grep for call sites to confirm both top-level and text_config cases may occur
rg -n 'text_config"|'\''text_config'\''|\brope_scaling\b' --type=json -g 'config.json' -S -C1 || true
```

Length of output: 284
I need to verify whether fetch_model_config() actually supports both Hub model IDs and local paths, and confirm the described logic gap.
🏁 Script executed:

```bash
#!/bin/bash
# 1) Find the definition of fetch_model_config
rg -nP --type=py '^\s*def\s+fetch_model_config\s*\(' -A 15
# 2) Check if current code uses print() anywhere
rg -n 'print\s*\(' modelopt/torch/export/model_config_utils.py
# 3) Check the imports at the top of the file to understand available helpers
head -50 modelopt/torch/export/model_config_utils.py
```

Length of output: 2873
🏁 Script executed:

```bash
#!/bin/bash
# 1) Get full implementation of fetch_model_config
rg -nP --type=py '^\s*def\s+fetch_model_config\s*\(' -A 50 modelopt/torch/utils/model_path_utils.py | head -70
# 2) Extract the exact restoration logic from restore_original_rope_scaling
sed -n '250,280p' modelopt/torch/export/model_config_utils.py
```

Length of output: 3560
Replace redundant fetch logic with unified call; fix logic gap to restore text_config-only rope_scaling; replace print with warning.
The review comment is correct. Verification confirms:
- Logic gap exists: Current code only checks `"rope_scaling" in raw_original_config` at the top level. If the original config has rope_scaling only nested under `text_config` (common in VLMs), the entire restoration block is skipped.
- Redundant branching: `fetch_model_config()` (imported at line 35) already handles both HuggingFace Hub IDs and local paths internally. Current code at lines 254–266 duplicates this by branching on `is_huggingface_model_id()` again.
- print() in library code: Line 278 uses `print()` instead of `warnings.warn()`, which is inappropriate for library code.
The suggested refactor correctly:

- Unifies to always use `fetch_model_config()`
- Checks both top-level and `text_config`-nested `rope_scaling`
- Replaces `print()` with `warnings.warn()`
Apply the provided diff to resolve all three issues.
🤖 Prompt for AI Agents
In modelopt/torch/export/model_config_utils.py around lines 234-302, replace the
redundant branching that separately handles Hub vs local paths and the print()
call: always call fetch_model_config(original_model_path) (it already handles
Hub and local), then if raw_original_config is not None check and restore
rope_scaling from either raw_original_config["rope_scaling"] or
raw_original_config.get("text_config", {}).get("rope_scaling") (apply to both
config_data top-level and config_data["text_config"] if present) when types
indicate original was "mrope" and current is "default"; replace the print(...)
with warnings.warn(...); if fetch_model_config returns None warn that original
config could not be retrieved.
What does this PR do?
Type of change: bugfix
Overview:
The rope scaling type is changed to `default` in the exported `config.json` when exporting the Qwen2.5-VL model.

The issue is that the transformers library itself modifies the rope_scaling configuration when it loads the Qwen2.5-VL model. This happens during the `AutoConfig.from_pretrained()` call, likely in the Qwen2.5-VL model's configuration processing code.

This PR preserves the original rope scaling type.
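One way to observe the underlying behavior (a hedged sketch; the model ID below is illustrative, and depending on the transformers version rope_scaling may live under `text_config` rather than the top level):

```python
import json

from huggingface_hub import hf_hub_download
from transformers import AutoConfig

model_id = "Qwen/Qwen2.5-VL-7B-Instruct"  # illustrative checkpoint

# rope_scaling as published in the raw config.json.
with open(hf_hub_download(model_id, "config.json")) as f:
    raw_rope = json.load(f).get("rope_scaling")

# rope_scaling after transformers' AutoConfig processing.
cfg = AutoConfig.from_pretrained(model_id)
processed_rope = getattr(cfg, "rope_scaling", None) or getattr(
    getattr(cfg, "text_config", None), "rope_scaling", None
)

print("raw:      ", raw_rope)        # published config reports type "mrope"
print("processed:", processed_rope)  # may come back as "default" with extra fields
```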
Usage

# Add a code snippet demonstrating how to use this
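A hedged sketch of exercising the fix directly (placeholder paths; the import path is assumed from the files in this PR). In the PR itself this restoration runs automatically inside the HF exporter once `model._original_model_path` is set:

```python
import json

from modelopt.torch.export.model_config_utils import restore_original_rope_scaling

exported_config = "/path/to/exported/checkpoint/config.json"  # placeholder path
original_checkpoint = "Qwen/Qwen2.5-VL-7B-Instruct"           # Hub ID or local directory

with open(exported_config) as f:
    config_data = json.load(f)

# Swaps a transformers-rewritten "default" rope_scaling back to the original
# "mrope" entry fetched from the original checkpoint's config.json.
config_data = restore_original_rope_scaling(config_data, original_checkpoint)

with open(exported_config, "w") as f:
    json.dump(config_data, f, indent=2)
```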
Testing

Before your PR is "Ready for review"
Additional Information
Summary by CodeRabbit
New Features
Bug Fixes
Chores