Skip to content

Conversation

gcunhase
Copy link
Contributor

@gcunhase gcunhase commented Sep 9, 2025

What does this PR do?

Type of change: Bug fix

Overview: Fixed 'Invalid tensor data type 0' in ONNX Autocast by ensuring that intermediate tensors have a defined data type.

Usage

$ python -m modelopt.onnx.quantization --onnx_path=$MODEL_NAME.onnx --high_precision_dtype=fp16

Testing

Internal model.

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: N/A
  • Did you add or update any necessary documentation?: N/A
  • Did you update Changelog?: No

Additional Information

Related: #302

Summary by CodeRabbit

  • Bug Fixes

    • Preserve known static dimensions during conversion to avoid unintended shape changes.
    • Fill undefined tensor element types after initial inference to prevent downstream errors.
  • Improvements

    • Align shape handling between intermediate values and outputs for consistent behavior.
    • Improve robustness of shape/type inference while keeping public interfaces unchanged.

@gcunhase gcunhase requested a review from a team as a code owner September 9, 2025 14:04
@gcunhase gcunhase requested a review from i-riyad September 9, 2025 14:04
Copy link

coderabbitai bot commented Sep 9, 2025

Walkthrough

Internal updates to PrecisionConverter.convert() refine shape sanitization to preserve known static dims, add a post-inference step to replace UNDEFINED tensor element types with the target low-precision type, and align handling of graph value_info and outputs. A new private helper _ensure_types_are_defined() is added. Public API unchanged.

Changes

Cohort / File(s) Summary
ONNX autocast precision handling
modelopt/onnx/autocast/precisionconverter.py
Adjust conditional dim_param updates to avoid overwriting known dim_value; call _ensure_types_are_defined() after an initial inference pass to replace UNDEFINED elem_type with the target low-precision ONNX type; apply the conservative sanitization to both value_info and outputs; add private helper _ensure_types_are_defined(); no public API changes.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User
  participant PC as PrecisionConverter
  participant SI as ShapeInference
  participant GI as Graph(value_info/outputs)

  U->>PC: convert(model, target_dtype)
  rect rgb(245,248,255)
    note right of PC: Initial inference without strict type checks
    PC->>SI: infer_shapes(check_type=false)
    SI-->>PC: shapes/types (may include UNDEFINED)
  end

  rect rgb(245,255,245)
    note right of PC: New step: ensure types defined
    PC->>GI: _ensure_types_are_defined()<br/>Replace UNDEFINED elem_type -> target low-precision
    GI-->>PC: value_info updated
  end

  rect rgb(255,248,240)
    note right of PC: Conservative shape sanitization
    PC->>GI: For each dim: set dim_param="unk" only when dim_value is concrete
    GI-->>PC: value_info & outputs aligned
  end

  PC-->>U: Converted model
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks (3 passed)

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title "[5452146] Fix: 'Invalid tensor data type 0'" succinctly captures the primary change, clearly referencing the specific error being addressed and aligning with the pull request’s focus on defining intermediate tensor data types to resolve it.
Description Check ✅ Passed The description directly addresses the bug fix by explaining that the PR ensures intermediate tensors have defined data types to resolve the “Invalid tensor data type 0” error and includes relevant usage and testing information consistent with the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Poem

I twitch my whiskers at dims that hide,
I hop through graphs where types collide.
I set the unknowns to tidy the trail,
Fill UNDEFINEDs so checks won't fail.
A tiny rabbit, precise and spry—🥕


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d0001ee and 5bd42c5.

📒 Files selected for processing (1)
  • modelopt/onnx/autocast/precisionconverter.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • modelopt/onnx/autocast/precisionconverter.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: linux
  • GitHub Check: wait-checks / wait
  • GitHub Check: build-docs
  • GitHub Check: code-quality
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gcunhase gcunhase force-pushed the dev/gcunhasergio/fix_invalid_type_0_5452146 branch from 87ca671 to d0001ee Compare September 9, 2025 14:06
@gcunhase gcunhase force-pushed the dev/gcunhasergio/fix_invalid_type_0_5452146 branch from d0001ee to 5bd42c5 Compare September 9, 2025 14:12
@gcunhase gcunhase self-assigned this Sep 9, 2025
@gcunhase gcunhase requested a review from galagam September 9, 2025 14:15
Copy link

codecov bot commented Sep 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.86%. Comparing base (a6fa34c) to head (5bd42c5).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #308      +/-   ##
==========================================
- Coverage   73.93%   73.86%   -0.07%     
==========================================
  Files         172      172              
  Lines       17408    17415       +7     
==========================================
- Hits        12870    12864       -6     
- Misses       4538     4551      +13     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@gcunhase gcunhase merged commit d6d2e75 into NVIDIA:main Sep 9, 2025
22 checks passed
benchislett pushed a commit that referenced this pull request Sep 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants