@zhihaofang1017 zhihaofang1017 commented Aug 13, 2025

What this PR does / why we need it?

Fix precision and inference length issues with MRoPE operator in multi-image long sequences for Qwen2.5-VL-7B

This PR addresses two critical issues observed with the MRoPE operator:

  1. Precision degradation in V1 model outputs
  2. Inference length limitations (only generating a few characters) when processing multi-image long sequences

Root cause analysis showed that most CANN operators internally call contiguous() on their inputs so that the computation reads contiguous data. The "positions" tensor, however, was passed to the operator without this step, which led to incorrect memory access and corrupted values during the operator's calculations.

The fix adds a contiguous() operation on the positions tensor at the Python level, ensuring proper memory layout consistency with other tensor operations.
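
To illustrate why a missing contiguous() can corrupt values, here is a minimal pure-Python sketch of strided storage. It is a toy model, not PyTorch or CANN code: `StridedView` and `naive_kernel_read` are hypothetical names, and the scenario (a `positions` view of shape (3, num_tokens) sliced out of a larger (3, max_tokens) buffer) is an assumption made for illustration, not something stated in this PR.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StridedView:
    """Toy 2-D tensor: flat storage plus shape and strides (in elements)."""
    storage: List[int]
    shape: Tuple[int, int]
    strides: Tuple[int, int]

    def is_contiguous(self) -> bool:
        # Packed row-major layout means strides == (num_cols, 1).
        _, cols = self.shape
        return self.strides == (cols, 1)

    def to_rows(self) -> List[List[int]]:
        # Stride-aware read: this is what the view logically contains.
        rows, cols = self.shape
        s0, s1 = self.strides
        return [[self.storage[r * s0 + c * s1] for c in range(cols)]
                for r in range(rows)]

    def contiguous(self) -> "StridedView":
        # Copy the logical contents into freshly packed storage.
        if self.is_contiguous():
            return self
        _, cols = self.shape
        packed = [v for row in self.to_rows() for v in row]
        return StridedView(packed, self.shape, (cols, 1))


def naive_kernel_read(view: StridedView) -> List[List[int]]:
    """Hypothetical kernel that ignores strides and assumes packed memory."""
    rows, cols = view.shape
    return [view.storage[r * cols:(r + 1) * cols] for r in range(rows)]


MAX_TOKENS, NUM_TOKENS = 8, 3
# Backing buffer of shape (3, MAX_TOKENS); row r holds 100 * r + token index.
buffer = [100 * r + t for r in range(3) for t in range(MAX_TOKENS)]
# Slicing the first NUM_TOKENS columns keeps the row stride of MAX_TOKENS.
positions = StridedView(buffer, (3, NUM_TOKENS), (MAX_TOKENS, 1))

expected = positions.to_rows()
wrong = naive_kernel_read(positions)               # reads across row boundaries
fixed = naive_kernel_read(positions.contiguous())  # matches the logical values
```

In this model the stride-unaware read returns values from the wrong rows until the view is repacked, which mirrors the kind of corruption described above. The actual NPU kernel's behavior may differ in detail.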

Does this PR introduce any user-facing change?

no

How was this patch tested?

Compared the accuracy of V0 and V1; the error between them meets the standard.
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses a critical accuracy and stability issue with the MRoPE operator on Ascend NPUs. The root cause was correctly identified as a non-contiguous positions tensor being passed to the npu_mrope kernel, leading to incorrect memory access. The fix, which involves adding a .contiguous() call to the positions tensor before the kernel invocation, is direct, correct, and consistent with how other tensor arguments are handled in the same function call. This change effectively resolves the reported problem.

@zhihaofang1017 zhihaofang1017 closed this by deleting the head repository Aug 15, 2025