
Fix tensor device mismatch in deepseek #1029

Open
kamil-kaczor wants to merge 4 commits into releases/v0.15.1 from fix_deepseek

Conversation

@kamil-kaczor
Collaborator

No description provided.

Signed-off-by: Kamil Kaczor <kamil.kaczor@intel.com>
Contributor

Copilot AI left a comment


Pull request overview

This pull request fixes a tensor device mismatch in the DeepSeek rotary embedding implementation that occurs under INC (Intel Neural Compressor) quantization. Upstream code hardcodes device placement to HPU, but INC quantization workflows load weights to CPU first (via load_config.device="cpu"), so utility functions that create tensors on a different device trigger runtime errors.

Changes:

  • Import the yarn_find_correction_range and yarn_linear_ramp_mask utility functions from vLLM's rotary embedding common module
  • Override the _compute_inv_freq method to remove the explicit device specification, so that tensors follow the torch.set_default_device context
  • Override the _compute_cos_sin_cache method to keep device placement consistent with the _compute_inv_freq implementation
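
The effect of dropping the explicit device argument can be sketched in a few lines of PyTorch. This is an illustration only, not the code from this PR: the two function names are hypothetical, "cpu" stands in for the hardcoded accelerator device, and the "meta" device stands in for whatever default device the loading workflow has selected. It assumes PyTorch 2.0 or later, where torch.device can be used as a context manager.

```python
import torch


def inv_freq_hardcoded(base: float, rotary_dim: int) -> torch.Tensor:
    # Hypothetical helper: pins the tensor to a fixed device, ignoring
    # whatever default device the caller has set ("cpu" is a stand-in
    # for a hardcoded accelerator device).
    exponents = torch.arange(0, rotary_dim, 2, dtype=torch.float, device="cpu") / rotary_dim
    return 1.0 / (base ** exponents)


def inv_freq_default(base: float, rotary_dim: int) -> torch.Tensor:
    # Hypothetical helper: no device argument, so the tensor is created
    # on the current default device.
    exponents = torch.arange(0, rotary_dim, 2, dtype=torch.float) / rotary_dim
    return 1.0 / (base ** exponents)


# "meta" is a stand-in for the default device chosen by the loader.
with torch.device("meta"):
    pinned = inv_freq_hardcoded(10000.0, 8)
    following = inv_freq_default(10000.0, 8)

print(pinned.device.type)     # stays on the hardcoded device
print(following.device.type)  # follows the surrounding context
```

Under these assumptions, only the second helper produces tensors on the device selected by the surrounding context, which is the behavior the overrides listed above aim for.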


Signed-off-by: Kamil Kaczor <kamil.kaczor@intel.com>
