fix(models): apply embedding_multiplier to inputs_embeds in GraniteMoeHybrid #35026

Closed
nightcityblade wants to merge 1 commit into vllm-project:main from nightcityblade:fix/issue-34812

Conversation

@nightcityblade

Summary

Fixes #34812

When inputs_embeds is provided to GraniteMoeHybridModel.forward(), the embedding_multiplier scaling was only applied in the input_ids branch, causing garbage output when using input embeddings (e.g. enable_prompt_embeds=True).

Fix

Move the embedding_multiplier scaling outside the if/else branch so it applies to hidden_states regardless of whether it came from input_ids or inputs_embeds. This aligns with how granite.py and granitemoe.py already handle it correctly.
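The one-line move described above can be illustrated with a toy sketch. This is not the vLLM source: the embedding table, the multiplier value, the order of the branches, and the names `embed`, `scale`, `forward_before`, and `forward_after` are all hypothetical stand-ins. It also assumes the caller's `inputs_embeds` are raw, unscaled embedding-layer outputs.

```python
from typing import List, Optional, Sequence

Vector = List[float]

# Hypothetical values for illustration only; the real multiplier comes from the model config.
EMBEDDING_MULTIPLIER = 12.0
EMBED_TABLE = {0: [0.1, 0.2], 1: [0.3, -0.4]}  # toy token id -> embedding vector


def embed(input_ids: Sequence[int]) -> List[Vector]:
    """Toy stand-in for the embedding layer lookup."""
    return [list(EMBED_TABLE[token_id]) for token_id in input_ids]


def scale(hidden_states: List[Vector], multiplier: float) -> List[Vector]:
    """Multiply every element by the embedding multiplier."""
    return [[x * multiplier for x in row] for row in hidden_states]


def forward_before(
    input_ids: Optional[Sequence[int]] = None,
    inputs_embeds: Optional[List[Vector]] = None,
) -> List[Vector]:
    """Pattern described as buggy: scaling lives inside the input_ids branch."""
    if inputs_embeds is not None:
        hidden_states = inputs_embeds  # returned unscaled
    else:
        hidden_states = embed(input_ids)
        hidden_states = scale(hidden_states, EMBEDDING_MULTIPLIER)
    return hidden_states


def forward_after(
    input_ids: Optional[Sequence[int]] = None,
    inputs_embeds: Optional[List[Vector]] = None,
) -> List[Vector]:
    """Pattern described as the fix: scaling is dedented out of the if/else."""
    if inputs_embeds is not None:
        hidden_states = inputs_embeds
    else:
        hidden_states = embed(input_ids)
    # Applies to hidden_states regardless of which branch produced it.
    hidden_states = scale(hidden_states, EMBEDDING_MULTIPLIER)
    return hidden_states


ids = [0, 1]
raw_embeds = embed(ids)  # what a caller would pass as inputs_embeds

# Before the move, the two input paths disagree; after it, they match.
paths_agree_before = forward_before(input_ids=ids) == forward_before(inputs_embeds=raw_embeds)
paths_agree_after = forward_after(input_ids=ids) == forward_after(inputs_embeds=raw_embeds)
```

In this toy setup the two paths only agree once the scaling sits outside the branch, which is the parity the fix aims for in the real model.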

Changes

  • vllm/model_executor/models/granitemoehybrid.py: 1 line moved (dedented)

…eHybrid

When inputs_embeds is provided to GraniteMoeHybridModel.forward(),
the embedding_multiplier scaling was skipped, causing garbage output.
This aligns the behavior with granite.py and granitemoe.py which
correctly apply the multiplier regardless of the embedding source.

Fixes vllm-project#34812
@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

@dosubot

dosubot bot commented Feb 21, 2026

Related Documentation

Checked 0 published document(s) in 1 knowledge base(s). No updates required.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request addresses a bug in GraniteMoeHybridModel.forward() where embedding_multiplier was not applied to inputs_embeds. The fix correctly moves the scaling operation outside the conditional branch, ensuring it applies to hidden_states regardless of its origin. This is a critical fix for correctness when using input embeddings.

@DarkLight1337
Member

Thanks, but there is already a PR #34813

@nightcityblade
Author

Closing as duplicate — there's already an existing PR at #34813 addressing this. Thanks for pointing that out!

Development

Successfully merging this pull request may close these issues.

[Bug]: GraniteMoeHybridModel not applying embedding_multiplier to input embeddings

2 participants