-
Notifications
You must be signed in to change notification settings - Fork 468
[Feat]mtp aclgraph support #3244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This PR adds ACLGraph support for the MTP model. The changes are correct in principle, but they expose a pre-existing critical bug in the speculative decoding loop in mtp_proposer.py
. The batch_descriptor
and other graph-related parameters are not updated across loop iterations, which will lead to incorrect model outputs when using more than one speculative token. I've left a comment with details on the issue and a suggested fix.
68201ac
to
fd1994e
Compare
self.drafter.dummy_run( | ||
num_tokens=num_tokens, | ||
with_prefill=with_prefill, | ||
skip_attn=True, | ||
num_reqs=num_reqs, | ||
num_tokens_across_dp=num_tokens_across_dp) | ||
num_tokens_across_dp=num_tokens_across_dp, | ||
aclgraph_runtime_mode=aclgraph_runtime_mode, | ||
batch_descriptor=batch_descriptor) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember to change the signature of dummy_run
in class Proposer
and all of its sub-classes
Signed-off-by: anon189Ty <[email protected]>
fd1994e
to
58959a1
Compare
What this PR does / why we need it?
Currently, MTP Model in deepseek can not be capture in ACLGraph. This PR is use to allow MTP to be captured in ACLGraph mode.
Does this PR introduce any user-facing change?
How was this patch tested?