[AutoDeploy]: Spec Dec Smoke Tests #9919

@govind-ramnarayan

Description

🚀 The feature, motivation and pitch

AutoDeploy currently has no good way to write smoke tests for speculative decoding, because nothing prevents the draft model (whether a full-fledged draft model or an Eagle drafter) from loading its weights. The draft weights are small, but skipping them would still keep the smoke tests lightweight.

It would be good to have lightweight smoke tests for this. A first step is to check whether skipping weight loading for the draft model is easy to support.
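To make the request concrete, here is a minimal sketch of the kind of smoke test this would enable. Every name in it (`ToyModel`, `build_model`, the `skip_loading_weights` flag, `fake_loader`) is hypothetical and chosen for illustration; none of it is AutoDeploy's actual API. The idea is only that both the target and the draft model get built with placeholder weights, and the test asserts that the checkpoint loader was never called.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyModel:
    """Stand-in for a target or draft model (hypothetical, not an AutoDeploy class)."""

    name: str
    weights: Dict[str, List[float]] = field(default_factory=dict)
    weights_loaded: bool = False


def build_model(
    name: str,
    load_checkpoint: Callable[[str], Dict[str, List[float]]],
    skip_loading_weights: bool = False,
) -> ToyModel:
    """Build a model, optionally skipping the checkpoint load (hypothetical flag)."""
    model = ToyModel(name=name)
    if skip_loading_weights:
        # Placeholder weights keep the structure valid without touching a checkpoint.
        model.weights = {"proj": [0.0] * 4}
        return model
    model.weights = load_checkpoint(name)
    model.weights_loaded = True
    return model


def smoke_test_spec_dec():
    """Build target and draft without weights and record any loader calls."""
    loader_calls: List[str] = []

    def fake_loader(name: str) -> Dict[str, List[float]]:
        # Records the call so the test can detect an unwanted checkpoint load.
        loader_calls.append(name)
        return {"proj": [1.0] * 4}

    target = build_model("target", fake_loader, skip_loading_weights=True)
    draft = build_model("draft", fake_loader, skip_loading_weights=True)
    return target, draft, loader_calls


if __name__ == "__main__":
    target, draft, loader_calls = smoke_test_spec_dec()
    assert loader_calls == [], "no checkpoint should be read in a smoke test"
    assert not target.weights_loaded and not draft.weights_loaded
```

In this sketch the flag applies to the draft model the same way it applies to the target; the open question in this issue is whether the real draft-model path can honor such a flag as easily.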

Alternatives

Since drafters are small, especially Eagle checkpoints, we could load them with their real weights and accept that cost in the smoke tests. The target model weights, at least, would not be loaded.

Additional context

No response

Metadata

Labels

  • AutoDeploy<NV>: AutoDeploy Backend
  • Speculative Decoding<NV>: MTP/Eagle/Medusa/Lookahead/Prompt-Lookup-Decoding/Draft-Target-Model/ReDrafter
