Closed
Labels
- Disaggregated serving: <NV>Deploying with separated, distributed components (params, kv-cache, compute). Arch & perf.
- feature request: New feature or request. This includes new model, dtype, functionality support
Description
🚀 The feature, motivation and pitch
See bug description:
https://nvbugspro.nvidia.com/bug/5680312
#9379
Alternatives
No response
Additional context
No response
Before submitting a new issue...
- Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.