Open
Labels
enhancement: New feature or request
Description
Prerequisites
- I am running the latest code. Mention the version if possible as well.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
Add support for DeepSeek-V3.2-Exp, the sparse-attention model.
Motivation
As I understand it, DeepSeek-V3.2-Exp is an experimental model that uses sparse attention for better efficiency. I'm wondering what that efficiency gain would translate to in a hybrid CPU-GPU deployment. As far as I know, there is currently no way to run DeepSeek-V3.2-Exp on a setup that pairs a CPU with plenty of RAM and a GPU with limited VRAM. If this is feasible, I think it would be interesting to support.
Possible Implementation
No response