Currently we do not copy the `scattermoe` kernels into this repository, so this requires an additional manual install:
```
# this will install the kernel-hyperdrive fork with the scattermoe triton kernels
pip install -r requirements-khd.txt
```
### Known Issues
These are some currently known issues that are not yet resolved:
- The design currently swaps the mixture-of-experts module with [ScatterMoE](./src/fms_acceleration_moe/utils/scattermoe.py). This changes the `state_dict` of the model, so any saved checkpoint may need to be converted back to the original format.
- The dependency on the external `kernel-hyperdrive` repository should eventually be removed.
- Only loading *sharded* `safetensor` non-GGUF MoE checkpoints is currently supported. This is a reasonable assumption since MoE checkpoints are typically above the size limit that prevents them from being saved into a single checkpoint file (a minimal check for this layout is sketched below).
- Currently only `StateDictType.SHARDED_STATE_DICT` is supported, because the implementation uses `DTensors`, which have limited support for full state dicts. In any case, sharded state dicts are the most efficient option for saving (see the second sketch below).
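
For the sharded-checkpoint point above, a sharded `safetensor` checkpoint in the standard Hugging Face layout is identified by a `model.safetensors.index.json` file that maps every parameter name to one of the `*.safetensors` shard files. The following is a minimal sketch (not part of this plugin; the helper name is our own) of how such a layout can be detected:

```
import json
import os


def is_sharded_safetensors_checkpoint(checkpoint_dir: str) -> bool:
    """Return True if `checkpoint_dir` holds a sharded safetensors checkpoint."""
    # Hugging Face convention: sharded safetensors checkpoints ship an index
    # file that maps every parameter name to one of the *.safetensors shards.
    index_file = os.path.join(checkpoint_dir, "model.safetensors.index.json")
    if not os.path.isfile(index_file):
        return False  # single-file or non-safetensors checkpoint
    with open(index_file) as f:
        weight_map = json.load(f)["weight_map"]
    return all(shard.endswith(".safetensors") for shard in weight_map.values())
```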
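
For the state-dict point, the sketch below illustrates, using stock PyTorch FSDP APIs rather than this plugin's code, how a model that has already been wrapped with FSDP inside an initialized distributed process group would be switched to sharded state dicts before saving; `save_sharded_state_dict` is a hypothetical helper name used only for illustration:

```
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp import ShardedStateDictConfig, StateDictType


def save_sharded_state_dict(model: FSDP) -> dict:
    """Return a sharded state dict from an already FSDP-wrapped model."""
    FSDP.set_state_dict_type(
        model,
        StateDictType.SHARDED_STATE_DICT,
        state_dict_config=ShardedStateDictConfig(offload_to_cpu=True),
    )
    # Each rank only materializes its own shards (ShardedTensor / DTensor),
    # instead of gathering the full weights onto a single device.
    return model.state_dict()
```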