Commit 5b35eae

Add ExpertParallel Mixture-of-Experts Plugin (#99)
* initial commit
* include prepare_scattermoe
* fixes and add scenarios-moe; allow gradient_accum=null mode
* missed out on CONTENTS.yaml
* update readme, code cleanup, add comments and initial bench
* more cleanup and update pf bench
* add more comments and minor refactoring
* finish up comments
* add padding free to granite moe
* fmt and lint
* install workflow + more fmt + fix test
* go back to dtensors for sharded checkpoints
* add scattermoe checkpoint restorer utility
* fmt + lint
* more cleanup
* improved documentation on state dict inference
* add more tests on inferring checkpoint metadata
* update configs for mixtral
* update granite configs
* fix readme and update GraniteMoE to FOAK
* commit benches

Signed-off-by: Yu Chin Fabian Lim <[email protected]>
1 parent d767e33 commit 5b35eae
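The commit message above touches on expert parallelism (splitting a model's experts across devices) and ScatterMoE-style kernels. As a purely illustrative aid, and not the plugin's actual code or API, the sketch below shows in plain PyTorch how a fused expert weight tensor could be partitioned so that each expert-parallel rank owns a contiguous slice of the experts; the function name `shard_expert_weights` and all shapes are invented for this example.

```python
# Hypothetical illustration only: not code from the accelerated-moe plugin.
# Each expert-parallel rank keeps a contiguous slice of the experts, so no
# single device has to hold every expert's weights.
import torch


def shard_expert_weights(w: torch.Tensor, rank: int, world_size: int) -> torch.Tensor:
    """Slice a fused expert weight of shape (num_experts, d_in, d_out)
    down to the experts owned by `rank` (assumes even divisibility)."""
    num_experts = w.shape[0]
    assert num_experts % world_size == 0, "experts must divide evenly across ranks"
    per_rank = num_experts // world_size
    return w[rank * per_rank : (rank + 1) * per_rank]


# Example: 8 experts split across an expert-parallel group of 4 ranks.
full = torch.randn(8, 16, 32)
for r in range(4):
    print(r, shard_expert_weights(full, rank=r, world_size=4).shape)
    # each rank ends up holding 2 of the 8 experts: torch.Size([2, 16, 32])
```

In the actual plugin the per-rank slices would additionally need distributed checkpointing support (the commit mentions returning to DTensors for sharded checkpoints); that machinery is omitted here.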

File tree

52 files changed: +4658 −12 lines


.github/workflows/build-and-publish.yml

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ jobs:
         - "accelerated-peft"
         - "fused-ops-and-kernels"
         - "attention-and-distributed-packing"
+        - "accelerated-moe"
 
     permissions:
       id-token: write # IMPORTANT: this permission is mandatory for trusted publishing

.github/workflows/format.yml

Lines changed: 1 addition & 0 deletions
@@ -30,6 +30,7 @@ jobs:
         - "accelerated-peft"
         - "fused-ops-and-kernels"
         - "attention-and-distributed-packing"
+        - "accelerated-moe"
 
     steps:
       - uses: actions/checkout@v4

README.md

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ Plugin | Description | Depends | License | Status
 [accelerated-peft](./plugins/accelerated-peft/README.md) | For PEFT-training, e.g., 4bit QLoRA. | Huggingface<br>AutoGPTQ | Apache 2.0<br>MIT | Alpha
 [fused-op-and-kernels](./plugins/fused-ops-and-kernels/README.md) | Fused LoRA and triton kernels (e.g., fast cross-entropy, rms, rope) | -- | Apache 2.0 [(contains extracted code)](./plugins/fused-ops-and-kernels/README.md#code-extracted-from-unsloth)| Beta
 [attention-and-distributed-packing](./plugins/attention-and-distributed-packing/README.md) | Padding-Free Flash Attention Computation | flash-attn | Apache 2.0 | Beta
-MOE-training-acceleration | [MegaBlocks](https://github.com/databricks/megablocks) inspired triton Kernels and acclerations for Mixture-of-Expert models | | Apache 2.0 | Coming Soon
+[accelerated-moe](./plugins/accelerated-moe/README.md) | Triton Kernels for Mixture-of-Expert parallel, inspired by [ScatterMoe](https://github.com/shawntan/scattermoe) and [MegaBlocks](https://github.com/databricks/megablocks) | | Apache 2.0 | Beta
 
 ## Usage with FMS HF Tuning
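The new README row credits ScatterMoE and MegaBlocks, whose key idea is to avoid padded per-expert batches: tokens are grouped by the expert they were routed to, and each expert runs one un-padded matmul over exactly its tokens. Below is a dense, single-device PyTorch reference of that grouping, written only to clarify the computation that the plugin's Triton kernels accelerate; it is not the plugin's implementation, and all names are illustrative.

```python
# Plain PyTorch reference of grouping tokens by routed expert; the fused
# Triton kernels replace this Python loop, which is shown only for clarity.
import torch


def grouped_moe_forward(x, expert_ids, expert_weights):
    """x: (tokens, d_in), expert_ids: (tokens,) top-1 routing decisions,
    expert_weights: (num_experts, d_in, d_out)."""
    num_experts = expert_weights.shape[0]
    order = torch.argsort(expert_ids)          # tokens of the same expert become contiguous
    counts = torch.bincount(expert_ids, minlength=num_experts)
    out = torch.empty(x.shape[0], expert_weights.shape[-1], dtype=x.dtype)
    start = 0
    for e in range(num_experts):
        n = int(counts[e])
        if n == 0:
            continue
        idx = order[start : start + n]         # token positions routed to expert e
        out[idx] = x[idx] @ expert_weights[e]  # one un-padded matmul per expert
        start += n
    return out


x = torch.randn(10, 16)
routes = torch.randint(0, 4, (10,))
w = torch.randn(4, 16, 32)
print(grouped_moe_forward(x, routes, w).shape)  # torch.Size([10, 32])
```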

plugins/accelerated-moe/.isort.cfg

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+[settings]
+profile=black
+from_first=true
+import_heading_future=Future
+import_heading_stdlib=Standard
+import_heading_thirdparty=Third Party
+import_heading_firstparty=First Party
+import_heading_localfolder=Local
+known_firstparty=
+known_localfolder=tuning
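For reference, with `profile=black`, `from_first=true`, and the heading settings in this config, isort lays out imports in labelled groups, placing `from` imports ahead of plain imports within each group and classifying `tuning` as a local folder. A small made-up example of the resulting layout (module names chosen only for illustration):

```python
# Standard
from pathlib import Path
import os

# Third Party
from transformers import AutoModelForCausalLM
import torch

# Local
from tuning.utils import some_helper  # hypothetical module, shown for layout only
```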
