Support for custom layer sharding #9882
Unanswered
DavidPeleg6 asked this question in DDP / multi-GPU / multi-node
Replies: 1 comment
-
Dear @DavidPeleg6, Yes, in theory it should be possible. The TorchSharded API is still experimental, so expect some changes. Here is the MoE implementation from DeepSpeed: https://github.com/microsoft/DeepSpeed/blob/9f5939d2a7bcdd2953d52a0baf09ede485221a81/deepspeed/moe/layer.py#L18. It is a good reference for how they actually implement MoE :) Best,
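For a quick intuition of what that file does, here is a toy single-process sketch of top-1 expert routing. This is illustrative only, not the DeepSpeed API; the class name, sizes, and expert module are placeholders:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyTop1MoE(nn.Module):
    """Single-process toy of top-1 expert routing, for illustration only."""

    def __init__(self, hidden_size: int, num_experts: int):
        super().__init__()
        # The gate scores each token against every expert.
        self.gate = nn.Linear(hidden_size, num_experts)
        self.experts = nn.ModuleList(
            [
                nn.Sequential(
                    nn.Linear(hidden_size, 4 * hidden_size),
                    nn.ReLU(),
                    nn.Linear(4 * hidden_size, hidden_size),
                )
                for _ in range(num_experts)
            ]
        )

    def forward(self, x):  # x: (num_tokens, hidden_size)
        probs = F.softmax(self.gate(x), dim=-1)   # (num_tokens, num_experts)
        top1 = probs.argmax(dim=-1)               # chosen expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top1 == i
            if mask.any():
                # Dispatch only the selected tokens to expert i and weight the
                # result by the gate probability so the gate receives gradients.
                out[mask] = expert(x[mask]) * probs[mask, i].unsqueeze(-1)
        return out

moe = ToyTop1MoE(hidden_size=32, num_experts=4)
print(moe(torch.randn(10, 32)).shape)  # torch.Size([10, 32])
```

The linked DeepSpeed implementation additionally distributes the experts across ranks (expert parallelism), which is the part most relevant to placing different layers on different GPU subsets.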
-
Hello,
I recently noticed the inclusion of the simplified sharding API mentioned in #9375, and I wonder whether it would be possible to use this API to perform a variable number of splits on any layer.
For example, if I have a two-layer MLP and a node with 5 GPUs, would it be possible to explicitly split the first layer across gpus[0,1] and the second layer across gpus[2,3,4]?
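To make the example concrete, here is a rough plain-PyTorch sketch of the placement I have in mind. It uses hand-written device placement rather than the sharding API, with arbitrary sizes, and assumes a node with at least 5 GPUs:

```python
import torch
import torch.nn as nn

class ColumnShardedLinear(nn.Module):
    """Split a Linear's output features across an explicit list of GPUs."""

    def __init__(self, in_features: int, out_features: int, devices):
        super().__init__()
        assert out_features % len(devices) == 0
        shard = out_features // len(devices)
        self.devices = devices
        self.shards = nn.ModuleList(
            [nn.Linear(in_features, shard).to(d) for d in devices]
        )

    def forward(self, x):
        # Run each shard on its own GPU, gather the partial outputs on the
        # first listed device, and concatenate along the feature dimension.
        outs = [s(x.to(d)) for s, d in zip(self.shards, self.devices)]
        return torch.cat([o.to(self.devices[0]) for o in outs], dim=-1)


class TwoLayerMLP(nn.Module):
    def __init__(self):
        super().__init__()
        # First layer on gpus[0, 1], second layer on gpus[2, 3, 4].
        self.fc1 = ColumnShardedLinear(128, 256, ["cuda:0", "cuda:1"])
        self.fc2 = ColumnShardedLinear(256, 96, ["cuda:2", "cuda:3", "cuda:4"])

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


model = TwoLayerMLP()
y = model(torch.randn(8, 128, device="cuda:0"))
print(y.shape, y.device)  # torch.Size([8, 96]) cuda:2
```

The question is whether the new API can express this kind of uneven, per-layer device assignment without writing the placement by hand as above.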