[model parallelism] Is it possible to pre-shard JAX arrays without relying on annotation `with_sharding_constraint`? #8597

sudhakarsingh27 · 2021-11-18T19:46:07Z

Currently, users need to provide JAX with sharding information via `with_sharding_constraint` to achieve model parallelism. JAX can do the sharding automatically, but what if we want to shard the arrays manually a priori, so that JAX doesn't spend time sharding them at runtime (at least the first time)?

Is it possible to do so?

I think this would basically mean that we pre-shard the arrays and pass that information through something like the `with_sharding_constraint` API, so that JAX doesn't have to do the sharding itself. Taking it a step further, could we also pre-shard arrays during the model run (not just the first time), separately from JAX, and still provide the sharding information so that JAX can add any necessary communication primitives without worrying about the sharding? I'm not sure how this would be compatible with `jit`, though. I would like to hear any thoughts on this as well.
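To make the idea concrete, here is a rough sketch of what pre-sharding could look like. It assumes a JAX release newer than this post, one that ships the `jax.sharding` module (`Mesh`, `NamedSharding`, `PartitionSpec`) and accepts a sharding in `jax.device_put`; none of these are mentioned above. The mesh axis name `"model"`, the array shapes, and the `apply` function are made up for illustration.

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a 1D device mesh over whatever devices are available.
# The axis name "model" is a hypothetical choice for this sketch.
devices = np.array(jax.devices())
n_devices = len(devices)
mesh = Mesh(devices, axis_names=("model",))

# Split rows across the "model" axis; keep columns whole on every device.
sharding = NamedSharding(mesh, P("model", None))

# Host-side weights whose row count is divisible by the device count.
host_weights = np.arange(n_devices * 4 * 3, dtype=np.float32).reshape(n_devices * 4, 3)

# Place the array on the devices ahead of time with the chosen layout.
weights = jax.device_put(host_weights, sharding)

@jax.jit
def apply(w, x):
    # No with_sharding_constraint here: the input already carries its layout.
    return w @ x

x = jnp.ones((3,), dtype=jnp.float32)
out = apply(weights, x)

# Each device holds one (4, 3) block of rows.
for shard in weights.addressable_shards:
    print(shard.device, shard.data.shape)
```

In this sketch the jitted function receives an array that is already laid out across the mesh, so the placement cost is paid once in `device_put` rather than inside the computation. Whether this removes all resharding at runtime depends on the layouts the compiler chooses for intermediate values.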

Thanks!


Replies: 0 comments
