-
Thanks for the question! Just to make sure I understand: did you mean "Basically, the sharding pattern …"?
-
Suppose I have the following line within a function:

where x is a tensor of shape (sequence_length, dim). This applies a constraint telling the GSPMD partitioner to shard the tensor along the hidden dimension and not along the sequence dimension. However, suppose this function is then vmapped to add a batch dimension, which we also want to shard across 'data'. Basically, the sharding pattern ('mesh', None, 'tensor'). How can we apply this sharding constraint to the vmapped function?
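A minimal sketch of one way this is often handled, assuming a 2D mesh with axes named 'data' and 'tensor': the constraint line inside f is a reconstruction of what the question seems to describe (the original snippet is not shown above), and the use of jax.vmap's spmd_axis_name argument is a suggestion under that assumption, not something stated in the thread.

```python
# Sketch only: the mesh layout, axis names, and the reconstructed constraint
# line are assumptions for illustration.
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Hypothetical 2D mesh: 'data' for the batch axis, 'tensor' for the hidden axis.
# Adjust the reshape to the real device topology.
devices = np.array(jax.devices()).reshape(1, -1)
mesh = Mesh(devices, axis_names=('data', 'tensor'))

def f(x):
    # x has shape (sequence_length, dim); constrain only the hidden dimension,
    # i.e. the per-example pattern (None, 'tensor') described in the question.
    x = jax.lax.with_sharding_constraint(x, NamedSharding(mesh, P(None, 'tensor')))
    return x * 2.0

# spmd_axis_name tells vmap which mesh axis the new batch dimension should be
# constrained to, so the effective pattern becomes ('data', None, 'tensor').
batched_f = jax.vmap(f, spmd_axis_name='data')

x = jnp.ones((8, 128, 512))  # (batch, sequence_length, dim)
y = jax.jit(batched_f)(x)
```

If spmd_axis_name is not available in the JAX version in use, an alternative is to drop the constraint inside the function and instead apply the full three-axis constraint to the output of the vmapped function.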