felixwqp
left a comment
Can I implicitly assume that this will only work for shard_map-based sharding, like how ragged_all_to_all is used?
I'm not sure. I haven't thought that far ahead. When I support these async collectives in JAX, though, I do plan on only supporting shard_map at first.
89da850 to
3926874
Would this RFC extend to async dynamic-slice/dynamic-update-slice?
This should naturally extend to these ops. I'm OK with bringing them in scope under the umbrella of "known ops that we want to have an async decomposition by a backend".
This RFC introduces an `async_start` op and an `async_done` op that allow you to
run a collective asynchronously. We also introduce a new future type (e.g.,
`future<tensor<2xf32>>`) to represent the output of a start operation.
Can we add some details along these lines: "In the future we are likely to consider adding scheduling dependencies between async ops and other ops to enforce an execution ordering, but in the meantime async ops are used to denote that a backend should use an async decomposition for a given op."?
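For readers less familiar with the start/done pattern in the quoted RFC text, here is a rough Python analogy. It is only a sketch: the function names `async_start`, `async_done`, and `all_reduce_sum` below are hypothetical stand-ins built on the standard-library `concurrent.futures` module, not the RFC's ops or any JAX/XLA API. It shows a start call returning a future right away, unrelated work running in between, and a done call unwrapping the result.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Hypothetical stand-in for a backend that can run one op in the background.
_pool = ThreadPoolExecutor(max_workers=1)


def async_start(op, *operands) -> Future:
    """Analogue of `async_start`: launch `op` and return a future immediately."""
    return _pool.submit(op, *operands)


def async_done(fut: Future):
    """Analogue of `async_done`: wait for the op to finish and unwrap its value."""
    return fut.result()


def all_reduce_sum(shards):
    # Toy collective: every participant receives the elementwise sum of all shards.
    total = [sum(vals) for vals in zip(*shards)]
    return [list(total) for _ in shards]


shards = [[1.0, 2.0], [3.0, 4.0]]

fut = async_start(all_reduce_sum, shards)  # plays the role of the future-typed value
local = [x * 2 for x in shards[0]]         # independent work that may overlap with the op
reduced = async_done(fut)                  # the plain result is available only after this

_pool.shutdown()
```

Note that nothing in this sketch orders `local` relative to the collective; the start/done pair only marks which op runs asynchronously, which matches the "in the meantime" reading suggested in the comment above.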