33Status: In Review<br />
44Initial version: 02/09/2026<br />
55Last updated: 02/23/2026<br />
6- Discussion thread: https://github.com/openxla/stablehlo/pull/2897/changes
6+ Discussion thread: [here][discussion_thread]
77
88## Motivation
99
@@ -60,7 +60,7 @@ arguments. The op also has a single region that must contain only a call to one
6060of the six collective ops. `async_start` returns a variadic number of futures.
6161Here's an example:
6262
63- ```
63+ ``` text
6464"stablehlo.async_start"(%x) ({
6565 %y = "stablehlo.all_gather"(%x) {
6666 all_gather_dim = 1 : i64,
@@ -76,20 +76,20 @@ collective.
7676We also introduce an `async_done` op which takes a variadic number of futures
7777and unwraps them. Here's an example.
7878
79- ```
79+ ``` text
8080"stablehlo.async_done"(%f1, %f2) : (!stablehlo.future<tensor<4x4xf32>>, !stablehlo.future<tensor<2xf32>>) -> (tensor<4x4xf32>, tensor<2xf32>)
8181```
8282
8383## Alternatives
8484
8585### Fully Generic Async Ops
8686
87- https://github.com/openxla/stablehlo/pull/2551 is a StableHLO RFC that proposes
88- adding generic `async_start` and `async_done` ops that can be used to call *any*
89- function asynchronously, not just collectives. Here's an example from the RFC
90- that performs an asynchronous add:
87+ [This][generic_async_rfc] is a StableHLO RFC that proposes adding generic
88+ `async_start` and `async_done` ops that can be used to call *any* function
89+ asynchronously, not just collectives. Here's an example from the RFC that
90+ performs an asynchronous add:
9191
92- ```
92+ ``` text
9393// %init_i: 2
9494// %init_sum: 3
9595%future = "stablehlo.async_start"(
@@ -123,7 +123,7 @@ These ops are identical to their non-asynchronous counterparts. They take the
123123same arguments and have the same constraints. The only difference is that they
124124return futures. Here's an example:
125125
126- ```
126+ ``` text
127127%future = "stablehlo.collective_permute_start"(%operand) {
128128 source_target_pairs = dense<[[0, 1], [1, 2]]> : tensor<2x2xi64>,
129129 channel_handle = #stablehlo.channel_handle<handle = 0, type = 0>
@@ -142,7 +142,7 @@ We could also introduce six **done ops**.
142142A done op takes a `future<T>` as an argument and returns a `T`. Continuing the
143143example above:
144144
145- ```
145+ ``` text
146146%result = "stablehlo.collective_permute_done"(%future) : (future<tensor<2x2xi64>>) -> tensor<2x2xi64>
147147```
148148
@@ -152,7 +152,7 @@ Start ops could return regular tensors instead of futures. The value of these
152152tensors, however, would be indeterminate. The tensors should not be used in any
153153way besides as arguments to done ops. Here's an example:
154154
155- ```
155+ ``` text
156156%indeterminate = "stablehlo.collective_permute_start"(%operand) {
157157 source_target_pairs = dense<[[0, 1], [1, 2]]> : tensor<2x2xi64>,
158158 channel_handle = #stablehlo.channel_handle<handle = 0, type = 0>
@@ -168,7 +168,7 @@ introducing a new future type. However, it is less type-safe.
168168This RFC has every collective return the same future type. Thus, the following
169169code is well-typed but erroneous.
170170
171- ```
171+ ``` text
172172%future = "stablehlo.collective_permute_start"(%operand) {
173173 source_target_pairs = dense<[[0, 1], [1, 2]]> : tensor<2x2xi64>,
174174 channel_handle = #stablehlo.channel_handle<handle = 0, type = 0>
@@ -185,3 +185,5 @@ This would introduce more type safety.
185185
186186[async_collective_creator]: https://github.com/openxla/xla/blob/391c1c5fdadde89ee81886495d32dc32f9238af1/xla/hlo/transforms/collectives/async_collective_creator.h#L38
187187[async_hlo]: https://openxla.org/xla/async_ops
188+ [discussion_thread]: https://github.com/openxla/stablehlo/pull/2897/changes
189+ [generic_async_rfc]: https://github.com/openxla/stablehlo/pull/2551