[mlir][NVVM] Add support for few more fence Ops (llvm#170251)
This commit adds support for the following fence Ops:
- fence.sync_restrict
- fence.proxy.sync_restrict
The commit also moves memory.barrier into the Membar/Fence section, migrates fence.mbarrier.init to intrinsics, and consolidates fence-related tests under nvvm/fence.mlir and nvvm/fence-invalid.mlir.
+ `membar` operation guarantees that prior memory accesses requested by this
+ thread are performed at the specified `scope`, before later memory
+ operations requested by this thread following the membar instruction.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-membar)
+ let summary = "Uni-directional thread fence operation";
+ let description = [{
+ The `nvvm.fence.sync_restrict` Op restricts the class of memory
+ operations for which the fence instruction provides the memory ordering guarantees.
+ `sync_restrict` restricts `acquire` memory semantics to `shared_cluster` and
+ `release` memory semantics to `shared_cta` with cluster scope.
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-membar)
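The pairing described above (each ordering tied to one shared-memory space, always at cluster scope) can be written as a small lookup. The Python sketch below is illustrative only and is not part of the commit. The helper name `sync_restrict_fence_ptx` is hypothetical, and the PTX spelling it emits is an assumption based on the linked PTX ISA documentation, so check it against the ISA before relying on it.

```python
# Hypothetical helper: not part of the NVVM dialect or of this commit.
# Memory space that each ordering is restricted to, per the Op description.
SYNC_RESTRICT_SPACE = {
    "acquire": "shared::cluster",  # `shared_cluster` in the Op description
    "release": "shared::cta",      # `shared_cta` in the Op description
}


def sync_restrict_fence_ptx(order: str) -> str:
    """Return the assumed PTX mnemonic for a sync_restrict fence."""
    if order not in SYNC_RESTRICT_SPACE:
        # This sketch only models the two orderings named in the description.
        raise ValueError(f"expected 'acquire' or 'release', got {order!r}")
    space = SYNC_RESTRICT_SPACE[order]
    # The description fixes the scope to cluster for both orderings.
    return f"fence.{order}.sync_restrict::{space}.cluster"
```

Calling `sync_restrict_fence_ptx("acquire")` yields a string that names the `shared::cluster` space, while `"release"` names `shared::cta`; any other ordering is rejected.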
+ Fence operation that applies on the prior nvvm.mbarrier.init
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-membar)
- `membar` operation guarantees that prior memory accesses requested by this
- thread are performed at the specified `scope`, before later memory
- operations requested by this thread following the membar instruction.
-
- [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-membar)
+ let summary = "Uni-directional proxy fence operation with sync_restrict";
+ let description = [{
+ The `nvvm.fence.proxy.sync_restrict` Op used to establish
+ ordering between a prior memory access performed between proxies. Currently,
+ the ordering is only supported between async and generic proxies. `sync_restrict`
+ restricts `acquire` memory semantics to `shared_cluster` and `release` memory
+ semantics to `shared_cta` with cluster scope.
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-membar)
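The description above states two constraints: only the async/generic proxy pair is supported, and the memory order determines the restricted space. A minimal Python sketch of a check for those constraints follows. It is an illustration, not the dialect's actual verifier; the function name, parameter names, and string values are hypothetical, and the proxy pair is treated as unordered because the description does not state a direction.

```python
# Hypothetical model of the constraints stated above; not the dialect's verifier.
SUPPORTED_PROXY_PAIR = frozenset({"async", "generic"})
RESTRICTED_SPACE = {"acquire": "shared_cluster", "release": "shared_cta"}


def check_proxy_sync_restrict(order, proxy_a="async", proxy_b="generic"):
    """Return (memory space, scope) for a valid combination, else raise ValueError."""
    # Only the async/generic pair is described as supported.
    if frozenset({proxy_a, proxy_b}) != SUPPORTED_PROXY_PAIR:
        raise ValueError(f"unsupported proxy pair: {proxy_a!r}, {proxy_b!r}")
    if order not in RESTRICTED_SPACE:
        raise ValueError(f"unsupported memory order: {order!r}")
    # The scope is cluster for both orderings.
    return RESTRICTED_SPACE[order], "cluster"
```

A table-driven check like this keeps the order-to-space rule in one place, so extending it would mean adding an entry rather than a new branch, should further proxy pairs be supported later.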
- Fence operation that applies on the prior nvvm.mbarrier.init
-
- [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-membar)