[MLIR][NVVM] [NFC] Add summary and description fields for several OPs #156726
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using `@` followed by their GitHub username. If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.
@llvm/pr-subscribers-mlir-llvm @llvm/pr-subscribers-mlir
Author: Stefan (smada3)
Changes
Several operations in the NVVM dialect were missing summaries and descriptions. This PR adds summaries and descriptions for the following operations:
Documentation available here: mbarrier, barrier
Patch is 21.84 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/156726.diff
1 Files Affected:
diff --git a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
index 9d93b4efe7a5b..5338bb2336ed9 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
@@ -494,6 +494,26 @@ def NVVM_PMEventOp : NVVM_PTXBuilder_Op<"pmevent">,
/// mbarrier.init instruction with generic pointer type
def NVVM_MBarrierInitOp : NVVM_PTXBuilder_Op<"mbarrier.init">,
Arguments<(ins LLVM_AnyPointer:$addr, I32:$count, PtxPredicate:$predicate)> {
+ let summary = "MBarrier Initialization Op";
+ let description = [{
+ The `nvvm.mbarrier.init` operation initializes an mbarrier object at the specified
+ memory location.
+
+ This operation initializes the mbarrier object with the following state:
+ - Current phase: 0
+ - Expected arrival count: `count`
+ - Pending arrival count: `count`
+ - Transaction count (tx-count): 0
+
+ The operation takes the following operands:
+ - `addr`: A pointer to the memory location of the mbarrier object. Uses generic
+ addressing, but the address must still be in the shared memory space.
+ - `count`: Integer specifying the number of threads that will participate in barrier
+ synchronization. Must be in the range [1, 2²⁰ - 1].
+ - `predicate`: Optional predicate for conditional execution.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-init)
+ }];
string llvmBuilder = [{
createIntrinsicCall(builder, llvm::Intrinsic::nvvm_mbarrier_init, {$addr, $count});
}];
@@ -509,6 +529,13 @@ def NVVM_MBarrierInitOp : NVVM_PTXBuilder_Op<"mbarrier.init">,
/// mbarrier.init instruction with shared pointer type
def NVVM_MBarrierInitSharedOp : NVVM_PTXBuilder_Op<"mbarrier.init.shared", [NVVMRequiresSM<80>, DeclareOpInterfaceMethods<BasicPtxBuilderOpInterface>]>,
Arguments<(ins LLVM_PointerShared:$addr, I32:$count, PtxPredicate:$predicate)> {
+ let summary = "Shared MBarrier Initialization Op";
+ let description = [{
+ This Op is the same as the generic memory variant except that the mbarrier object
+ should be accessed using a shared-memory pointer instead of a generic-memory pointer.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-init)
+ }];
string llvmBuilder = [{
createIntrinsicCall(builder, llvm::Intrinsic::nvvm_mbarrier_init_shared, {$addr, $count});
}];
@@ -521,6 +548,21 @@ def NVVM_MBarrierInitSharedOp : NVVM_PTXBuilder_Op<"mbarrier.init.shared", [NVVM
def NVVM_MBarrierInvalOp : NVVM_Op<"mbarrier.inval">,
Arguments<(ins LLVM_AnyPointer:$addr)> {
+ let summary = "MBarrier Invalidation Operation";
+ let description = [{
+ The `nvvm.mbarrier.inval` operation invalidates an mbarrier object at the
+ specified memory location.
+
+ This operation marks the mbarrier object as invalid, making it safe to repurpose
+ the memory location for other uses or to reinitialize it as a new mbarrier object.
+ It is undefined behavior if the mbarrier object is already invalid.
+
+ The operation takes the following operand:
+ - `addr`: A pointer to the memory location of the mbarrier object. Uses generic
+ addressing, but the address must still be in the shared memory space.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-inval)
+ }];
string llvmBuilder = [{
createIntrinsicCall(builder, llvm::Intrinsic::nvvm_mbarrier_inval, {$addr});
}];
@@ -529,6 +571,13 @@ def NVVM_MBarrierInvalOp : NVVM_Op<"mbarrier.inval">,
def NVVM_MBarrierInvalSharedOp : NVVM_Op<"mbarrier.inval.shared">,
Arguments<(ins LLVM_PointerShared:$addr)> {
+ let summary = "Shared MBarrier Invalidation Operation";
+ let description = [{
+ This Op is the same as the generic memory variant except that the mbarrier object
+ should be accessed using a shared-memory pointer instead of a generic-memory pointer.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-inval)
+ }];
string llvmBuilder = [{
createIntrinsicCall(builder, llvm::Intrinsic::nvvm_mbarrier_inval_shared, {$addr});
}];
@@ -538,6 +587,22 @@ def NVVM_MBarrierInvalSharedOp : NVVM_Op<"mbarrier.inval.shared">,
def NVVM_MBarrierArriveOp : NVVM_Op<"mbarrier.arrive">,
Results<(outs LLVM_Type:$res)>,
Arguments<(ins LLVM_AnyPointer:$addr)> {
+ let summary = "MBarrier Arrive Operation";
+ let description = [{
+ The `nvvm.mbarrier.arrive` operation performs an arrive-on operation on the
+ mbarrier object at the specified address. Uses the default `.release.cta` semantics.
+
+ This operation causes the executing thread to signal its arrival at the barrier.
+ The operation returns an opaque 64-bit value that captures the phase of the
+ mbarrier object prior to the arrive-on operation. The contents of this state
+ value are implementation-specific.
+
+ The operation takes the following operand:
+ - `addr`: A pointer to the memory location of the mbarrier object. Uses generic
+ addressing, but the address must still be in the shared memory space.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-arrive)
+ }];
string llvmBuilder = [{
$res = createIntrinsicCall(builder, llvm::Intrinsic::nvvm_mbarrier_arrive, {$addr});
}];
@@ -547,6 +612,13 @@ def NVVM_MBarrierArriveOp : NVVM_Op<"mbarrier.arrive">,
def NVVM_MBarrierArriveSharedOp : NVVM_Op<"mbarrier.arrive.shared">,
Results<(outs LLVM_Type:$res)>,
Arguments<(ins LLVM_PointerShared:$addr)> {
+ let summary = "Shared MBarrier Arrive Operation";
+ let description = [{
+ This Op is the same as the generic memory variant except that the mbarrier object
+ should be accessed using a shared-memory pointer instead of a generic-memory pointer.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-arrive)
+ }];
string llvmBuilder = [{
$res = createIntrinsicCall(builder, llvm::Intrinsic::nvvm_mbarrier_arrive_shared, {$addr});
}];
@@ -556,6 +628,25 @@ def NVVM_MBarrierArriveSharedOp : NVVM_Op<"mbarrier.arrive.shared">,
def NVVM_MBarrierArriveNocompleteOp : NVVM_Op<"mbarrier.arrive.nocomplete">,
Results<(outs LLVM_Type:$res)>,
Arguments<(ins LLVM_AnyPointer:$addr, I32:$count)> {
+ let summary = "MBarrier Arrive No-Complete Operation";
+ let description = [{
+ The `nvvm.mbarrier.arrive.nocomplete` operation performs an arrive-on operation
+ on the mbarrier object with the guarantee that it will not cause the barrier to
+ complete its current phase. Uses the default `.release.cta` semantics.
+
+ This operation causes the executing thread to signal its arrival at the barrier
+ with a specified count, but ensures that the barrier phase will not complete as
+ a result of this operation. The operation returns an opaque 64-bit value that
+ captures the phase of the mbarrier object prior to the arrive-on operation.
+
+ The operation takes the following operands:
+ - `addr`: A pointer to the memory location of the mbarrier object. Uses generic
+ addressing, but the address must still be in the shared memory space.
+ - `count`: Integer specifying the count argument to the arrive-on operation.
+ Must be in the valid range as specified in the mbarrier object contents.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-arrive)
+ }];
string llvmBuilder = [{
$res = createIntrinsicCall(builder, llvm::Intrinsic::nvvm_mbarrier_arrive_noComplete, {$addr, $count});
}];
@@ -565,6 +656,13 @@ def NVVM_MBarrierArriveNocompleteOp : NVVM_Op<"mbarrier.arrive.nocomplete">,
def NVVM_MBarrierArriveNocompleteSharedOp : NVVM_Op<"mbarrier.arrive.nocomplete.shared">,
Results<(outs LLVM_Type:$res)>,
Arguments<(ins LLVM_PointerShared:$addr, I32:$count)> {
+ let summary = "Shared MBarrier Arrive No-Complete Operation";
+ let description = [{
+ This Op is the same as the generic memory variant except that the mbarrier object
+ should be accessed using a shared-memory pointer instead of a generic-memory pointer.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-arrive)
+ }];
string llvmBuilder = [{
$res = createIntrinsicCall(builder, llvm::Intrinsic::nvvm_mbarrier_arrive_noComplete_shared, {$addr, $count});
}];
@@ -573,6 +671,26 @@ def NVVM_MBarrierArriveNocompleteSharedOp : NVVM_Op<"mbarrier.arrive.nocomplete.
def NVVM_MBarrierArriveExpectTxOp : NVVM_PTXBuilder_Op<"mbarrier.arrive.expect_tx">,
Arguments<(ins LLVM_AnyPointer:$addr, I32:$txcount, PtxPredicate:$predicate)> {
+ let summary = "MBarrier Arrive with Expected Transaction Count";
+ let description = [{
+ The `nvvm.mbarrier.arrive.expect_tx` operation performs an expect-tx operation
+ followed by an arrive-on operation on the mbarrier object. Uses the default
+ `.release.cta` semantics.
+
+ This operation first performs an expect-tx operation with the specified transaction
+ count, then performs an arrive-on operation with an implicit count of 1. The
+ expect-tx operation updates the expected transaction count for the barrier.
+
+ The operation takes the following operands:
+ - `addr`: A pointer to the memory location of the mbarrier object. Uses generic
+ addressing, but the address must still be in the shared memory space.
+ - `txcount`: An unsigned integer specifying the expected transaction count
+ for the expect-tx operation. This represents the number of asynchronous transactions
+ expected to complete before the barrier phase completes.
+ - `predicate`: Optional predicate for conditional execution.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-arrive)
+ }];
let assemblyFormat = "$addr `,` $txcount (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)";
let extraClassDefinition = [{
std::string $cppClass::getPtx() { return std::string("mbarrier.arrive.expect_tx.b64 _, [%0], %1;"); }
@@ -580,7 +698,14 @@ def NVVM_MBarrierArriveExpectTxOp : NVVM_PTXBuilder_Op<"mbarrier.arrive.expect_t
}
def NVVM_MBarrierArriveExpectTxSharedOp : NVVM_PTXBuilder_Op<"mbarrier.arrive.expect_tx.shared">,
- Arguments<(ins LLVM_PointerShared:$addr, I32:$txcount, PtxPredicate:$predicate)> {
+ Arguments<(ins LLVM_PointerShared:$addr, I32:$txcount, PtxPredicate:$predicate)> {
+ let summary = "Shared MBarrier Arrive with Expected Transaction Count";
+ let description = [{
+ This Op is the same as the generic memory variant except that the mbarrier object
+ should be accessed using a shared-memory pointer instead of a generic-memory pointer.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-arrive)
+ }];
let assemblyFormat = "$addr `,` $txcount (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)";
let extraClassDefinition = [{
std::string $cppClass::getPtx() { return std::string("mbarrier.arrive.expect_tx.shared.b64 _, [%0], %1;"); }
@@ -588,7 +713,36 @@ def NVVM_MBarrierArriveExpectTxSharedOp : NVVM_PTXBuilder_Op<"mbarrier.arrive.ex
}
def NVVM_MBarrierTryWaitParityOp : NVVM_PTXBuilder_Op<"mbarrier.try_wait.parity">,
- Arguments<(ins LLVM_AnyPointer:$addr, I32:$phase, I32:$ticks)> {
+ Arguments<(ins LLVM_AnyPointer:$addr, I32:$phase, I32:$ticks)> {
+ let summary = "MBarrier Potentially-Blocking Try Wait with Phase Parity";
+ let description = [{
+ The `nvvm.mbarrier.try_wait.parity` operation performs a potentially-blocking
+ test for the completion of a specific phase of an mbarrier object using phase
+ parity. It uses the default `.acquire.cta` semantics.
+
+ This operation waits for the completion of the mbarrier phase indicated by the
+ phase parity. While it uses the underlying PTX `mbarrier.try_wait.parity`
+ instruction, this MLIR operation generates a loop that enforces the test to
+ complete before continuing execution, ensuring the barrier phase is actually
+ completed rather than potentially timing out.
+
+ The operation takes the following operands:
+ - `addr`: A pointer to the memory location of the mbarrier object. Uses generic
+ addressing, but the address must still be in the shared memory space.
+ - `phase`: An integer specifying the phase parity (0 or 1). Even phases
+ have parity 0, odd phases have parity 1.
+ - `ticks`: An unsigned integer specifying the suspend time hint in
+ nanoseconds. This may be used instead of the system-dependent time limit.
+
+ **Implementation behavior**:
+ This operation generates a PTX loop that repeatedly calls the underlying
+ `mbarrier.try_wait.parity` instruction until the barrier phase completes.
+ Unlike the raw PTX instruction which may return without completion after a
+ timeout, this MLIR operation guarantees completion by continuing to loop until
+ the specified phase is reached.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-mbarrier-test-wait-try-wait)
+ }];
let assemblyFormat = "$addr `,` $phase `,` $ticks attr-dict `:` type(operands)";
let extraClassDefinition = [{
std::string $cppClass::getPtx() {
@@ -607,7 +761,14 @@ def NVVM_MBarrierTryWaitParityOp : NVVM_PTXBuilder_Op<"mbarrier.try_wait.parity"
}
def NVVM_MBarrierTryWaitParitySharedOp : NVVM_PTXBuilder_Op<"mbarrier.try_wait.parity.shared">,
- Arguments<(ins LLVM_PointerShared:$addr, I32:$phase, I32:$ticks)> {
+ Arguments<(ins LLVM_PointerShared:$addr, I32:$phase, I32:$ticks)> {
+ let summary = "Shared MBarrier Potentially-Blocking Try Wait with Phase Parity";
+ let description = [{
+ This Op is the same as the generic memory variant except that the mbarrier object
+ should be accessed using a shared-memory pointer instead of a generic-memory pointer.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-mbarrier-test-wait-try-wait)
+ }];
let assemblyFormat = "$addr `,` $phase `,` $ticks attr-dict `:` type(operands)";
let extraClassDefinition = [{
std::string $cppClass::getPtx() {
@@ -628,6 +789,30 @@ def NVVM_MBarrierTryWaitParitySharedOp : NVVM_PTXBuilder_Op<"mbarrier.try_wait.p
def NVVM_MBarrierTestWaitOp : NVVM_Op<"mbarrier.test.wait">,
Results<(outs LLVM_Type:$res)>,
Arguments<(ins LLVM_AnyPointer:$addr, LLVM_Type:$state)> {
+ let summary = "MBarrier Non-Blocking Test Wait Operation";
+ let description = [{
+ The `nvvm.mbarrier.test.wait` operation performs a non-blocking test for the
+ completion of a specific phase of an mbarrier object. It uses the default
+ `.acquire.cta` semantics.
+
+ This operation tests whether the mbarrier phase specified by the state operand
+ has completed. It is a non-blocking instruction that immediately returns the
+ completion status without suspending the executing thread.
+
+ The operation takes the following operands:
+ - `addr`: A pointer to the memory location of the mbarrier object. Uses generic
+ addressing, but the address must still be in the shared memory space.
+ - `state`: An opaque value returned by a previous `mbarrier.arrive`
+ operation on the same mbarrier object during the current or immediately
+ preceding phase.
+
+ The operation returns a boolean value indicating whether the specified phase
+ has completed:
+ - `true`: The immediately preceding phase has completed
+ - `false`: The phase is still incomplete (current phase)
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-mbarrier-test-wait-try-wait)
+ }];
string llvmBuilder = [{
$res = createIntrinsicCall(builder, llvm::Intrinsic::nvvm_mbarrier_test_wait, {$addr, $state});
}];
@@ -637,6 +822,13 @@ def NVVM_MBarrierTestWaitOp : NVVM_Op<"mbarrier.test.wait">,
def NVVM_MBarrierTestWaitSharedOp : NVVM_Op<"mbarrier.test.wait.shared">,
Results<(outs LLVM_Type:$res)>,
Arguments<(ins LLVM_PointerShared:$addr, LLVM_Type:$state)> {
+ let summary = "Shared MBarrier Non-Blocking Test Wait Operation";
+ let description = [{
+ This Op is the same as the generic memory variant except that the mbarrier object
+ should be accessed using a shared-memory pointer instead of a generic-memory pointer.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-mbarrier-test-wait-try-wait)
+ }];
string llvmBuilder = [{
$res = createIntrinsicCall(builder, llvm::Intrinsic::nvvm_mbarrier_test_wait_shared, {$addr, $state});
}];
@@ -648,6 +840,15 @@ def NVVM_MBarrierTestWaitSharedOp : NVVM_Op<"mbarrier.test.wait.shared">,
//===----------------------------------------------------------------------===//
def NVVM_Barrier0Op : NVVM_Op<"barrier0"> {
+ let summary = "CTA Barrier Synchronization Op (Barrier ID 0)";
+ let description = [{
+ The `nvvm.barrier0` operation is a convenience operation that performs barrier
+ synchronization and communication within a CTA (Cooperative Thread Array) using
+ barrier ID 0. It is functionally equivalent to `nvvm.barrier` with `barrierId = 0`.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-barrier)
+ }];
+
let assemblyFormat = "attr-dict";
string llvmBuilder = [{
createIntrinsicCall(
@@ -657,6 +858,34 @@ def NVVM_Barrier0Op : NVVM_Op<"barrier0"> {
}
def NVVM_BarrierOp : NVVM_Op<"barrier", [AttrSizedOperandSegments]> {
+ let summary = "CTA Barrier Synchronization Op";
+ let description = [{
+ The `nvvm.barrier` operation performs barrier synchronization and communication
+ within a CTA (Cooperative Thread Array). It causes executing threads to wait for
+ all non-exited threads participating in the barrier to arrive.
+
+ The operation takes two optional operands:
+
+ - `barrierId`: Specifies a logical barrier resource with value 0 through 15.
+ Each CTA instance has sixteen barriers numbered 0..15. Defaults to 0 if not specified.
+ - `numberOfThreads`: Specifies the number of threads participating in the barrier.
+ When specified, the value must be a multiple of the warp size. If not specified,
+ all threads in the CTA participate in the barrier.
+
+ The barrier operation guarantees that when the barrier completes, prior memory
+ accesses requested by participating threads are performed relative to all threads
+ participating in the barrier. It also ensures that no new memory access is
+ requested by participating threads before the barrier completes.
+
+ When a barrier completes, the waiting threads are restarted without delay, and
+ the barrier is reinitialized so that it can be immediately reused.
+
+ This operation generates the `.aligned` version of the PTX barrier instruction,
+ indicating that all threads in the CTA execute the same barrier instruction.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-...
[truncated]
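The mbarrier descriptions in the patch above (initial state, arrive-on, expect-tx, `test.wait`, and the retry loop behind `try_wait.parity`) can be sketched as a small single-threaded Python model. This is only an illustration under stated assumptions, not the hardware behavior or the MLIR lowering: the class `MBarrierModel`, the helper `complete_tx`, the function `wait_parity`, and the rule that a phase completes once both the pending count and the tx-count reach zero are all modeling choices made here, and the opaque state value is modeled simply as the phase number.

```python
from dataclasses import dataclass

MAX_COUNT = 2**20 - 1  # upper bound on `count` quoted in the mbarrier.init description


@dataclass
class MBarrierModel:
    """Toy single-threaded model of an mbarrier object (not the hardware layout)."""

    expected: int      # expected arrival count
    pending: int       # pending arrival count
    phase: int = 0     # current phase
    tx_count: int = 0  # transaction count

    @classmethod
    def init(cls, count: int) -> "MBarrierModel":
        """Model of mbarrier.init: phase 0, expected = pending = count, tx-count 0."""
        if not 1 <= count <= MAX_COUNT:
            raise ValueError(f"count must be in [1, {MAX_COUNT}], got {count}")
        return cls(expected=count, pending=count)

    def _maybe_complete(self) -> None:
        # Assumption: a phase completes once both counts reach zero; the barrier
        # then advances to the next phase and re-arms the pending count.
        if self.pending == 0 and self.tx_count == 0:
            self.phase += 1
            self.pending = self.expected

    def arrive(self, count: int = 1) -> int:
        """Model of an arrive-on operation; returns the phase seen before arriving."""
        if not 1 <= count <= self.pending:
            raise ValueError("arrival count exceeds pending arrivals")
        state = self.phase
        self.pending -= count
        self._maybe_complete()
        return state

    def arrive_expect_tx(self, txcount: int) -> int:
        """Model of arrive.expect_tx: expect-tx first, then arrive with count 1."""
        self.tx_count += txcount
        return self.arrive(1)

    def complete_tx(self, txcount: int) -> None:
        """Hypothetical helper: an asynchronous copy reports `txcount` units done."""
        self.tx_count -= txcount
        self._maybe_complete()

    def test_wait(self, state: int) -> bool:
        """Non-blocking test: has the phase captured in `state` completed?"""
        return self.phase > state

    def test_wait_parity(self, parity: int) -> bool:
        """Single attempt: has the phase with this parity completed?"""
        return (self.phase & 1) != parity


def wait_parity(bar: MBarrierModel, parity: int, make_progress) -> None:
    """Retry until the phase completes, mirroring the loop the description mentions."""
    while not bar.test_wait_parity(parity):
        make_progress()  # stands in for other threads or copies advancing
```

In this model, two `arrive()` calls on a barrier initialized with `count = 2` both return the state of phase 0, and `test_wait` on either state becomes true only after the second arrival. `wait_parity` returns only once the requested phase has completed, which is the guarantee the `try_wait.parity` description attributes to the generated loop.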
Please review @durga4github
Thanks for adding the document. I think this PR significantly improves our documentation. I’ve left a few comments, and it’s great to see you taking the initiative on this.
…ing guarantees for try_wait and test.wait
string llvmBuilder = [{
createIntrinsicCall(builder, llvm::Intrinsic::nvvm_mbarrier_inval_shared, {$addr});
}];
let assemblyFormat = "$addr attr-dict `:` type(operands)";
}

def NVVM_MBarrierArriveOp : NVVM_Op<"mbarrier.arrive">,
- Results<(outs LLVM_Type:$res)>,
+ Results<(outs I64:$res)>,
optional:
This is a welcome change, but I am wondering if we should do it as a separate PR (and keep this one as Docs-NFC)
let assemblyFormat = "$addr `,` $txcount (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)";
let extraClassDefinition = [{
std::string $cppClass::getPtx() { return std::string("mbarrier.arrive.expect_tx.shared.b64 _, [%0], %1;"); }
}];
}

def NVVM_MBarrierTryWaitParityOp : NVVM_PTXBuilder_Op<"mbarrier.try_wait.parity">,
- Arguments<(ins LLVM_AnyPointer:$addr, I32:$phase, I32:$ticks)> {
+ Arguments<(ins LLVM_AnyPointer:$addr, I32:$phase, I32:$ticks)> {
This `addr` should also be PointerGeneric (and a few instances below), though we can update these in a separate change.
Will change this pending @grypp's thoughts on whether to make this an NFC PR or to include those changes here.
Doc updates LGTM except for a few minor asks.
ok, latest updates LGTM.
LGTM (we can move the type changes to a separate PR)
I don't have access to merge the PR myself. Could someone give me access (or do it for me)? For that matter I can't start the builders either.
@smada3 Congratulations on having your first Pull Request (PR) merged into the LLVM Project! Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR. Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues. How to do this, and the rest of the post-merge process, is covered in detail here. If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again. If you don't get any reports, no action is required from you. Your changes are working as expected, well done!
Several operations in the NVVM dialect were missing summaries and descriptions. This PR
adds summaries and descriptions for the following operations:
Documentation available here: mbarrier, barrier
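The operand constraints that the `nvvm.barrier` description in the patch states (a barrier ID from 0 through 15 that defaults to 0, and an optional thread count that must be a multiple of the warp size) can be sketched as a small Python check. This is an illustrative sketch, not the dialect's verifier: the function `check_barrier_operands` is hypothetical, and the warp size of 32 is an assumption about current NVIDIA GPUs rather than something the patch states.

```python
from typing import Optional, Tuple

WARP_SIZE = 32     # assumption: warp size on current NVIDIA GPUs
NUM_BARRIERS = 16  # barriers numbered 0..15 per CTA, as the description states


def check_barrier_operands(
    barrier_id: Optional[int] = None,
    number_of_threads: Optional[int] = None,
) -> Tuple[int, Optional[int]]:
    """Return the effective (barrier_id, number_of_threads) or raise ValueError.

    A number_of_threads of None models the documented default, where all
    threads in the CTA participate in the barrier.
    """
    effective_id = 0 if barrier_id is None else barrier_id
    if not 0 <= effective_id < NUM_BARRIERS:
        raise ValueError(f"barrierId must be in [0, {NUM_BARRIERS - 1}]")
    if number_of_threads is not None:
        if number_of_threads <= 0 or number_of_threads % WARP_SIZE != 0:
            raise ValueError("numberOfThreads must be a positive multiple of the warp size")
    return effective_id, number_of_threads
```

Calling `check_barrier_operands()` with no arguments models `nvvm.barrier0`, which the patch describes as equivalent to `nvvm.barrier` with `barrierId = 0`.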