[MLIR][NVVM] Add pmevent #152509
@llvm/pr-subscribers-mlir

Author: Guray Ozen (grypp)

Changes

Add the `nvvm.pmevent` op, which triggers one or more of a fixed number of performance monitor events, with the event index or mask specified by an immediate operand.

[For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#miscellaneous-instructions-pmevent)

Full diff: https://github.com/llvm/llvm-project/pull/152509.diff

5 Files Affected:
diff --git a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
index 30df3b739e5ca..df94f95ced262 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
@@ -401,6 +401,44 @@ def NVVM_ReduxOp :
}];
}
+//===----------------------------------------------------------------------===//
+// NVVM Performance Monitor events
+//===----------------------------------------------------------------------===//
+
+def NVVM_PMEventOp : NVVM_PTXBuilder_Op<"pmevent">,
+ Arguments<(ins OptionalAttr<I16Attr>:$maskedEventId,
+ OptionalAttr<I32Attr>:$eventId)> {
+ let summary = "Trigger one or more Performance Monitor events.";
+
+ let description = [{
+ Triggers one or more of a fixed number of performance monitor events, with
+ event index or mask specified by immediate operand.
+
+ Without `mask` it triggers a single performance monitor event indexed by
+ immediate operand a, in the range 0..15.
+
+ With `mask` it triggers one or more of the performance monitor events. Each
+ bit in the 16-bit immediate operand a controls an event.
+
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#miscellaneous-instructions-pmevent)
+ }];
+
+ string llvmBuilder = [{
+ llvm::Value *mId = builder.getInt16(* $maskedEventId);
+ createIntrinsicCall(builder, llvm::Intrinsic::nvvm_pm_event_mask, {mId});
+ }];
+
+ let assemblyFormat = "attr-dict (`id` `=` $eventId^)? (`mask` `=` $maskedEventId^)?";
+
+ let extraClassDeclaration = [{
+ bool hasIntrinsic() { if(getEventId()) return false; return true; }
+ }];
+ let extraClassDefinition = [{
+ std::string $cppClass::getPtx() { return std::string("pmevent %0;"); }
+ }];
+ let hasVerifier = 1;
+}
+
//===----------------------------------------------------------------------===//
// NVVM Split arrive/wait barrier
//===----------------------------------------------------------------------===//
diff --git a/mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp b/mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
index e0977f5b616c1..ab0b0d3f754fe 100644
--- a/mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+++ b/mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
@@ -189,6 +189,24 @@ LogicalResult BulkStoreOp::verify() {
return success();
}
+LogicalResult PMEventOp::verify() {
+ if (!getMaskedEventId() && !getEventId()) {
+ return emitOpError() << "either `id` or `mask` must be set";
+ }
+
+ if (getMaskedEventId() && getEventId()) {
+ return emitOpError() << "`id` and `mask` cannot be set at the same time";
+ }
+
+ if (getEventId()) {
+ if (getEventId() < 0 || getEventId() > 15) {
+ return emitOpError() << "`id` must be between 0 and 15";
+ }
+ }
+
+ return llvm::success();
+}
+
// Given the element type of an operand and whether or not it is an accumulator,
// this function returns the PTX type (`NVVM::MMATypes`) that corresponds to the
// operand's element type.
diff --git a/mlir/test/Conversion/NVVMToLLVM/nvvm-to-llvm.mlir b/mlir/test/Conversion/NVVMToLLVM/nvvm-to-llvm.mlir
index 580b09d70c480..e50576722e38c 100644
--- a/mlir/test/Conversion/NVVMToLLVM/nvvm-to-llvm.mlir
+++ b/mlir/test/Conversion/NVVMToLLVM/nvvm-to-llvm.mlir
@@ -681,3 +681,17 @@ llvm.func @ex2(%input : f32, %pred : i1) {
%1 = nvvm.inline_ptx "ex2.approx.ftz.f32 $0, $1;" (%input), predicate = %pred : f32, i1 -> f32
llvm.return
}
+
+// -----
+
+// CHECK-LABEL: @nvvm_pmevent
+llvm.func @nvvm_pmevent() {
+ // CHECK: %[[S0:.+]] = llvm.mlir.constant(10 : i32) : i32
+ // CHECK: llvm.inline_asm has_side_effects asm_dialect = att "pmevent $0;", "n" %[[S0]] : (i32) -> ()
+
+ nvvm.pmevent id = 10
+ // CHECK: %[[S1:.+]] = llvm.mlir.constant(4 : i32) : i32
+ // CHECK: llvm.inline_asm has_side_effects asm_dialect = att "pmevent $0;", "n" %[[S1]] : (i32) -> ()
+ nvvm.pmevent id = 4
+ llvm.return
+}
diff --git a/mlir/test/Target/LLVMIR/nvvmir-invalid.mlir b/mlir/test/Target/LLVMIR/nvvmir-invalid.mlir
index 85478cc160064..991222ca29127 100644
--- a/mlir/test/Target/LLVMIR/nvvmir-invalid.mlir
+++ b/mlir/test/Target/LLVMIR/nvvmir-invalid.mlir
@@ -1,5 +1,24 @@
// RUN: mlir-translate -verify-diagnostics -split-input-file -mlir-to-llvmir %s
+llvm.func @pmevent_no_id() {
+ // expected-error @below {{either `id` or `mask` must be set}}
+ nvvm.pmevent
+}
+
+// -----
+
+llvm.func @pmevent_bigger15() {
+ // expected-error @below {{`id` must be between 0 and 15}}
+ nvvm.pmevent id = 141
+}
+
+// -----
+
+llvm.func @pmevent_many_ids() {
+ // expected-error @below {{`id` and `mask` cannot be set at the same time}}
+ nvvm.pmevent id = 1 mask = 1
+}
+
// -----
llvm.func @kernel_func(%numberOfThreads : i32) {
diff --git a/mlir/test/Target/LLVMIR/nvvmir.mlir b/mlir/test/Target/LLVMIR/nvvmir.mlir
index 5c2cfa4683104..1600216e95c87 100644
--- a/mlir/test/Target/LLVMIR/nvvmir.mlir
+++ b/mlir/test/Target/LLVMIR/nvvmir.mlir
@@ -918,3 +918,14 @@ llvm.func @nvvm_dot_accumulate_2way(%a: vector<2xi16>, %b: vector<4xi8>, %c: i32
%7 = nvvm.dot.accumulate.2way %a <signed>, %b <signed>, %c {b_hi = true}: vector<2xi16>, vector<4xi8>
llvm.return
}
+
+// -----
+
+// CHECK-LABEL: @nvvm_pmevent
+llvm.func @nvvm_pmevent() {
+ // CHECK: call void @llvm.nvvm.pm.event.mask(i16 15000)
+ nvvm.pmevent mask = 15000
+ // CHECK: call void @llvm.nvvm.pm.event.mask(i16 4)
+ nvvm.pmevent mask = 4
+ llvm.return
+}
\ No newline at end of file
Pull Request Overview

This PR adds support for the NVVM `pmevent` instruction, which triggers performance monitor events. The implementation includes both intrinsic-based (for masked events) and inline assembly-based (for individual events) code generation paths.

- Introduces the `nvvm.pmevent` operation with support for both individual event IDs (0-15) and event masks
- Adds verification logic to ensure proper parameter usage (exactly one of `id` or `mask` must be specified)
- Includes comprehensive test coverage for both valid usage and error conditions
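
The verification rule in the list above can be modeled outside MLIR. The following is a minimal standalone C++ sketch, not the PR's verifier: `checkPmEvent` is a hypothetical name, and `std::optional` stands in for the op's optional attributes. It returns the same diagnostic strings that the PR's invalid-input tests expect, or an empty string when the combination is valid.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical standalone model of the rule (not the MLIR verifier itself).
// Returns an empty string on success, otherwise the diagnostic text.
std::string checkPmEvent(std::optional<std::int32_t> id,
                         std::optional<std::int16_t> mask) {
  // Exactly one of `id` and `mask` has to be present.
  if (!id && !mask)
    return "either `id` or `mask` must be set";
  if (id && mask)
    return "`id` and `mask` cannot be set at the same time";
  // A single event index must be in the range 0..15.
  if (id && (*id < 0 || *id > 15))
    return "`id` must be between 0 and 15";
  return "";
}
```

In this model, `checkPmEvent(10, std::nullopt)` passes, while `checkPmEvent(141, std::nullopt)` yields the range diagnostic, mirroring the `pmevent_bigger15` test case.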
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td | Defines the PMEventOp operation with attributes, assembly format, and code generation logic |
| mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp | Implements verification logic for PMEventOp parameter validation |
| mlir/test/Target/LLVMIR/nvvmir.mlir | Tests LLVM IR generation for mask-based pmevent operations |
| mlir/test/Target/LLVMIR/nvvmir-invalid.mlir | Tests error handling for invalid parameter combinations |
| mlir/test/Conversion/NVVMToLLVM/nvvm-to-llvm.mlir | Tests inline assembly generation for id-based pmevent operations |
  }];
  let extraClassDefinition = [{
    std::string $cppClass::getPtx() { return std::string("pmevent %0;"); }
  }];
Since we have the intrinsics now, do we still need the inline-asm version?
We have an intrinsic for the mask version. This one is without a mask. I think we need to add another intrinsic.
pmevent a; // trigger a single performance monitor event
pmevent.mask a; // trigger one or more performance monitor events
I tried adding two intrinsics but was then asked to have only the mask-based one.
In the final SASS, it is always mask-based, and the event-id form seems to be syntactic sugar from PTX.
So, we could take the event-id and shift it to generate the mask, and always lower to the mask-based intrinsic.
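
The shift suggested above can be sketched in plain C++. This assumes, as the comment implies, that event index `i` corresponds to bit `i` of the 16-bit mask; `eventIdToMask` is a hypothetical helper and is not part of the PR or of LLVM.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (illustration only): convert a `pmevent` event index
// into the equivalent 16-bit mask for `pmevent.mask`.
// Assumption: event index i maps to bit i of the mask.
// Returns std::nullopt when the index is outside the range 0..15.
std::optional<std::uint16_t> eventIdToMask(int eventId) {
  if (eventId < 0 || eventId > 15)
    return std::nullopt;
  return static_cast<std::uint16_t>(1u << eventId);
}
```

Under that assumption, `nvvm.pmevent id = 10` would lower the same way as `nvvm.pmevent mask = 1024`, so only the mask-based intrinsic would be needed.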
LGTM except for a few nits.
The bot suggestion in a few places looks good to incorporate.
just a minor nit, LGTM!
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/138/builds/17283