[MLIR][NVVM] Add pmevent #152509

Merged
merged 3 commits into from
Aug 8, 2025
Changes from 2 commits
38 changes: 38 additions & 0 deletions mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
@@ -401,6 +401,44 @@ def NVVM_ReduxOp :
}];
}

//===----------------------------------------------------------------------===//
// NVVM Performance Monitor events
//===----------------------------------------------------------------------===//

def NVVM_PMEventOp : NVVM_PTXBuilder_Op<"pmevent">,
  Arguments<(ins OptionalAttr<I16Attr>:$maskedEventId,
                 OptionalAttr<I32Attr>:$eventId)> {
  let summary = "Trigger one or more Performance Monitor events.";

  let description = [{
    Triggers one or more of a fixed number of performance monitor events, with
    event index or mask specified by immediate operand.

    Without `mask` it triggers a single performance monitor event indexed by
    immediate operand a, in the range 0..15.

    With `mask` it triggers one or more of the performance monitor events. Each
    bit in the 16-bit immediate operand controls an event.

    [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#miscellaneous-instructions-pmevent)
  }];

  string llvmBuilder = [{
    llvm::Value *mId = builder.getInt16(* $maskedEventId);
    createIntrinsicCall(builder, llvm::Intrinsic::nvvm_pm_event_mask, {mId});
  }];

  let assemblyFormat = "attr-dict (`id` `=` $eventId^)? (`mask` `=` $maskedEventId^)?";

  let extraClassDeclaration = [{
    bool hasIntrinsic() { return !getEventId(); }
  }];
  let extraClassDefinition = [{
    std::string $cppClass::getPtx() { return std::string("pmevent %0;"); }
  }];
Contributor

Since we have the intrinsics now, do we still need the inline-asm version?

Member Author

We have an intrinsic for the mask version. This is the version without a mask, so I think we need to add another intrinsic.

pmevent       a;    // trigger a single performance monitor event
pmevent.mask  a;    // trigger one or more performance monitor events

Contributor

I tried adding two intrinsics but was asked to keep only the mask-based one.

In the final SASS, the instruction is always mask-based, and the event-id form seems to be syntactic sugar in PTX.
So we could convert the event id into a mask with a shift and always lower to the mask-based intrinsic.
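
The id-to-mask conversion suggested in this comment can be sketched in standalone C++. This is only an illustration of the arithmetic, not code from this PR: the helper name `eventIdToMask` is hypothetical, and the sketch assumes that event `i` is controlled by bit `i` of the 16-bit mask.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the PR): converts a single event index
// into the equivalent one-hot 16-bit mask, assuming bit i controls event i.
// Callers are expected to have checked eventId <= 15 beforehand.
constexpr std::uint16_t eventIdToMask(unsigned eventId) {
  return static_cast<std::uint16_t>(1u << eventId);
}

// Compile-time checks of the mapping at the edges of the valid range.
static_assert(eventIdToMask(0) == 0x0001, "event 0 maps to bit 0");
static_assert(eventIdToMask(4) == 0x0010, "event 4 maps to bit 4");
static_assert(eventIdToMask(15) == 0x8000, "event 15 maps to bit 15");
```

With such a helper, an op written with `id = 10` could, under the stated assumption, be lowered through the mask-based intrinsic with the value 1024 instead of going through inline assembly.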

  let hasVerifier = 1;
}

//===----------------------------------------------------------------------===//
// NVVM Split arrive/wait barrier
//===----------------------------------------------------------------------===//
20 changes: 20 additions & 0 deletions mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
@@ -189,6 +189,26 @@ LogicalResult BulkStoreOp::verify() {
return success();
}

LogicalResult PMEventOp::verify() {
  auto eventId = getEventId();
  auto maskedEventId = getMaskedEventId();
  if (!maskedEventId && !eventId) {
    return emitOpError() << "either `id` or `mask` must be set";
  }

  if (maskedEventId && eventId) {
    return emitOpError() << "`id` and `mask` cannot be set at the same time";
  }

  if (eventId) {
    if (eventId < 0 || eventId > 15) {
      return emitOpError() << "`id` must be between 0 and 15";
    }
  }

  return llvm::success();
}
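
The three rules enforced by this verifier can be modeled outside of MLIR as a small standalone function. This is a sketch for illustration only: `checkPMEvent` is a hypothetical name, and the parameter types (`std::optional` of unsigned integers) are an assumption about what the generated accessors return, not something the diff states.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Standalone model of the checks in PMEventOp::verify(). Returns an empty
// string on success, otherwise the diagnostic text used in the PR.
std::string checkPMEvent(std::optional<std::uint32_t> eventId,
                         std::optional<std::uint16_t> maskedEventId) {
  if (!eventId && !maskedEventId)
    return "either `id` or `mask` must be set";
  if (eventId && maskedEventId)
    return "`id` and `mask` cannot be set at the same time";
  // The id is modeled as unsigned here, so only the upper bound is checked,
  // and the optional is dereferenced explicitly before comparing.
  if (eventId && *eventId > 15)
    return "`id` must be between 0 and 15";
  return "";
}
```

In this model, the cases exercised by `nvvmir-invalid.mlir` (no attribute, `id = 141`, and both `id` and `mask`) each produce the corresponding diagnostic, while `id = 10` alone passes.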

// Given the element type of an operand and whether or not it is an accumulator,
// this function returns the PTX type (`NVVM::MMATypes`) that corresponds to the
// operand's element type.
14 changes: 14 additions & 0 deletions mlir/test/Conversion/NVVMToLLVM/nvvm-to-llvm.mlir
@@ -681,3 +681,17 @@ llvm.func @ex2(%input : f32, %pred : i1) {
%1 = nvvm.inline_ptx "ex2.approx.ftz.f32 $0, $1;" (%input), predicate = %pred : f32, i1 -> f32
llvm.return
}

// -----

// CHECK-LABEL: @nvvm_pmevent
llvm.func @nvvm_pmevent() {
  // CHECK: %[[S0:.+]] = llvm.mlir.constant(10 : i32) : i32
  // CHECK: llvm.inline_asm has_side_effects asm_dialect = att "pmevent $0;", "n" %[[S0]] : (i32) -> ()

  nvvm.pmevent id = 10
  // CHECK: %[[S1:.+]] = llvm.mlir.constant(4 : i32) : i32
  // CHECK: llvm.inline_asm has_side_effects asm_dialect = att "pmevent $0;", "n" %[[S1]] : (i32) -> ()
  nvvm.pmevent id = 4
  llvm.return
}
19 changes: 19 additions & 0 deletions mlir/test/Target/LLVMIR/nvvmir-invalid.mlir
@@ -1,5 +1,24 @@
// RUN: mlir-translate -verify-diagnostics -split-input-file -mlir-to-llvmir %s

llvm.func @pmevent_no_id() {
  // expected-error @below {{either `id` or `mask` must be set}}
  nvvm.pmevent
}

// -----

llvm.func @pmevent_bigger15() {
  // expected-error @below {{`id` must be between 0 and 15}}
  nvvm.pmevent id = 141
}

// -----

llvm.func @pmevent_many_ids() {
  // expected-error @below {{`id` and `mask` cannot be set at the same time}}
  nvvm.pmevent id = 1 mask = 1
}

// -----

llvm.func @kernel_func(%numberOfThreads : i32) {
11 changes: 11 additions & 0 deletions mlir/test/Target/LLVMIR/nvvmir.mlir
@@ -918,3 +918,14 @@ llvm.func @nvvm_dot_accumulate_2way(%a: vector<2xi16>, %b: vector<4xi8>, %c: i32
%7 = nvvm.dot.accumulate.2way %a <signed>, %b <signed>, %c {b_hi = true}: vector<2xi16>, vector<4xi8>
llvm.return
}

// -----

// CHECK-LABEL: @nvvm_pmevent
llvm.func @nvvm_pmevent() {
  // CHECK: call void @llvm.nvvm.pm.event.mask(i16 15000)
  nvvm.pmevent mask = 15000
  // CHECK: call void @llvm.nvvm.pm.event.mask(i16 4)
  nvvm.pmevent mask = 4
  llvm.return
}
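
To make the mask values in this test easier to read, the following standalone C++ sketch lists which events a given mask would select. It is not part of the PR: `eventsInMask` is a hypothetical helper, and it assumes that bit `i` of the 16-bit immediate controls event `i`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the event indices selected by a 16-bit mask,
// in ascending order, assuming bit i of the mask controls event i.
std::vector<unsigned> eventsInMask(std::uint16_t mask) {
  std::vector<unsigned> events;
  for (unsigned bit = 0; bit < 16; ++bit) {
    if (mask & (1u << bit))
      events.push_back(bit);
  }
  return events;
}
```

Under that assumption, `mask = 4` selects only event 2, and `mask = 15000` (0x3A98) selects events 3, 4, 7, 9, 11, 12 and 13.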