[mlir][rocdl] Add lowering of math.clampf -> rocdl.med3 #157052

@Groverkss

#151153 introduced a new math.clampf op, which has an efficient lowering on AMDGPU hardware (at least on gfx9+) as V_MED3_F16 (f16 is just one example; other types are supported).

We should add support for this by introducing a new rocdl.med3 op and lowering math.clampf to it.
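The reason a median-of-three instruction can implement a clamp is that, for bounds with lo <= hi, the median of (x, lo, hi) is exactly x clamped to [lo, hi]. A minimal Python sketch of that identity follows; the helper names `med3` and `clamp` are illustrative only and are not the MLIR ops, and the sketch deliberately ignores NaN handling and other floating-point corner cases, which the actual ops and instruction define for themselves.

```python
def med3(a: float, b: float, c: float) -> float:
    """Return the median (middle value) of three numbers."""
    return sorted((a, b, c))[1]


def clamp(x: float, lo: float, hi: float) -> float:
    """Reference clamp of x to [lo, hi]; assumes lo <= hi."""
    return max(lo, min(x, hi))


# Check the identity below, inside, and above the range,
# including the boundary values themselves.
LO, HI = -1.0, 1.0
for x in (-5.0, -1.0, 0.0, 0.5, 1.0, 7.0):
    assert med3(x, LO, HI) == clamp(x, LO, HI)
```

The identity only holds when lo <= hi; with swapped bounds the median still returns the middle value, while a min/max clamp does not.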

Example of the op being used in Composable Kernel for clamping before an f16 -> f8 conversion: https://github.com/ROCm/composable_kernel/blob/83f607e2a68d778479ddd34fd5aac3d4e7c9e52f/include/ck/utility/amd_ck_fp8.hpp#L651

The conversion should look like this (read bottom to top):

rocdl.med3.f16
rocdl.med3.f32
rocdl.med3.i16
rocdl.med3.i32

^
|

amdgpu.med3

^
|

math.clampf

This would probably be three PRs:

  1. Add rocdl ops and conversion to LLVM
  2. Add amdgpu.med3 and lower it to rocdl ops
  3. Add math.clampf -> amdgpu.med3 lowering
