[UR][hip][opencl] Mark urKernelSuggestMaxCooperativeGroupCountExp as unsupported instead of returning misleading default value #2038

GeorgeWeb · 2024-08-30T11:57:27Z

frasercrmck

Makes sense to me

source/adapters/opencl/kernel.cpp

…unsupported instead of returning misleading default value

…xceeded launch limits on more backends (hip and opencl) (#15369)

The HIP and OpenCL backend implementations of the query defaulted to returning `1` group, which is an incorrect assumption. They will now be marked as _Unsupported_ with the accompanying UR changes (see oneapi-src/unified-runtime#2038), so for these cases the `kernel_queue_specific::max_num_work_groups` launch query will rely on the fallback that returns either `1` or `0` groups based on hardware resource limitation checks for the kernel.

---------

Co-authored-by: Aaron Greig <[email protected]>
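The dispatch described in that commit message can be sketched roughly as follows. This is a minimal illustration, not the actual SYCL runtime code: `QueryResult`, `maxNumWorkGroups`, and the `oneGroupFits` flag are hypothetical names, and the flag stands in for whatever hardware resource limit checks the real fallback performs.

```cpp
#include <cstdint>

// Hypothetical stand-in for the adapter's result code; not the real ur_result_t.
enum class QueryResult { Success, Unsupported };

// Returns the adapter's group count when the query is supported. When the
// adapter reports Unsupported, falls back to 1 if a single work-group fits
// within the hardware limits, and 0 otherwise.
// oneGroupFits is assumed to be computed by the caller from resource checks.
std::uint32_t maxNumWorkGroups(QueryResult adapterResult,
                               std::uint32_t adapterCount,
                               bool oneGroupFits) {
  if (adapterResult == QueryResult::Success)
    return adapterCount;
  return oneGroupFits ? 1u : 0u;
}
```

The point of the sketch is that an adapter reporting Unsupported never contributes a count of its own, so a made-up default such as `1` cannot leak into the result when the launch limits are exceeded.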

jinz2014 · 2025-01-22T18:04:17Z

@GeorgeWeb
Could you please explain why it is not supported on the HIP backend?

jinz2014 · 2025-01-22T18:07:11Z

What about the following function for HIP?

UR_APIEXPORT ur_result_t UR_APICALL urKernelSuggestMaxCooperativeGroupCountExp(
    ur_kernel_handle_t hKernel, ur_device_handle_t hDevice, uint32_t workDim,
    const size_t *pLocalWorkSize, size_t dynamicSharedMemorySize,
    uint32_t *pGroupCountRet) {
  UR_ASSERT(hKernel, UR_RESULT_ERROR_INVALID_KERNEL);

  std::ignore = hDevice;

  size_t localWorkSize = pLocalWorkSize[0];
  localWorkSize *= (workDim >= 2 ? pLocalWorkSize[1] : 1);
  localWorkSize *= (workDim == 3 ? pLocalWorkSize[2] : 1);

  // We need to set the active current device for this kernel explicitly here,
  // because the occupancy querying API does not take device parameter.
  ur_device_handle_t Device = hKernel->getProgram()->getDevice();
  ScopedContext Active(Device);
  try {
    // We need to calculate max num of work-groups using per-device semantics.

    int MaxNumActiveGroupsPerCU{0};
    UR_CHECK_ERROR(hipOccupancyMaxActiveBlocksPerMultiprocessor(
        &MaxNumActiveGroupsPerCU, hKernel->get(), localWorkSize,
        dynamicSharedMemorySize));
    detail::ur::assertion(MaxNumActiveGroupsPerCU >= 0);
    // Handle the case where we can't have all SMs active with at least 1 group
    // per SM. In that case, the device is still able to run 1 work-group, hence
    // we will manually check if it is possible with the available HW resources.
    if (MaxNumActiveGroupsPerCU == 0) {
      size_t MaxWorkGroupSize{};
      urKernelGetGroupInfo(
          hKernel, Device, UR_KERNEL_GROUP_INFO_WORK_GROUP_SIZE,
          sizeof(MaxWorkGroupSize), &MaxWorkGroupSize, nullptr);
      size_t MaxLocalSizeBytes{};
      urDeviceGetInfo(Device, UR_DEVICE_INFO_LOCAL_MEM_SIZE,
                      sizeof(MaxLocalSizeBytes), &MaxLocalSizeBytes, nullptr);
      if (localWorkSize > MaxWorkGroupSize ||
          dynamicSharedMemorySize > MaxLocalSizeBytes)
        *pGroupCountRet = 0;
      else
        *pGroupCountRet = 1;
    } else {
      // Multiply by the number of SMs (CUs = compute units) on the device in
      // order to retrieve the total number of groups/blocks that can be
      // launched.
      *pGroupCountRet = Device->getNumComputeUnits() * MaxNumActiveGroupsPerCU;
    }
  } catch (ur_result_t Err) {
    return Err;
  }
  return UR_RESULT_SUCCESS;
}
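The arithmetic in the proposed function can be isolated from the HIP and UR calls and checked on its own. The sketch below is only an illustration: `flattenLocalSize` and `suggestGroupCount` are hypothetical helper names, and the per-CU occupancy, compute-unit count, and device limits are passed in as plain values rather than queried from a device.

```cpp
#include <cstddef>
#include <cstdint>

// Flattens a 1-, 2-, or 3-dimensional local work size into one block size.
std::size_t flattenLocalSize(std::uint32_t workDim,
                             const std::size_t *localSize) {
  std::size_t total = localSize[0];
  if (workDim >= 2)
    total *= localSize[1];
  if (workDim == 3)
    total *= localSize[2];
  return total;
}

// groupsPerCU is the value an occupancy query would report (assumed >= 0).
// If it is positive, the total is groupsPerCU * numCUs. If it is zero, the
// result is 1 when a single group fits within the limits, and 0 otherwise.
std::uint32_t suggestGroupCount(int groupsPerCU, std::uint32_t numCUs,
                                std::size_t flatLocalSize,
                                std::size_t dynamicSharedBytes,
                                std::size_t maxWorkGroupSize,
                                std::size_t maxLocalMemBytes) {
  if (groupsPerCU > 0)
    return numCUs * static_cast<std::uint32_t>(groupsPerCU);
  const bool oneGroupFits = flatLocalSize <= maxWorkGroupSize &&
                            dynamicSharedBytes <= maxLocalMemBytes;
  return oneGroupFits ? 1u : 0u;
}
```

For example, with a local size of {8, 4, 2} the flattened block size is 64, and an occupancy of 2 groups per CU on a hypothetical 60-CU device gives 120 groups.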

GeorgeWeb · 2025-01-23T00:48:23Z

Could you please explain why it is not supported on the HIP backend?

Hi @jinz2014.
It isn't unsupported by the HIP runtime itself, as the required API exists; we just haven't implemented it in the HIP adapter of Unified Runtime yet. This is mostly because the functionality was requested for CUDA targets at the time.

You are more than welcome to open a GitHub issue for the team, in case they have the bandwidth for this, or to open a PR for HIP directly with the implementation you suggested in your other comment.

Thank you for being proactive on the project!

github-actions bot added the hip HIP adapter specific issues label Aug 30, 2024

GeorgeWeb added the opencl OpenCL adapter specific issues label Sep 4, 2024

GeorgeWeb marked this pull request as ready for review September 11, 2024 14:59

GeorgeWeb requested review from a team as code owners September 11, 2024 14:59

GeorgeWeb requested a review from jchlanda September 11, 2024 14:59

frasercrmck approved these changes Sep 11, 2024

jchlanda approved these changes Sep 12, 2024

GeorgeWeb force-pushed the georgi/unsupported-max-coop-wgsize branch 2 times, most recently from 609cae5 to 6eb27d5 Compare September 13, 2024 12:49

GeorgeWeb mentioned this pull request Sep 13, 2024

[SYCL] Enable checking the result of max_num_work_groups query with exceeded launch limits on more backends (hip and opencl) intel/llvm#15369

Merged

GeorgeWeb force-pushed the georgi/unsupported-max-coop-wgsize branch 4 times, most recently from ca55cef to 73ca483 Compare September 20, 2024 13:04

aarongreig reviewed Sep 24, 2024

source/adapters/opencl/kernel.cpp (outdated, resolved)

GeorgeWeb force-pushed the georgi/unsupported-max-coop-wgsize branch 3 times, most recently from fa515a0 to ed6fbad Compare September 27, 2024 15:20

aarongreig approved these changes Sep 27, 2024

[UR][hip][opencl] Mark urKernelSuggestMaxCooperativeGroupCountExp as …

55bd563

…unsupported instead of returning misleading default value

GeorgeWeb force-pushed the georgi/unsupported-max-coop-wgsize branch from ed6fbad to 55bd563 Compare October 2, 2024 11:20

aarongreig added the ready to merge Added to PR's which are ready to merge label Oct 2, 2024

aarongreig merged commit df6da35 into oneapi-src:main Oct 7, 2024
72 of 74 checks passed

5 participants
