[flang-rt] Add Assign_omp RT call. #145465
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,77 @@ | ||||||
//===-- lib/runtime/assign_omp.cpp ----------------------------------*- C++ | ||||||
//-*-===// | ||||||
Comment on lines +1 to +2: Format |
||||||
// | ||||||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||||||
// See https://llvm.org/LICENSE.txt for license information. | ||||||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||||||
// | ||||||
//===----------------------------------------------------------------------===// | ||||||
|
||||||
#include "flang-rt/runtime/assign-impl.h" | ||||||
#include "flang-rt/runtime/derived.h" | ||||||
#include "flang-rt/runtime/descriptor.h" | ||||||
#include "flang-rt/runtime/stat.h" | ||||||
#include "flang-rt/runtime/terminator.h" | ||||||
#include "flang-rt/runtime/tools.h" | ||||||
#include "flang-rt/runtime/type-info.h" | ||||||
#include "flang/Runtime/assign.h" | ||||||
|
||||||
#include <omp.h> | ||||||
|
||||||
namespace Fortran::runtime { | ||||||
namespace omp { | ||||||
|
||||||
typedef int32_t OMPDeviceTy; | ||||||
|
||||||
template <typename T> static T *getDevicePtr(T *anyPtr, OMPDeviceTy ompDevice) { | ||||||
nit: for easier readability:
Suggested change
|
||||||
auto voidAnyPtr = reinterpret_cast<void *>(anyPtr); | ||||||
// If not present on the device it should already be a device ptr | ||||||
if (!omp_target_is_present(voidAnyPtr, ompDevice)) | ||||||
return anyPtr; | ||||||
T *device_ptr = omp_get_mapped_ptr(anyPtr, ompDevice); | ||||||
nit: use the same style for variable names: |
||||||
return device_ptr; | ||||||
} | ||||||
|
||||||
RT_API_ATTRS static void Assign(Descriptor &to, const Descriptor &from, | ||||||
Terminator &terminator, int flags, OMPDeviceTy omp_device) { | ||||||
|
||||||
std::size_t toElementBytes{to.ElementBytes()}; | ||||||
std::size_t fromElementBytes{from.ElementBytes()}; | ||||||
std::size_t toElements{to.Elements()}; | ||||||
std::size_t fromElements{from.Elements()}; | ||||||
You also want to check that the descriptors are contiguous. They can have the same number of elements but different strides.
If you want it to work on non-contiguous descriptors as well, the Assign function has a mechanism to pass the memmove function to use. |
||||||
|
||||||
if (toElementBytes != fromElementBytes) | ||||||
terminator.Crash("Assign: toElementBytes != fromElementBytes"); | ||||||
Logging the number of element bytes (and elements below) in the crash might be helpful when debugging. |
||||||
if (toElements != fromElements) | ||||||
terminator.Crash("Assign: toElements != fromElements"); | ||||||
|
||||||
// Get base addresses and calculate length | ||||||
void *to_base = to.raw().base_addr; | ||||||
nit: Naming style is different in this function as well. |
||||||
void *from_base = from.raw().base_addr; | ||||||
size_t length = toElements * toElementBytes; | ||||||
|
||||||
// Get device pointers after ensuring data is on device | ||||||
void *to_ptr = getDevicePtr(to_base, omp_device); | ||||||
void *from_ptr = getDevicePtr(from_base, omp_device); | ||||||
|
||||||
// Perform copy between device pointers | ||||||
int result = omp_target_memcpy(to_ptr, from_ptr, length, | ||||||
/*dst_offset*/ 0, /*src_offset*/ 0, omp_device, omp_device); | ||||||
|
||||||
if (result != 0) | ||||||
terminator.Crash("Assign: omp_target_memcpy failed"); | ||||||
return; | ||||||
} | ||||||
|
||||||
extern "C" { | ||||||
RT_EXT_API_GROUP_BEGIN | ||||||
void RTDEF(Assign_omp)(Descriptor &to, const Descriptor &from, | ||||||
const char *sourceFile, int sourceLine, omp::OMPDeviceTy omp_device) { | ||||||
Terminator terminator{sourceFile, sourceLine}; | ||||||
Fortran::runtime::omp::Assign(to, from, terminator, | ||||||
MaybeReallocate | NeedFinalization | ComponentCanBeDefinedAssignment, | ||||||
omp_device); | ||||||
} | ||||||
|
||||||
} // extern "C" | ||||||
} // namespace omp | ||||||
} // namespace Fortran::runtime |
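Two of the inline review comments above (check contiguity before doing a flat byte copy, and include the mismatching sizes in the crash message) can be sketched together. This is a minimal, self-contained illustration, not the flang-rt code: `SketchDescriptor`, `isContiguous`, and `checkFlatCopy` are hypothetical names, and the stride layout is assumed to be column-major byte strides. The real `Descriptor` may already offer an equivalent contiguity query, and the real `Terminator::Crash` may accept printf-style arguments directly; both would need to be confirmed against the flang-rt headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a descriptor: one extent and one
// byte stride per dimension. This is NOT the flang-rt Descriptor type.
struct SketchDescriptor {
  std::size_t elementBytes;
  std::vector<std::size_t> extents;
  std::vector<std::ptrdiff_t> byteStrides;

  std::size_t Elements() const {
    std::size_t n = 1;
    for (std::size_t e : extents)
      n *= e;
    return n;
  }
};

// Column-major contiguity: the byte stride of each dimension must equal the
// total byte size of all faster-varying dimensions.
bool isContiguous(const SketchDescriptor &d) {
  assert(d.extents.size() == d.byteStrides.size());
  if (d.Elements() == 0)
    return true; // an empty array is trivially contiguous
  std::ptrdiff_t expected = static_cast<std::ptrdiff_t>(d.elementBytes);
  for (std::size_t dim = 0; dim < d.extents.size(); ++dim) {
    // The stride of an extent-1 dimension is never used, so skip it.
    if (d.extents[dim] != 1 && d.byteStrides[dim] != expected)
      return false;
    expected *= static_cast<std::ptrdiff_t>(d.extents[dim]);
  }
  return true;
}

// Returns an empty string when a single flat byte copy is valid; otherwise
// returns a message that includes the mismatching values.
std::string checkFlatCopy(
    const SketchDescriptor &to, const SketchDescriptor &from) {
  char buf[160];
  if (to.elementBytes != from.elementBytes) {
    std::snprintf(buf, sizeof buf,
        "Assign: toElementBytes (%zu) != fromElementBytes (%zu)",
        to.elementBytes, from.elementBytes);
    return buf;
  }
  if (to.Elements() != from.Elements()) {
    std::snprintf(buf, sizeof buf,
        "Assign: toElements (%zu) != fromElements (%zu)", to.Elements(),
        from.Elements());
    return buf;
  }
  if (!isContiguous(to) || !isContiguous(from))
    return "Assign: non-contiguous descriptor";
  return {};
}
```

With a check like this in place, two descriptors that have equal element counts but different strides are rejected instead of being copied as one flat block.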
I'm not sure if it's a good idea to make the Fortran runtime depend on the OpenMP runtime library. I think it makes more sense to have this routine live in the OpenMP offload runtime as a potentially generic API entry point.
Fortran runtime function implementations exist in the OpenMP runtime library like this: https://github.com/llvm/llvm-project/blob/main/openmp/runtime/src/kmp_ftn_entry.h . If it is offload-only, it may also live in libomptarget.
Context:
Assign_omp just does omp_target_memcpy between two device pointers.
void RTDEF(Assign_omp)(Descriptor &to, const Descriptor &from, const char *sourceFile, int sourceLine, omp::OMPDeviceTy omp_device)
This API is required when hoisting "fir.call @_FortranAAssign(...)" from omp.target to the host in the "lower-workdistribute" pass (#140523).
Descriptor struct is defined in flang-rt/runtime/descriptor.h.
Issue:
If I move this implementation to the OpenMP runtime or libomptarget, I won't have access to the flang-rt Descriptor structure there. Is there a way to deal with this?
Probable solution:
Instead of creating a new runtime API, maybe call "omp_target_memcpy" directly by extracting the pointers from the Descriptor structure at the MLIR stage? I need to check whether this is possible.
I see two options:
Thanks for the suggestions @mjklemm
I have tried the second approach, adding Descriptor accessors in flang-rt. Draft PR #152756 is in progress.
I am working on adding an API in the OpenMP runtime to do omp_target_memcpy between two device pointers.
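
For readers following the thread, the direction discussed here (an entry point that never sees a Descriptor and only receives raw base addresses plus sizes) might look roughly like the sketch below. Everything in it is an assumption for illustration: `SketchCopyFn`, `hostCopy`, and `sketchAssignBytes` are hypothetical names, not the API being added in the OpenMP runtime, and the copy routine is passed in as a function pointer so the sketch runs on the host without an OpenMP runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical signature for a same-device byte copy; returns 0 on success.
using SketchCopyFn = int (*)(
    void *dst, const void *src, std::size_t length, int device);

// Host-only stand-in so the sketch runs without an OpenMP runtime.
// An OpenMP-backed version would instead return
//   omp_target_memcpy(dst, src, length, /*dst_offset*/ 0, /*src_offset*/ 0,
//       device, device);
int hostCopy(void *dst, const void *src, std::size_t length, int /*device*/) {
  std::memmove(dst, src, length);
  return 0;
}

// Hypothetical descriptor-free entry point. The caller (for example, code
// generated by the compiler) extracts the base addresses, element count, and
// element size from the descriptors and passes them in, so this function has
// no dependency on the Fortran runtime's data structures.
int sketchAssignBytes(void *toBase, const void *fromBase, std::size_t elements,
    std::size_t elementBytes, int device, SketchCopyFn copy) {
  assert(copy != nullptr);
  if (elements == 0 || elementBytes == 0)
    return 0; // nothing to copy
  if (toBase == nullptr || fromBase == nullptr)
    return -1; // report failure to the caller rather than crashing here
  return copy(toBase, fromBase, elements * elementBytes, device);
}
```

Keeping the copy routine as a parameter mirrors the reviewer's remark that Assign can be given the memmove function to use, and it leaves the choice between a host copy and a device copy to the caller.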