Merged
6 changes: 2 additions & 4 deletions flang-rt/lib/cuda/memory.cpp
@@ -110,14 +110,12 @@ void RTDECL(CUFDataTransferDescDesc)(Descriptor *dstDesc, Descriptor *srcDesc,
     dstDesc->ApplyMold(*srcDesc, dstDesc->rank());
     dstDesc->Allocate(/*asyncObject=*/nullptr);
   }
-  if ((srcDesc->rank() > 0) && (dstDesc->Elements() < srcDesc->Elements())) {
+  if ((srcDesc->rank() > 0) && (dstDesc->Elements() <= srcDesc->Elements()) &&
+      srcDesc->IsContiguous() && dstDesc->IsContiguous()) {
     // Special case when rhs is bigger than lhs and both are contiguous arrays.
     // In this case we do a simple ptr to ptr transfer with the size of lhs.
     // This is be allowed in the reference compiler and it avoids error
     // triggered in the Assign runtime function used for the main case below.
-    if (!srcDesc->IsContiguous() || !dstDesc->IsContiguous())
-      terminator.Crash("Unsupported data transfer: mismatching element counts "
-                       "with non-contiguous arrays");
Contributor

Is my understanding correct that both the new and old versions achieve the same goal of only allowing contiguous arrays for the data transfer, but the old version gives an error message about an unsupported data transfer?

Contributor

Ah! I see there is a difference between <= and <. Didn't notice it.

Contributor Author

Yeah, and the new version falls back on the runtime Assign function instead of giving an error.

     RTNAME(CUFDataTransferPtrPtr)(dstDesc->raw().base_addr,
         srcDesc->raw().base_addr, dstDesc->Elements() * dstDesc->ElementBytes(),
         mode, sourceFile, sourceLine);
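
To make the behavior change discussed in the thread concrete, here is a minimal standalone sketch of the path selection before and after this change. It is not the flang-rt code: `DescInfo`, `Path`, `oldPath`, and `newPath` are hypothetical names introduced for illustration, and the sketch assumes that the only inputs to the decision are the rank, element count, and contiguity read by the condition in the diff.

```cpp
#include <cstddef>

// Hypothetical stand-in for the descriptor properties the condition reads.
struct DescInfo {
  int rank;
  std::size_t elements;
  bool contiguous;
};

// Which branch the transfer would take (names are illustrative only).
enum class Path { RawCopy, Assign, Crash };

// Old logic: strict '<' picks the special case, then crashes on
// non-contiguous operands.
Path oldPath(const DescInfo &dst, const DescInfo &src) {
  if (src.rank > 0 && dst.elements < src.elements) {
    if (!src.contiguous || !dst.contiguous)
      return Path::Crash;
    return Path::RawCopy;
  }
  return Path::Assign;
}

// New logic: '<=' plus contiguity are all part of the condition, so anything
// that does not qualify falls through to the Assign-based main case.
Path newPath(const DescInfo &dst, const DescInfo &src) {
  if (src.rank > 0 && dst.elements <= src.elements && src.contiguous &&
      dst.contiguous)
    return Path::RawCopy;
  return Path::Assign;
}
```

Under these assumptions the two versions differ in exactly two situations: contiguous arrays with equal element counts now take the raw pointer copy instead of Assign, and a smaller destination with a non-contiguous operand now reaches Assign instead of crashing.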