[OpenMP][Offload] Handle non-memberof present/to/from entries irrespective of order.
For cases like:
```c
map(alloc: x) map(to: x)
```
If the entry for `map(to: x)` is encountered after the entry for
`map(alloc: x)`, we still want to do a data transfer even though the
ref-count of `x` was already non-zero, because the new allocation for `x`
happened as part of the current directive.
Similarly, for:
```c
... map(alloc: x) map(from: x)
```
If the entry for `map(from:x)` is encountered before the entry for
`map(alloc:x)`, we want to do a data-transfer even though the
ref-count was not 0 when looking at the `from` entry, because by the end of
the directive, the ref-count of `x` will go down to zero.
And for:
```c
... map(from : x) map(alloc, present: x)
```
If the "present" entry is encountered after the "from" entry, then the
presence check becomes a no-op, as the "from" entry will already have done an
allocation if no match was found.
In this PR, these are handled by the runtime via the following:
* For `to` and `present`, when making the decision for the entry, we also look
up the existing table in which new allocations are tracked.
* For `from`, we keep track of any deferred data transfers and, when the
ref-count of a pointer goes to zero, check whether there were any previously
deferred `from` transfers for that pointer.
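The two mechanisms above can be illustrated with a small, self-contained model. This is only a sketch under simplifying assumptions, not the libomptarget implementation: every name in it (`VarState`, `entry_begin`, `entry_end`, `run_directive`, the `MAP_*` values) is hypothetical, it tracks a single variable `x`, it processes exit entries in the same order as entry ones, and "transfers" are merely counted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of one mapped variable `x`.
 * None of these names come from libomptarget. */
enum MapType { MAP_ALLOC, MAP_TO, MAP_FROM, MAP_PRESENT_ALLOC };

typedef struct {
  int ref_count;       /* device ref-count of x */
  bool new_alloc;      /* x was allocated by the current directive */
  bool pending_from;   /* a `from` entry was seen on the current directive */
  bool present_failed; /* a `present` check found x newly allocated */
  int h2d_copies;      /* host-to-device transfers performed */
  int d2h_copies;      /* device-to-host transfers performed */
} VarState;

/* Entry side: allocate on first use, then decide `present` and `to`
 * based on whether the allocation is new on this directive. */
static void entry_begin(VarState *s, enum MapType t) {
  if (s->ref_count == 0)
    s->new_alloc = true; /* remember that this directive allocated x */
  /* `present` fails if x is new, even when an earlier entry
   * (e.g. `from`) has already allocated it. */
  if (t == MAP_PRESENT_ALLOC && s->new_alloc)
    s->present_failed = true;
  s->ref_count++;
  /* `to` copies if x is new, even when an earlier entry
   * (e.g. `alloc`) already raised the ref-count above zero. */
  if (t == MAP_TO && s->new_alloc)
    s->h2d_copies++;
}

/* Exit side: record `from` requests and honor them once the ref-count
 * reaches zero, whichever entry causes that. */
static void entry_end(VarState *s, enum MapType t) {
  assert(s->ref_count > 0);
  if (t == MAP_FROM)
    s->pending_from = true; /* defer the transfer */
  if (--s->ref_count == 0 && s->pending_from) {
    s->d2h_copies++;
    s->pending_from = false;
  }
}

/* Process all entries of one directive in the given order. */
static void run_directive(VarState *s, const enum MapType *types, int n) {
  for (int i = 0; i < n; i++)
    entry_begin(s, types[i]);
  for (int i = 0; i < n; i++)
    entry_end(s, types[i]);
  /* Per-directive bookkeeping does not outlive the directive. */
  s->new_alloc = false;
  s->pending_from = false;
}
```

In this model, running `{MAP_ALLOC, MAP_TO}` on a zero-initialized `VarState` performs one host-to-device copy, and `{MAP_FROM, MAP_ALLOC}` performs one device-to-host copy, regardless of which entry comes first.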
This could instead be done in the compiler, which would avoid any runtime
overhead, but it would require creating two separate offload struct entries
for the entry and exit mappings (even for the `target` construct),
with properly decayed maps, and either:
(1) sorted in order of:
* `present > to > ...` for the implied `target enter data`; and
* `from > ...` for the `target exit data`
e.g.
```c
#pragma omp target map(to: x) map(present, alloc: x) map(always, from: x)
// has to be broken into:
// from becomes alloc on entry:
// #pragma omp target enter data map(present, alloc: x)
//                               map(to: x)
//                               map(alloc: x)
//
// "present" and "to" just "decay" into "alloc"
// #pragma omp target exit data map(always, from: x)
//                              map(alloc: x)
//                              map(alloc: x)
```
Or,
(2) Merged into one entry each on the `target enter/exit data`
directives.
```c
#pragma omp target map(to: x) map(present, alloc: x) map(always, from: x)
// has to be broken into:
// from becomes alloc on entry:
// #pragma omp target enter data map(present, to: x)
//
// "present" and "to" just "decay" into "alloc"
// #pragma omp target exit data map(always, from: x)
```
The number of entries would need to stay the same on the two directives to
avoid a ref-count mismatch.
(1) would be simpler, but likely won't work for cases like:
```c
... map(delete: x) map(from:x)
```
as there is no clear "winner" between the two. So, for such cases, the compiler
would likely have to do (2), which is the cleanest solution but will take
longer to implement. For EXPR comparisons, it can build upon the
`AttachPtrExprComparator` that was implemented as part of #153683,
but that should probably wait for the PR to be merged to avoid
conflicts.
Another alternative is to sort the entries in the runtime. In some cases this
may be slower than the on-demand lookups/updates that this PR does, because the
sorting would always happen even when it is not needed. In other cases it may
be faster, namely where the constant-time overhead of map/set
insertions/lookups becomes too large because of the number of maps. But
that approach would still have to deal with the `from` + `delete` case.
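As a rough illustration of the runtime-sorting alternative, the entry-side ordering `present > to > ...` could be expressed as a stable sort over the map entries. This is a minimal sketch, assuming all entries refer to the same variable; the `MapEntry` record, the `F_*` flags, and `sort_for_entry` are hypothetical names and do not correspond to the runtime's actual data structures. It does not address the `from` + `delete` case.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical map-entry flags; illustrative only. */
enum { F_PRESENT = 1u << 0, F_TO = 1u << 1, F_FROM = 1u << 2 };

typedef struct {
  unsigned flags; /* combination of F_* bits */
  int index;      /* original position, used to keep the sort stable */
} MapEntry;

/* Lower rank sorts first: present, then to, then everything else. */
static int entry_rank(unsigned flags) {
  if (flags & F_PRESENT)
    return 0;
  if (flags & F_TO)
    return 1;
  return 2;
}

/* qsort is not guaranteed to be stable, so ties fall back to the
 * original position of each entry. */
static int cmp_for_entry(const void *a, const void *b) {
  const MapEntry *x = a;
  const MapEntry *y = b;
  int rx = entry_rank(x->flags);
  int ry = entry_rank(y->flags);
  if (rx != ry)
    return rx - ry;
  return x->index - y->index;
}

/* Reorder entries for the implied `target enter data` processing. */
static void sort_for_entry(MapEntry *entries, size_t n) {
  for (size_t i = 0; i < n; i++)
    entries[i].index = (int)i;
  qsort(entries, n, sizeof *entries, cmp_for_entry);
}
```

The sort runs on every directive, whether or not the entries were already in a usable order, which is the fixed cost mentioned above.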