Skip to content

Conversation

@masterleinad
Copy link
Collaborator

@masterleinad masterleinad commented Sep 9, 2022

The strategy here is to do a regular query that gets us the MPI ranks, and primitive indices for each query, send them back to the MPI rank owning the primitives, execute the callback, and send them back to the MPI rank owning the queries.

@masterleinad masterleinad force-pushed the distributed_knn_callback branch from 00d1ef2 to c74aa9e Compare September 9, 2022 19:55
@masterleinad
Copy link
Collaborator Author

This should work now replaying the found matches on the remote MPI rank and communicating the results back. Currently, I don't treat the default callback special to validate that this behaves as expected.

@masterleinad
Copy link
Collaborator Author

This should be ready for inital review. As said before, the strategy here is to do a regular query that get's us the MPI ranks, and primitive indices for each query, send them back to the MPI rank owning the primitives, execute the callback, and send them back to the MPI rank owning the queries.

@masterleinad masterleinad marked this pull request as ready for review September 15, 2022 15:07
@masterleinad
Copy link
Collaborator Author

I will still adopt some more tests to use the callback.

@masterleinad
Copy link
Collaborator Author

@aprokop ping

@aprokop aprokop added the enhancement New feature or request label Sep 22, 2022
{
auto data = ArborX::getData(predicate);
auto const &point = points(primitive_index);
printf("Intersection for query %d from MPI rank %d on MPI rank %d for "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to guard it with SYCL?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like our saying "intersection". Commenting so we can get back to it.

Comment on lines 30 to 31
using PairIndexRank = Kokkos::pair<int, int>;
using PairRankIndex = Kokkos::pair<int, int>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely not. Aliasing Kokkos::pair was (probably me) already being lazy, this does not protect us against bugs and is awfully confusing.
Any good reason to have both PairIndexRank and PairRankIndex?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said before, I need to work more on the tests. Here, I wanted the reverse order to make sure that the new test really executes the callback version properly. I would have gotten the same result ignoring the callback otherwise.

@masterleinad masterleinad force-pushed the distributed_knn_callback branch from c47b028 to 9402639 Compare October 11, 2022 17:37
@masterleinad masterleinad force-pushed the distributed_knn_callback branch from 1375f7b to 3f66b82 Compare October 12, 2022 17:02
Copy link
Contributor

@aprokop aprokop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I think the PR is in the right direction. So far, I've only looked inside src/. I think we need to fix few things in the proposed implementation (most of my comments are superfluous and can be ignored for now):

  • We need to support pure callbacks
    They would use mostly the same code, except the API would not have offset and output views, and they won't need the extra round of backward communication
  • For non-pure callbacks, we should allow a callback to output variable number of elements, which would necessitate some version of two-pass code.

Comment on lines 59 to 77
struct ArborX::GeometryTraits::tag<ArborX::ExperimentalHyperGeometry::Sphere<DIM>>
struct ArborX::GeometryTraits::tag<
ArborX::ExperimentalHyperGeometry::Sphere<DIM>>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an intentional change? I think it should be reverted.

Comment on lines 759 to 776
// Execute the callback on the process owning the primitives.
OutputView remote_out(
Kokkos::view_alloc(space, Kokkos::WithoutInitializing,
"ArborX::DistributedTree::query::remote_out"),
n_imports);
KokkosExt::reallocWithoutInitializing(space, indices, n_imports);
Kokkos::parallel_for(
"ArborX::DistributedTree::query::execute_callbacks",
Kokkos::RangePolicy<ExecutionSpace>(space, 0,
imported_queries_with_indices.size()),
KOKKOS_LAMBDA(int i) {
callback(imported_queries_with_indices(i).query,
imported_queries_with_indices(i).primitive_index,
[&](typename OutputView::value_type const &value) {
remote_out(i) = value;
indices(i) = imported_queries_with_indices(i).query_id;
});
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems wrong for a few reasons.

First, it assumes that callback will output a single element for each match. In practice, we don't know how many elements the callback will spit out per query. It could be 0, or it could be more than 1. So we need to do a two pass here, similar to the CrsGraphWrapperImpl. I'm not sure what is the easiest way to do it without massive code duplication.

Second, it assumes inline callback and not a post one. We could try to support post callback, but its meaning would have to be changed as we cannot execute it on all of the results coming from different ranks.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also just assume that the callback only returns a single element to get started, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could. The thing is, this condition is undetectable, so the failure for a user would likely be an undefined behavior or a segfault. Which in the distributed setting would be tricky to debug.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an assertion.

Comment on lines +787 to +805
typename DeviceType::memory_space memory_space;
Kokkos::View<int *, DeviceType> destinations(
Kokkos::view_alloc(space, Kokkos::WithoutInitializing,
"ArborX::DistributedTree::query::destinations"),
dest.size());
Kokkos::deep_copy(space, destinations, host_destinations);
Kokkos::View<int *, DeviceType> offsets(
Kokkos::view_alloc(space, Kokkos::WithoutInitializing,
"ArborX::DistributedTree::query::offsets"),
off.size());
Kokkos::deep_copy(space, offsets, host_offsets);
auto const n_imports_back =
back_distributor.createFromSends(space, destinations, offsets);
KokkosExt::reallocWithoutInitializing(space, out, n_imports_back);
Kokkos::View<int *, DeviceType> query_ids(
Kokkos::view_alloc(space, Kokkos::WithoutInitializing,
"ArborX::DistributedTree::query::nearest::query_ids"),
n_imports_back);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be simplified by using UnmanagedView from the distributor data pointers, and doing an immediate deep_copoy.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that what I'm doing already?

@masterleinad
Copy link
Collaborator Author

We need to support pure callbacks
They would use mostly the same code, except the API would not have offset and output views, and they won't need the extra round of backward communication

That's the same as #733, right?

@aprokop
Copy link
Contributor

aprokop commented Dec 16, 2022

That's the same as #733, right?

No, that was for spatial queries. I need to think a bit about that one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants