Description
I'm using MPI_Issend in one of my code paths to ensure the proper reception of messages before signalling that all outstanding transfers have completed. At some point I came across the communicator info key mpi_assert_allow_overtaking that is available in Open MPI 4.0.1 (described in §6.4.4 of the current MPI standard draft) and thought I'd give it a try because I really don't care about message ordering in this particular code. Well, premature optimizations are the root of all evil... It took me quite a while to figure out that having added that key a while ago actually broke this code path because it causes transfers started with MPI_Issend to never complete.
The code below reliably triggers the issue:

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        int nmsg = 0;
        int provided;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        MPI_Comm comm;
        MPI_Comm_dup(MPI_COMM_WORLD, &comm);

        // signal MPI that we don't care about the order of messages
        MPI_Info info;
        MPI_Info_create(&info);
        // Setting this key causes Issend transfers to never complete
        MPI_Info_set(info, "mpi_assert_allow_overtaking", "true");
        MPI_Comm_set_info(comm, info);
        MPI_Info_free(&info);

        int dest = (rank + 1) % size;
        int val;
        MPI_Request rreq, sreq;
        MPI_Irecv(&val, 1, MPI_INT, MPI_ANY_SOURCE, 1000, comm, &rreq);
        MPI_Issend(&val, 1, MPI_INT, dest, 1000, comm, &sreq);

        int sflag = 0, rflag = 0;
        printf("Starting testing\n");
        do {
            if (!rflag)
                MPI_Test(&rreq, &rflag, MPI_STATUS_IGNORE);
            if (!sflag)
                MPI_Test(&sreq, &sflag, MPI_STATUS_IGNORE);
        } while (!sflag && !rflag);
        printf("Done with single message!\n");

        MPI_Finalize();
        return 0;
    }

The code works if I
- Change MPI_Issend to MPI_Isend (which is not correct in my case); or
- Avoid setting the key mpi_assert_allow_overtaking to true
Otherwise, the code continues testing the send and receive requests without ever completing the message:

    $ mpirun -n 2 -N 1 ./test_mpiissend
    Starting testing
    Starting testing

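For anyone reproducing this in a batch job, a polling loop that gives up after a fixed number of iterations may be handier than one that spins forever. Below is a minimal sketch of that idea, not the code I actually ran. It assumes a hypothetical `test_fn` callback that would wrap `MPI_Test` on a single request, and it waits until both requests have completed. The names `poll_both`, `fake_req` and `fake_test` are purely illustrative and are not part of MPI or Open MPI. The stand-in request type exists only so the sketch compiles without an MPI installation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns nonzero once the request has completed.
 * In the reproducer it would wrap MPI_Test on one request. */
typedef int (*test_fn)(void *request);

/* Poll both requests until BOTH complete or max_polls iterations elapse.
 * Returns 0 when both completed, otherwise a bitmask of what is still
 * pending: bit 0 = receive, bit 1 = send. */
int poll_both(test_fn test, void *rreq, void *sreq, long max_polls)
{
    assert(test != NULL && max_polls > 0);
    int rflag = 0, sflag = 0;
    for (long i = 0; i < max_polls && !(rflag && sflag); ++i) {
        if (!rflag)
            rflag = test(rreq);
        if (!sflag)
            sflag = test(sreq);
    }
    return (rflag ? 0 : 1) | (sflag ? 0 : 2);
}

/* Stand-in for a request (assumption, for illustration only): completes
 * after `remaining` unsuccessful polls; a negative value never completes. */
struct fake_req {
    long remaining;
};

int fake_test(void *p)
{
    struct fake_req *r = p;
    if (r->remaining < 0)
        return 0;               /* models the send that never completes */
    if (r->remaining == 0)
        return 1;
    r->remaining--;
    return 0;
}
```

With a receive that completes and a send that never does, `poll_both(fake_test, &r, &s, 1000)` returns 2 (send still pending), which a reproducer could print before calling `MPI_Abort` instead of hanging.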
I'm using Open MPI 4.0.1 (installed from release tarball) and see this problem on both a Cray XC40 and an IB cluster (tested with and without UCX).
I guess this info key is still an experimental feature since it's not yet part of the official standard. My understanding of this info key is that it changes the ordering in which messages are matched but that should not interfere with the way MPI_Issend works, right?
Please let me know if I can provide any other information.