mpi_assert_allow_overtaking breaks MPI_Issend #6559

@devreal

Description

I'm using MPI_Issend in one of my code paths to ensure the proper reception of messages before signalling that all outstanding transfers have completed. At some point I came across the communicator info key mpi_assert_allow_overtaking, which is available in Open MPI 4.0.1 (described in §6.4.4 of the current MPI standard draft), and thought I'd give it a try because I really don't care about message ordering in this particular code. Well, premature optimization is the root of all evil... It took me quite a while to figure out that adding that key a while ago actually broke this code path: it causes transfers started with MPI_Issend to never complete.

The code below can be used to reliably trigger this issue:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
  int rank, size;
  int nmsg = 0;
  int provided;
  MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  MPI_Comm comm;
  MPI_Comm_dup(MPI_COMM_WORLD, &comm);

  // signal MPI that we don't care about the order of messages
  MPI_Info info;
  MPI_Info_create(&info);
  // Setting this key causes Issend transfers to never complete
  MPI_Info_set(info, "mpi_assert_allow_overtaking", "true");
  MPI_Comm_set_info(comm, info);
  MPI_Info_free(&info);

  int dest = (rank + 1) % size;
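  // use separate buffers: one buffer must not serve a pending send and receive at once
  int sendval = rank, recvval;
  MPI_Request rreq, sreq;
  MPI_Irecv(&recvval, 1, MPI_INT, MPI_ANY_SOURCE, 1000, comm, &rreq);
  MPI_Issend(&sendval, 1, MPI_INT, dest, 1000, comm, &sreq);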

  int sflag = 0, rflag = 0;
  printf("Starting testing\n");
  do {
    if (!rflag)
      MPI_Test(&rreq, &rflag, MPI_STATUS_IGNORE);
    if (!sflag)
      MPI_Test(&sreq, &sflag, MPI_STATUS_IGNORE);
  } while (!sflag || !rflag);  // keep testing until both requests have completed
  printf("Done with single message!\n");

  MPI_Finalize();
  return 0;
}

The code works if I

  • Change the MPI_Issend to MPI_Isend (which is not correct in my case); or
  • Avoid setting the key mpi_assert_allow_overtaking to true

Otherwise, the code continues testing the send and receive requests without ever completing the message:

$ mpirun -n 2 -N 1 ./test_mpiissend
Starting testing
Starting testing

I'm using Open MPI 4.0.1 (installed from release tarball) and see this problem on both a Cray XC40 and an IB cluster (tested with and without UCX).

I guess this info key is still an experimental feature since it's not yet part of the official standard. My understanding is that it relaxes the order in which messages are matched, but that should not interfere with the way MPI_Issend works, right?

Please let me know if I can provide any other information.
