Skip to content

Conversation

@Petter-Programs
Copy link

Add a barrier in cases where reading memory directly from neighbors to fix an issue where the send buffer is overwritten prematurely in coll_han_alltoall.c. Please see the provided reproducer below to understand the issue a bit more. The problem is that alltoall returns before it is safe to touch the send buffer.

Bug discovered and patched at AccelCom, Barcelona Supercomputing Center.

Here is the reproducer:

#include <mpi.h>
#include <assert.h>
#include <stdio.h>

#define MPI_PROCESSES 4

#define MAX_ITERATIONS 100000

#define SPOOKY_NUMBER 9999

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int real_world_size, my_rank;

    MPI_Comm_size(MPI_COMM_WORLD, &real_world_size);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    if(real_world_size != MPI_PROCESSES)
    {
        if(my_rank == 0)
            printf("Expected exactly %d processes.\n", MPI_PROCESSES);
        
            return 1;
    }

    int send_buff[MPI_PROCESSES];
    int recv_buff[MPI_PROCESSES];

    // Fill send_buff with ones, recv_buff with zeroes
    for(int i = 0; i<MPI_PROCESSES; i++)
    {
        send_buff[i] = 1;
        recv_buff[i] = 0;
    }

    for(int it = 0; it<MAX_ITERATIONS; it++)
    {
        MPI_Alltoall(send_buff, 1, MPI_INT, recv_buff, 1, MPI_INT, MPI_COMM_WORLD);

        // Do something else with the send buffer
        for(int i = 0; i< MPI_PROCESSES; i++)
        {
            send_buff[i] = SPOOKY_NUMBER;
        }
        
        // This should not have affected the receive buffer... right?
        for(int i = 0; i< MPI_PROCESSES; i++)
        {
            assert(recv_buff[i] != SPOOKY_NUMBER);
        }

        // Reset the send buffer back for the next iteration
        for(int i = 0; i< MPI_PROCESSES; i++)
        {
            send_buff[i] = 1;
        }

    }

    printf("Rank %d: successfully completed %d iterations!\n", my_rank, MAX_ITERATIONS);

    MPI_Finalize();
}

To run, use this:

mpirun -np 4 host_one:2,host_two:2 ./program

For me, the ssh launcher was used to do this and it automatically selected the coll_han_alltoall module.

Add a barrier in cases where reading memory directly from neighbors
to fix an issue where the send buffer is overwritten prematurely in
coll_han_alltoall.c

Bug discovered and patched at AccelCom, Barcelona Supercomputing Center

Signed-off-by: Petter Sandas <[email protected]>
@lrbison
Copy link
Contributor

lrbison commented Mar 3, 2025

Thank you for finding this issue and submitting the fix. This looks like the right fix to me.

The problem does not exist for han alltoallv, because we have a trailing barrier there: https://github.com/open-mpi/ompi/blob/main/ompi/mca/coll/han/coll_han_alltoallv.c#L956

@lrbison lrbison self-assigned this Mar 3, 2025
@lrbison
Copy link
Contributor

lrbison commented Mar 6, 2025

Tested and confirmed both the issue and the fix. Thanks again for finding this!

@lrbison lrbison merged commit 3f12346 into open-mpi:main Mar 6, 2025
15 checks passed
@lrbison
Copy link
Contributor

lrbison commented Mar 6, 2025

No need for backport, as the algorithm isn't in v5.0.x branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants