Closed
Description
Background information
What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
mpirun --version reports 4.0.0
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
I run Linux Mint 21.3 v6.0.4 and installed Open MPI from the operating system package: apt install libopenmpi-dev
If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.
Please describe the system on which you are running
- Operating system/version: Linux Mint 21.3 v6.0.4
- Computer hardware: Intel Xeon W-2133 CPU
- Network type: n/a
mpic++ --version reports g++ 11.4.0
Details of the problem
The following program segfaults in MPI_Cart_sub when run with 2 processes:
 //segfault.cpp          
                                                                                                                                                       
#include <iostream>
#include <cassert>
#include <mpi.h>                                                           
                 
int main( int argc, char* argv[] )
{                                 
    MPI_Init(&argc, &argv);       
                                  
    int rank, size;        
    MPI_Comm_rank( MPI_COMM_WORLD, &rank);                                
    MPI_Comm_size( MPI_COMM_WORLD, &size);
                                          
    MPI_Comm comm =MPI_COMM_WORLD;        
    int reduce_rank = rank % 2;                                                 
    int color = reduce_rank == 0 ? 1 : MPI_UNDEFINED;                              
    MPI_Comm comm_split;                                                                                           
    MPI_Comm_split( comm, color, 0, &comm_split);                                                                  
                                                                                                                   
    MPI_Comm comm2;                                                                                                
    int dims[2] = {0,0};                                                                                           
    int periods[2] = {1,1};                                                                                        
    assert( MPI_Dims_create( size, 2, dims) == MPI_SUCCESS);                                                       
    assert( MPI_Cart_create( MPI_COMM_WORLD, 2, dims, periods, true , &comm2) == MPI_SUCCESS);                     
    int remains[2] = {1,0};                                                                                        
    MPI_Comm comm2_01;                                                                                             
    assert( comm2 != MPI_COMM_NULL);                                                                               
    assert( MPI_Cart_sub( comm2, remains, &comm2_01) == MPI_SUCCESS);        // Segfault here                                      
                                                                                                                   
    MPI_Finalize();                                                                                                
}

shell$ mpic++ segfault.cpp -o segfault -g
shell$ mpirun -n 2 ./segfault
[titanxp:09308] *** Process received signal ***
[titanxp:09308] Signal: Segmentation fault (11)
[titanxp:09308] Signal code: Address not mapped (1)
[titanxp:09308] Failing at address: 0x8
[titanxp:09308] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7fa3ee5ed520]
[titanxp:09308] [ 1] /usr/local/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv_frag_callback_match+0x995)[0x7fa3cac121d5]
[titanxp:09308] [ 2] /usr/local/lib/openmpi/mca_btl_smcuda.so(mca_btl_smcuda_component_progress+0x324)[0x7fa3d9406af4]
[titanxp:09308] [ 3] /usr/local/lib/libopen-pal.so.40(opal_progress+0x2c)[0x7fa3ede36c2c]
[titanxp:09308] [ 4] /usr/local/lib/libmpi.so.40(ompi_request_default_wait+0x4d)[0x7fa3eea4b95d]
[titanxp:09308] [ 5] /usr/local/lib/libmpi.so.40(ompi_coll_base_sendrecv_actual+0xc1)[0x7fa3eeaa85d1]
[titanxp:09308] [ 6] /usr/local/lib/libmpi.so.40(ompi_coll_base_allgather_intra_two_procs+0x89)[0x7fa3eeaa77c9]
[titanxp:09308] [ 7] /usr/local/lib/libmpi.so.40(ompi_comm_split+0xc5)[0x7fa3eea2eba5]
[titanxp:09308] [ 8] /usr/local/lib/libmpi.so.40(mca_topo_base_cart_sub+0xe4)[0x7fa3eead0054]
[titanxp:09308] [ 9] /usr/local/lib/libmpi.so.40(PMPI_Cart_sub+0xca)[0x7fa3eea6805a]
[titanxp:09308] [10] ./segfault(+0x1469)[0x560e774d9469]
[titanxp:09308] [11] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7fa3ee5d4d90]
[titanxp:09308] [12] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7fa3ee5d4e40]
[titanxp:09308] [13] ./segfault(+0x11e5)[0x560e774d91e5]
[titanxp:09308] *** End of error message ***
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node titanxp exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------