Seg fault running QE with Environ #15

@jerrypgreenberg

Hi,

I attempted to build QE with Environ for a user on one of our clusters at SDSC (www.sdsc.edu). The cluster ("expanse") has AMD (Rome) processors and runs Rocky 8.

My build environment was gcc/10.2.0, openmpi/4.1.3 and intel-mkl/2020.4.304 for both Environ and QE (which I built with cmake).

I built Environ with configure, then built QE with a reference to the Environ build:

cmake -DCMAKE_INSTALL_PREFIX:STRING=/home/jsun7/qe/qe-7.3.1 \
  -DCMAKE_BUILD_TYPE:STRING=RelWithDebInfo \
  -DCMAKE_INTERPROCEDURAL_OPTIMIZATION:BOOL=OFF \
  -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON \
  -DCMAKE_INSTALL_RPATH_USE_LINK_PATH:BOOL=OFF \
  -DQE_ENABLE_MPI:BOOL=ON \
  -DQE_ENABLE_OPENMP:BOOL=OFF \
  -DQE_ENABLE_SCALAPACK:BOOL=ON \
  -DQE_ENABLE_ELPA:BOOL=OFF \
  -DQE_ENABLE_LIBXC:BOOL=OFF \
  -DCMAKE_C_COMPILER:STRING=/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen2/gcc-10.2.0/openmpi-4.1.3-oq3qvsvt5mywjzy7xzrfeh6eebiujvbm/bin/mpicc \
  -DCMAKE_Fortran_COMPILER:STRING=/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen2/gcc-10.2.0/openmpi-4.1.3-oq3qvsvt5mywjzy7xzrfeh6eebiujvbm/bin/mpif90 \
  -DENVIRON_ROOT=/home/jsun7/qe/qe-7.3.1/Environ ..

For any input, I get the same error:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x15554b5fcb4f in ???
#1 0xa95431 in __class_iontype_MOD_set_iontype_id
at /home/jsun7/qe/qe-7.3.1/Environ/src/iontype.f90:572
#2 0xa94f1c in __class_iontype_MOD_init_environ_iontype
at /home/jsun7/qe/qe-7.3.1/Environ/src/iontype.f90:214
#3 0xa82516 in __class_ions_MOD_init_environ_ions
at /home/jsun7/qe/qe-7.3.1/Environ/src/ions.f90:245
#4 0xa88e58 in __class_environ_MOD_environ_init_physical
at /home/jsun7/qe/qe-7.3.1/Environ/src/main.f90:978
#5 0xa88cf5 in __class_environ_MOD_init_environ_base
at /home/jsun7/qe/qe-7.3.1/Environ/src/main.f90:234
#6 0x6aef1e in environ_base_module_MOD_init_environ_base
at /home/jsun7/qe/qe-7.3.1/Modules/environ_base_module.f90:122
#7 0x4580ad in init_run
at /home/jsun7/qe/qe-7.3.1/PW/src/init_run.f90:137
#8 0x4d48f1 in run_pwscf
at /home/jsun7/qe/qe-7.3.1/PW/src/run_pwscf.f90:160
#9 0x407cab in pwscf
at /home/jsun7/qe/qe-7.3.1/PW/src/pwscf.f90:85
#10 0x4079ec in main
at /home/jsun7/qe/qe-7.3.1/PW/src/pwscf.f90:40

Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.


mpirun noticed that process rank 0 with PID 398074 on node exp-9-56 exited on signal 11 (Segmentation fault).

Thanks,

Jerry
