gmsh: patch for read_parallel and write_parallel with CUDA backend  #405

@cwsmith

@tristan0x

We were merging the snl repo into the scorec fork when some code in Omega_h_gmsh.cpp caught my attention. In a few places, device arrays appear to be accessed from host code. For example, vert_globals_w (a Write<GO>) is indexed inside a host loop in this block:

Write<GO> vert_globals_w(nnodes);
for (LO local_index = 0; local_index < nnodes; ++local_index) {
  const auto global_index =
      static_cast<GO>(node_tags[static_cast<std::size_t>(local_index)]);
  node_number_map[global_index] = local_index;
  vert_globals_w[local_index] = global_index;
}
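
One common way to avoid indexing a device array from the host is to fill a host buffer first and copy it to the device in a single step. The sketch below illustrates that pattern only; it is not the fix on the branch. `DeviceArray`, `copy_to_device`, `NodeNumbering`, and `number_nodes` are hypothetical stand-ins so the example compiles without CUDA or Omega_h. In Omega_h itself the staging buffer would presumably be a `HostWrite<GO>` converted with its `write()` method, but that mapping is an assumption here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <map>
#include <vector>

using LO = std::int32_t;  // local ordinal, mirroring the Omega_h typedef
using GO = std::int64_t;  // global ordinal, mirroring the Omega_h typedef

// Hypothetical stand-in for a device-resident array. In a CUDA build the
// storage would live in device memory and must not be indexed on the host.
struct DeviceArray {
  std::vector<GO> storage;
};

// Hypothetical stand-in for a single bulk host-to-device copy.
DeviceArray copy_to_device(const std::vector<GO>& host) {
  return DeviceArray{host};
}

struct NodeNumbering {
  std::map<GO, LO> node_number_map;  // global tag -> local index (host only)
  DeviceArray vert_globals;          // local index -> global tag (device)
};

NodeNumbering number_nodes(const std::vector<std::size_t>& node_tags) {
  assert(node_tags.size() <=
         static_cast<std::size_t>(std::numeric_limits<LO>::max()));
  const LO nnodes = static_cast<LO>(node_tags.size());

  // Stage the values in host memory; the loop below touches only host data.
  std::vector<GO> vert_globals_h(node_tags.size());
  NodeNumbering out;
  for (LO local_index = 0; local_index < nnodes; ++local_index) {
    const auto i = static_cast<std::size_t>(local_index);
    const auto global_index = static_cast<GO>(node_tags[i]);
    out.node_number_map[global_index] = local_index;
    vert_globals_h[i] = global_index;
  }

  // One bulk transfer replaces the per-element writes to device memory.
  out.vert_globals = copy_to_device(vert_globals_h);
  return out;
}
```

The loop body is unchanged from the block above except that the write target is the host staging vector rather than the device array.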

The branch main...SCOREC:omega_h:cws/gmshFix has a partial fix, but it fails in the serial-versus-parallel mesh comparison check here:

OMEGA_H_CHECK(light_compare_meshes(mesh, pmesh) == OMEGA_H_SAME);

Details on the build and failure are below.

Any help would be appreciated.


GCC version: 7.4.0
CUDA version: 11.4
MPICH version: 3.3.1

Gmsh version: 4.11.0 (tag=gmsh_4_11_0) from https://gitlab.onelab.info/gmsh/gmsh.git
Gmsh cmake build command:

d=buildGmsh
cmake -S gmsh -B $d \
  -DCMAKE_INSTALL_PREFIX=$d/install \
  -DENABLE_BUILD_DYNAMIC=on \
  -DBUILD_SHARED_LIBS=ON 
cmake --build $d --target install -j8

Omegah cmake build command:

d=buildOmegah
cmake -S omega_h -B $d \
  -DGmsh_INCLUDE_DIRS=$PWD/buildGmsh/install/include \
  -DGmsh_LIBRARIES=$PWD/buildGmsh/install/lib64/libgmsh.so \
  -DGmsh_VERSION_STRING=4.11.0 \
  -DCMAKE_INSTALL_PREFIX=$d/install \
  -DBUILD_TESTING=on  \
  -DOmega_h_USE_CUDA=on \
  -DOmega_h_CUDA_ARCH=75 \
  -DOmega_h_USE_MPI=on  \
  -DOmega_h_USE_Gmsh=on  \
  -DBUILD_SHARED_LIBS=ON
cmake --build $d --target install -j8

GDB stack trace

Note: I'm not sure why, but running gmsh --version on the command line prints Error : Unknown string option 'General.WebBrowser'. The same message appears when using the API (see below). I didn't find anything referencing it on the Gmsh GitLab issues page. It appears to be non-fatal, as I was able to load the gmsh file, create the Omega_h mesh, and write VTK files before the failing mesh comparison.

gdb ./src/unit_io
─── Output/messages ───
Error   : Unknown string option 'General.WebBrowser'

Thread 1 "unit_io" received signal SIGSEGV, Segmentation fault.
0x00007ffff3790220 in __memcmp_sse4_1 () from /lib64/libc.so.6
─── Source ───
─── Stack ───
[0] from 0x00007ffff3790220 in __memcmp_sse4_1
[1] from 0x00000000004201a7 in std::__equal<true>::equal<long>(long const*, long const*, long const*)+80 at /opt/scorec/spack/dev/install/linux-rhel7-x86_64/gcc-6.5.0/gcc-7.4.0-c5aaloybb5jqiolgakbf5sxir4axkly4/include/c++/7.4.0/bits/stl_algobase.h:814
[2] from 0x000000000041ee73 in std::__equal_aux<long const*, long const*>(long const*, long const*, long const*)+47 at /opt/scorec/spack/dev/install/linux-rhel7-x86_64/gcc-6.5.0/gcc-7.4.0-c5aaloybb5jqiolgakbf5sxir4axkly4/include/c++/7.4.0/bits/stl_algobase.h:831
[3] from 0x000000000041de87 in std::equal<long const*, long const*>(long const*, long const*, long const*)+79 at /opt/scorec/spack/dev/install/linux-rhel7-x86_64/gcc-6.5.0/gcc-7.4.0-c5aaloybb5jqiolgakbf5sxir4axkly4/include/c++/7.4.0/bits/stl_algobase.h:1051
[4] from 0x000000000041a0fd in light_compare_meshes(Omega_h::Mesh&, Omega_h::Mesh&)+978 at /space/cwsmith/omegahGmsh/omega_h/src/unit_io.cpp:1271
[5] from 0x000000000041ae46 in test_gmsh_parallel(Omega_h::Library*)+675 at /space/cwsmith/omegahGmsh/omega_h/src/unit_io.cpp:1358
[6] from 0x000000000041cb0d in main(int, char**)+317 at /space/cwsmith/omegahGmsh/omega_h/src/unit_io.cpp:1485
───
>>> where
#0  0x00007ffff3790220 in __memcmp_sse4_1 () from /lib64/libc.so.6
#1  0x00000000004201a7 in std::__equal<true>::equal<long> (__first1=0x7fffc4410a00, __last1=0x7fffc4413708, __first2=0x7fffc4424200) at /opt/scorec/spack/dev/install/linux-rhel7-x86_64/gcc-6.5.0/gcc-7.4.0-c5aaloybb5jqiolgakbf5sxir4axkly4/include/c++/7.4.0/bits/stl_algobase.h:814
#2  0x000000000041ee73 in std::__equal_aux<long const*, long const*> (__first1=0x7fffc4410a00, __last1=0x7fffc4413708, __first2=0x7fffc4424200) at /opt/scorec/spack/dev/install/linux-rhel7-x86_64/gcc-6.5.0/gcc-7.4.0-c5aaloybb5jqiolgakbf5sxir4axkly4/include/c++/7.4.0/bits/stl_algobase.h:831
#3  0x000000000041de87 in std::equal<long const*, long const*> (__first1=0x7fffc4410a00, __last1=0x7fffc4413708, __first2=0x7fffc4424200) at /opt/scorec/spack/dev/install/linux-rhel7-x86_64/gcc-6.5.0/gcc-7.4.0-c5aaloybb5jqiolgakbf5sxir4axkly4/include/c++/7.4.0/bits/stl_algobase.h:1051
#4  0x000000000041a0fd in light_compare_meshes (a=..., b=...) at /space/cwsmith/omegahGmsh/omega_h/src/unit_io.cpp:1271
#5  0x000000000041ae46 in test_gmsh_parallel (lib=0x7fffffffac30) at /space/cwsmith/omegahGmsh/omega_h/src/unit_io.cpp:1358
#6  0x000000000041cb0d in main (argc=1, argv=0x7fffffffade8) at /space/cwsmith/omegahGmsh/omega_h/src/unit_io.cpp:1485
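
Frames #1 through #4 show std::equal being called on raw long const* pointers directly from light_compare_meshes. If those pointers refer to CUDA device memory, a host-side memcmp on them would fault in the way shown above; that reading is an inference from the trace and has not been confirmed. A minimal sketch of the usual remedy, copying both arrays to the host before comparing, follows. `DeviceBuffer`, `copy_to_host`, `same_globals`, and `require_same_globals` are hypothetical stand-ins so the example compiles without CUDA or Omega_h; in Omega_h the host copy would presumably be a `HostRead<GO>`, which is an assumption here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using GO = std::int64_t;  // global ordinal, mirroring the Omega_h typedef

// Hypothetical stand-in for a device-resident array of global ids.
struct DeviceBuffer {
  std::vector<GO> storage;
};

// Hypothetical stand-in for a bulk device-to-host copy.
std::vector<GO> copy_to_host(const DeviceBuffer& d) {
  return d.storage;
}

// Compare two device arrays by first bringing both to the host, so that
// std::equal only ever dereferences host pointers.
bool same_globals(const DeviceBuffer& a, const DeviceBuffer& b) {
  const std::vector<GO> a_h = copy_to_host(a);
  const std::vector<GO> b_h = copy_to_host(b);
  return a_h.size() == b_h.size() &&
         std::equal(a_h.begin(), a_h.end(), b_h.begin());
}

// Plays the role of the OMEGA_H_CHECK around light_compare_meshes in the test.
void require_same_globals(const DeviceBuffer& a, const DeviceBuffer& b) {
  assert(same_globals(a, b));
}
```

The explicit size check matters because the three-iterator form of std::equal, the one visible in the trace, assumes the second range is at least as long as the first.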
