Open
Labels: bug, question
Description
With the attached files, the LPF experiments do not work:
- with lpf_hook, the peer list in src/MPI/ibverbs.cpp appears to be empty and no communication is scheduled at all (lpf_put effectively becomes a no-op); see the example synthetic4.cpp below
- with lpf_exec, synthetic3.cpp aborts at runtime:
build/_deps/lpf-src/build/bin/lpfrun -engine ibverbs -n 2 build/synthetic3 64 100
LPF Backend Error! lpf_exec( LPF_ROOT, LPF_MAX_P, &spmd, args)[srv01:716807] *** Process received signal ***
[srv01:716807] Signal: Aborted (6)
[srv01:716807] Signal code: (-6)
[srv01:716807] [ 0] linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0x4000304ba7dc]
[srv01:716807] [ 1] /lib/aarch64-linux-gnu/libc.so.6(+0x7f1f0)[0x40003096f1f0]
[srv01:716807] [ 2] /lib/aarch64-linux-gnu/libc.so.6(raise+0x1c)[0x40003092a67c]
[srv01:716807] [ 3] /lib/aarch64-linux-gnu/libc.so.6(abort+0xe4)[0x400030917130]
[srv01:716807] [ 4] build/synthetic3(main+0x130)[0xaaaabdad7ab0]
[srv01:716807] [ 5] /lib/aarch64-linux-gnu/libc.so.6(+0x273fc)[0x4000309173fc]
[srv01:716807] [ 6] /lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0x98)[0x4000309174cc]
[srv01:716807] [ 7] build/synthetic3(_start+0x30)[0xaaaabdad6d30]
[srv01:716807] *** End of error message ***
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node srv01 exited on signal 6 (Aborted).
Compilation as follows:
lpfcxx -engine ibverbs synthetic3.cpp -o synthetic3
Execution as follows:
lpfrun -engine ibverbs synthetic3 64 100
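The log shows only "LPF Backend Error!" and the failing expression before abort() is reached from main, so the actual return code of lpf_exec is not visible. As a rough sketch of how the check could report it, the helper below formats the numeric code and the source location before aborting. All names here (sketch_err_t, kSketchSuccess, describe_failure, CHECK_CALL) are hypothetical and are not part of LPF or of the attached sources; in the real program the integer stand-in would be replaced by the error type that lpf_exec returns.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for an LPF-style error code (assumption: 0 means success).
using sketch_err_t = int;
constexpr sketch_err_t kSketchSuccess = 0;

// Builds a diagnostic for a failed call; returns an empty string on success.
std::string describe_failure(sketch_err_t rc, const char* expr,
                             const char* file, int line) {
    if (rc == kSketchSuccess) return {};
    return std::string("LPF Backend Error! ") + expr +
           " returned " + std::to_string(rc) +
           " at " + file + ":" + std::to_string(line);
}

// Evaluates the call once, prints the diagnostic to stderr, then aborts on failure.
#define CHECK_CALL(expr)                                                   \
    do {                                                                   \
        const std::string msg_ =                                           \
            describe_failure((expr), #expr, __FILE__, __LINE__);           \
        if (!msg_.empty()) {                                               \
            std::fprintf(stderr, "%s\n", msg_.c_str());                    \
            std::abort();                                                  \
        }                                                                  \
    } while (0)
```

Wrapping the failing call as CHECK_CALL(lpf_exec(LPF_ROOT, LPF_MAX_P, &spmd, args)) would then print the code on its own line, which may help tell an out-of-memory condition apart from a fatal backend error.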