Multi-GPU DP training encounters an error, how to solve it? #974
Unanswered
tonystarkiss asked this question in Q&A
(base) root@bf9e9850e99e:/data/dpmd/deepmd-kit/examples/dpmd_raw/train# horovodrun -np 8 dp train --mpi-log=workers input.json
2021-08-15 12:48:07.708004: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
/bin/bash: /data/deepmd-kit-2.0.0/lib/libtinfo.so.6: no version information available (required by /bin/bash)
[mpiexec@bf9e9850e99e] match_arg (utils/args/args.c:163): unrecognized argument allow-run-as-root
[mpiexec@bf9e9850e99e] HYDU_parse_array (utils/args/args.c:178): argument matching returned error
[mpiexec@bf9e9850e99e] parse_args (ui/mpich/utils.c:1642): error parsing input array
[mpiexec@bf9e9850e99e] HYD_uii_mpx_get_parameters (ui/mpich/utils.c:1694): unable to parse user arguments
[mpiexec@bf9e9850e99e] main (ui/mpich/mpiexec.c:148): error parsing parameters