I've worked with

2022-07-29 12:12:03.885056: F external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:460] ptxas returned an error during compilation of ptx to sass: 'INTERNAL: ptxas exited with non-zero error code 65280, output: ptxas fatal : Could not open output file '/var/tmp/pbs.313487/tmpxft_000021e4_0000000a'
' If the error message indicates that a file could not be written, please verify that sufficient filesystem space is provided.
Fatal Python error: Aborted
2022-07-29 12:13:50.890606: F external/org_tensorflow/tensorflow/compiler/xla/pjrt/distributed/client.h:71] Terminating process because the coordinator detected missing heartbeats. This most likely indicates that another task died; see the other task logs for more details. Status: ABORTED: Shutting down due to missed heartbeat from task 1
Fatal Python error: Aborted

Does anyone know what could be causing the error? Is there any way I can resolve it, or is it just a permissions issue?
Replies: 1 comment 3 replies
The first error, related to `ptxas`, is the one you need to look at. The second error is just a consequence of the first one: since one of your processes died with the `ptxas` error, the others will eventually notice and shut themselves down as well.

For the `ptxas` error, is the guess in the error message correct? Do you have enough space on that filesystem?
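
If it helps, here is a rough sketch of how you could check both the free-space and the permissions question from Python, run on the same node (and ideally inside the same job) as the failing process. `check_scratch_dir` is a hypothetical helper name, not something from JAX or TensorFlow, and the path is simply the one copied from your error message; it is only a guess that this directory is where the compiler's temporary files need to go on your system.

```python
import os
import shutil
import tempfile


def check_scratch_dir(path):
    """Report whether `path` exists, is writable, and how much space is free.

    Hypothetical helper for illustration; not part of JAX or TensorFlow.
    """
    report = {
        "path": path,
        "exists": os.path.isdir(path),
        "writable": False,
        "free_bytes": None,
    }
    if not report["exists"]:
        return report

    # Free space on the filesystem that holds `path`.
    report["free_bytes"] = shutil.disk_usage(path).free

    # Actually create (and automatically delete) a file, since permission
    # bits alone do not catch quotas or a full filesystem.
    try:
        with tempfile.NamedTemporaryFile(dir=path):
            report["writable"] = True
    except OSError:
        pass
    return report


if __name__ == "__main__":
    # First path is taken from the error message above (assumption: it is
    # the job's scratch directory); the second is Python's own default.
    for d in ("/var/tmp/pbs.313487", tempfile.gettempdir()):
        print(check_scratch_dir(d))
```

If the directory is reported as missing or not writable, that would point to a setup or permissions problem rather than a lack of space; if it is writable but `free_bytes` is very small, the guess in the error message is probably right.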