Issue: Bug/Performance Issue [Replication] #115

@JohnsonQi

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
  • Python version: 3.7
  • Installed using pip or ROS: pip
  • GPU model (if applicable): GeForce 1080 Ti
  • CPU: 24G

@visatish

Describe the result you are trying to replicate

04-21 18:00:03 GQCNNTrainerTF INFO Step 51878 (epoch 1.234), 0.01 s
04-21 18:00:03 GQCNNTrainerTF INFO Minibatch loss: 0.379, learning rate: 0.0095
04-21 18:00:03 GQCNNTrainerTF INFO Minibatch error: 8.594
04-21 18:00:03 GQCNNTrainerTF INFO Step took 0.556 sec.
04-21 18:00:03 GQCNNTrainerTF INFO Max 0.7454133
04-21 18:00:03 GQCNNTrainerTF INFO Min 0.0010097245
04-21 18:00:03 GQCNNTrainerTF INFO Pred nonzero 25
04-21 18:00:03 GQCNNTrainerTF INFO True nonzero 31
04-21 18:00:03 GQCNNTrainerTF INFO Step 51879 (epoch 1.234), 0.01 s
04-21 18:00:03 GQCNNTrainerTF INFO Minibatch loss: 0.509, learning rate: 0.0095
04-21 18:00:03 GQCNNTrainerTF INFO Minibatch error: 28.125
04-21 18:00:03 GQCNNTrainerTF INFO Step took 0.054 sec.
04-21 18:00:03 GQCNNTrainerTF INFO Max 0.28531963
04-21 18:00:03 GQCNNTrainerTF INFO Min 1.0366358e-05
04-21 18:00:03 GQCNNTrainerTF INFO Pred nonzero 0
04-21 18:00:03 GQCNNTrainerTF INFO True nonzero 0
04-21 18:00:03 GQCNNTrainerTF INFO Step 51880 (epoch 1.234), 0.0 s
04-21 18:00:03 GQCNNTrainerTF INFO Minibatch loss: 0.179, learning rate: 0.0095
04-21 18:00:03 GQCNNTrainerTF INFO Minibatch error: 0.0
04-21 18:00:04 GQCNNTrainerTF INFO Step took 0.101 sec.
04-21 18:00:04 GQCNNTrainerTF INFO Max 0.6219147
04-21 18:00:04 GQCNNTrainerTF INFO Min 2.1433334e-05
04-21 18:00:04 GQCNNTrainerTF INFO Pred nonzero 18
04-21 18:00:04 GQCNNTrainerTF INFO True nonzero 42
04-21 18:00:04 GQCNNTrainerTF INFO Step 51881 (epoch 1.234), 0.0 s
04-21 18:00:04 GQCNNTrainerTF INFO Minibatch loss: 0.503, learning rate: 0.0095
04-21 18:00:04 GQCNNTrainerTF INFO Minibatch error: 31.25
04-21 18:29:39 GQCNNTrainerTF INFO Cleaning and preparing to exit optimization...
04-21 18:29:41 GQCNNTrainerTF INFO Terminating prefetch queue workers...
04-21 18:29:49 GQCNNTrainerTF INFO Flushing prefetch queue...

Describe the unexpected behavior
As the log shows, training shut down unexpectedly: the last step is logged at 18:00:04, and the next message, at 18:29:39, is the trainer cleaning up to exit. I have tried three times and it happens every time. How can I fix this? Maybe the CPU memory is overflowing, but I think 24G should be enough.
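
To show where the run stalls, here is a minimal sketch that scans the trainer log for the longest pause between consecutive messages. It assumes every line starts with an `MM-DD HH:MM:SS` timestamp, as in the excerpt above. The function name `largest_gap` and the `sample` list are made up for illustration and are not part of gqcnn.

```python
from datetime import datetime


def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the longest pause
    between consecutive timestamped log lines, or None if fewer than two
    lines could be parsed."""
    stamped = []
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            # The log has no year; 2000 is an arbitrary leap year so that
            # a 02-29 timestamp would still parse.
            ts = datetime.strptime(
                "2000-" + parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S"
            )
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        stamped.append((ts, line.rstrip()))

    best = None
    for (t0, before), (t1, after) in zip(stamped, stamped[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, before, after)
    return best


# A few lines copied from the log above.
sample = [
    "04-21 18:00:04 GQCNNTrainerTF INFO Step 51881 (epoch 1.234), 0.0 s",
    "04-21 18:00:04 GQCNNTrainerTF INFO Minibatch loss: 0.503, learning rate: 0.0095",
    "04-21 18:00:04 GQCNNTrainerTF INFO Minibatch error: 31.25",
    "04-21 18:29:39 GQCNNTrainerTF INFO Cleaning and preparing to exit optimization...",
    "04-21 18:29:41 GQCNNTrainerTF INFO Terminating prefetch queue workers...",
]

gap, before, after = largest_gap(sample)
print(f"{gap:.0f} s of silence between:\n  {before}\n  {after}")
```

On these lines it reports a pause of about 29.5 minutes between the last "Minibatch error" message and "Cleaning and preparing to exit optimization...". This only locates the stall. It does not say what caused it.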