Disappeared training epoch and loss output #2275
Unanswered. Asked by AnatoleWang in Q&A. Replies: 0 comments.
Normally, when C3D is trained, the log looks like this:
2023-02-09 18:18:30,215 - mmaction - INFO - workflow: [('train', 1)], max: 45 epochs
2023-02-09 18:19:20,858 - mmaction - INFO - Epoch [1][20/20] lr: 1.000e-03, eta: 0:37:08, time: 2.532, data_time: 1.908, memory: 5750, top1_acc: 0.6038, top5_acc: 0.7679, loss_cls: 1.8912, loss: 1.8912, grad_norm: 32.0169
2023-02-09 18:20:05,393 - mmaction - INFO - Epoch [2][20/20] lr: 1.000e-03, eta: 0:33:53, time: 2.198, data_time: 1.860, memory: 5750, top1_acc: 0.8935, top5_acc: 1.0000, loss_cls: 0.2705, loss: 0.2705, grad_norm: 19.0957
2023-02-09 18:20:49,067 - mmaction - INFO - Epoch [3][20/20] lr: 1.000e-03, eta: 0:32:06, time: 2.153, data_time: 1.819, memory: 5750, top1_acc: 0.9773, top5_acc: 1.0000, loss_cls: 0.0732, loss: 0.0732, grad_norm: 8.1325
2023-02-09 18:21:31,662 - mmaction - INFO - Epoch [4][20/20] lr: 1.000e-03, eta: 0:30:41, time: 2.099, data_time: 1.766, memory: 5750, top1_acc: 0.9616, top5_acc: 1.0000, loss_cls: 0.1309, loss: 0.1309, grad_norm: 10.0417
2023-02-09 18:22:14,109 - mmaction - INFO - Epoch [5][20/20] lr: 1.000e-03, eta: 0:29:31, time: 2.093, data_time: 1.761, memory: 5750, top1_acc: 0.9913, top5_acc: 1.0000, loss_cls: 0.0296, loss: 0.0296, grad_norm: 3.8235
2023-02-09 18:22:14,723 - mmaction - INFO - Saving checkpoint at 5 epochs
2023-02-09 18:22:49,564 - mmaction - INFO - Evaluating top_k_accuracy ...
2023-02-09 18:22:49,573 - mmaction - INFO -
top1_acc 0.9469
top5_acc 1.0000
2023-02-09 18:22:49,574 - mmaction - INFO - Evaluating mean_class_accuracy ...
2023-02-09 18:22:49,577 - mmaction - INFO -
mean_acc 0.9517
2023-02-09 18:22:52,115 - mmaction - INFO - Now best checkpoint is saved as best_top1_acc_epoch_5.pth.
2023-02-09 18:22:52,116 - mmaction - INFO - Best top1_acc is 0.9469 at 5 epoch.
2023-02-09 18:22:52,117 - mmaction - INFO - Epoch(val) [5][8] top1_acc: 0.9469, top5_acc: 1.0000, mean_class_accuracy: 0.9517
However, after I changed some of the model's parameters, the log no longer contains the per-epoch training lines (Epoch [n][i/N] with lr, loss, etc.); only the checkpoint and evaluation messages remain:
2023-03-07 20:52:57,611 - mmaction - INFO - workflow: [('train', 1)], max: 45 epochs
2023-03-07 20:55:50,501 - mmaction - INFO - Saving checkpoint at 5 epochs
2023-03-07 20:56:08,759 - mmaction - INFO - Evaluating top_k_accuracy ...
2023-03-07 20:56:08,761 - mmaction - INFO -
top1_acc 0.7361
top5_acc 0.9583
2023-03-07 20:56:08,762 - mmaction - INFO - Evaluating mean_class_accuracy ...
2023-03-07 20:56:08,764 - mmaction - INFO -
mean_acc 0.7286
2023-03-07 20:56:11,720 - mmaction - INFO - Now best checkpoint is saved as best_top1_acc_epoch_5.pth.
2023-03-07 20:56:11,722 - mmaction - INFO - Best top1_acc is 0.7361 at 5 epoch.
2023-03-07 20:56:11,722 - mmaction - INFO - Epoch(val) [5][3] top1_acc: 0.7361, top5_acc: 0.9583, mean_class_accuracy: 0.7286
When I set workflow = [('train', 1), ('val', 1)] in c3d_sports1m_16x1x1_45e_ucf101_rgb.py and run the following command:
python tools/train.py configs/recognition/c3d/c3d_sports1m_16x1x1_45e_ucf101_rgb.py --validate --gpus 1 --seed 0 --deterministic --cfg-options load_from=checkpoints/c3d_sports1m_pretrain_20201016-dcc47ddc.pth
the loss shows up again, but only in the Epoch(val) lines; the training lines are still missing:
2023-03-08 16:06:12,981 - mmaction - INFO - workflow: [('train', 1), ('val', 1)], max: 45 epochs
2023-03-08 16:07:01,192 - mmaction - INFO - Epoch(val) [1][3] top1_acc: 0.1528, top5_acc: 0.4861, loss_cls: 4.1278, loss: 4.1278
2023-03-08 16:07:47,966 - mmaction - INFO - Epoch(val) [2][3] top1_acc: 0.4583, top5_acc: 0.8750, loss_cls: 3.0645, loss: 3.0645
2023-03-08 16:08:32,920 - mmaction - INFO - Epoch(val) [3][3] top1_acc: 0.6111, top5_acc: 0.9306, loss_cls: 1.7323, loss: 1.7323
2023-03-08 16:09:19,538 - mmaction - INFO - Epoch(val) [4][3] top1_acc: 0.7083, top5_acc: 0.9444, loss_cls: 1.0702, loss: 1.0702
2023-03-08 16:09:48,785 - mmaction - INFO - Saving checkpoint at 5 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 72/72, 5.8 task/s, elapsed: 12s, ETA: 0s
2023-03-08 16:10:04,643 - mmaction - INFO - Evaluating top_k_accuracy ...
2023-03-08 16:10:04,646 - mmaction - INFO -
top1_acc 0.7639
top5_acc 0.9444
2023-03-08 16:10:04,648 - mmaction - INFO - Evaluating mean_class_accuracy ...
2023-03-08 16:10:04,650 - mmaction - INFO -
mean_acc 0.7571
2023-03-08 16:10:07,840 - mmaction - INFO - Now best checkpoint is saved as best_top1_acc_epoch_5.pth.
2023-03-08 16:10:07,841 - mmaction - INFO - Best top1_acc is 0.7639 at 5 epoch.
2023-03-08 16:10:07,841 - mmaction - INFO - Epoch(val) [5][3] top1_acc: 0.7639, top5_acc: 0.9444, mean_class_accuracy: 0.7571
Does anyone know what the actual cause is? Could it be the --validate flag, or something else? Thanks a lot for any reply!
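One possible explanation, not confirmed anywhere in this thread: the text logger prints a training line only every `interval` iterations within an epoch, and by default it skips the leftover iterations at the end of an epoch. The working run logs at `[20/20]`, which would match an interval of 20. If the parameter change left fewer than 20 iterations per epoch, no training line would ever be printed. The validation step count dropping from `[8]` to `[3]` hints that the batch size or dataset size changed. The sketch below is a simplified pure-Python model of that rule, not the library's actual code. The function name `train_log_iters`, the default `interval=20`, and the example value of 6 iterations are assumptions made for illustration.

```python
def train_log_iters(iters_per_epoch, interval=20, ignore_last=True):
    """Simplified model: 1-based inner iterations at which a train line is printed.

    Assumes a line is emitted every `interval` iterations inside an epoch and,
    only when `ignore_last` is False, also at the final iteration of the epoch.
    """
    logged = []
    for i in range(1, iters_per_epoch + 1):
        if i % interval == 0:
            logged.append(i)
        elif i == iters_per_epoch and not ignore_last:
            logged.append(i)
    return logged


# 20 iterations per epoch, as in the working run: one line per epoch.
print(train_log_iters(20))             # [20]
# Hypothetical shorter epoch (6 iterations): nothing is printed.
print(train_log_iters(6))              # []
# Same short epoch with a smaller interval: lines come back.
print(train_log_iters(6, interval=2))  # [2, 4, 6]
```

If this is the cause, lowering the logging interval should bring the training lines back. Assuming the config keeps the usual `log_config = dict(interval=20, ...)` entry, that could be done by appending `log_config.interval=1` to `--cfg-options`.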