All data generated by dpgen is used for training, so you need to prepare the test dataset yourself.
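One possible way to hold out such a test set is to pick a random subset of frame indices and apply the same indices to every per-frame array of a dataset (coordinates, energies, forces, box). The sketch below is only an illustration under that assumption: the function name `split_frame_indices`, the 10% default fraction, and the fixed seed are hypothetical choices, not part of dpgen or DeePMD-kit.

```python
import random


def split_frame_indices(n_frames, test_fraction=0.1, seed=0):
    """Return (train_indices, test_indices) for a dataset with n_frames frames.

    Hypothetical helper: the same index lists should be applied to every
    per-frame array so that each frame stays consistent across arrays.
    """
    if n_frames < 2:
        raise ValueError("need at least two frames to split")

    indices = list(range(n_frames))
    random.Random(seed).shuffle(indices)  # reproducible shuffle

    # Keep at least one frame on each side of the split.
    n_test = min(n_frames - 1, max(1, round(n_frames * test_fraction)))
    test_indices = sorted(indices[:n_test])
    train_indices = sorted(indices[n_test:])
    return train_indices, test_indices


# Usage: split 20 frames, holding out a quarter of them for testing.
train_idx, test_idx = split_frame_indices(20, test_fraction=0.25)
```

The held-out frames can then be written to a separate directory and used only for evaluation, so they never enter the dpgen training data.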
For the meaning of the parameters, you can refer to https://docs.deepmodeling.com/projects/dpgen/en/devel/run/index.html. Besides, here is a possible structure of a single dataset.
Hi All,
Since I started using dpgen, the lcurve.out produced by my dpgen training never contains a test error (test_err).
I guess testing is run every disp_freq steps, but no results are written to lcurve.out.
So I wonder whether my results are right and how to obtain the test error in lcurve.out.
My lcurve.out is shown below:
(base) [********]$ head -n 2 lcurve.out && tail -n 2 lcurve.out
step rmse_trn rmse_e_trn rmse_f_trn rmse_v_trn lr
0 6.12e+01 1.99e+00 1.93e+00 2.31e-01 1.0e-03
499000 5.46e-01 3.57e-02 3.31e-01 3.61e-02 3.7e-08
500000 4.35e-01 1.45e-02 9.97e-02 2.27e-02 3.5e-08
However, I recently found a normal lcurve.out at https://zhuanlan.zhihu.com/p/555628454, shown below:
cat iter.000000/00.train/000/
head -n 2 lcurve.out && tail -n 2 lcurve.out
batch l2_tst l2_trn l2_e_tst l2_e_trn l2_f_tst l2_f_trn lr
0 8.14e+00 8.00e+00 1.00e+01 1.00e+01 4.78e-02 3.41e-03 1.0e-03
398000 6.47e-03 7.17e-03 3.47e-06 1.81e-06 6.30e-03 6.98e-03 5.3e-08
400000 6.46e-03 7.74e-03 2.85e-06 1.36e-06 6.30e-03 7.55e-03 5.0e-08
The final value of batch is the value specified by stop_batch in param.json.
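The difference between the two files is visible in the header line alone: one has only `_trn` columns, the other also has `_tst` columns. A minimal sketch for checking this automatically is given below. The helper names are hypothetical, and the `_val` suffix is an assumption about how other DeePMD-kit versions may label test columns; it does not appear in the output above.

```python
def lcurve_columns(header_line):
    """Return the column names from the first line of lcurve.out."""
    # Some versions may prefix the header with '#'; strip it if present.
    return header_line.strip().lstrip("#").split()


def has_test_columns(header_line, suffixes=("_tst", "_val")):
    """True if any column name ends with a test/validation suffix."""
    return any(name.endswith(suffixes) for name in lcurve_columns(header_line))


# Usage with the two header lines quoted in this post.
mine = "step rmse_trn rmse_e_trn rmse_f_trn rmse_v_trn lr"
reference = "batch l2_tst l2_trn l2_e_tst l2_e_trn l2_f_tst l2_f_trn lr"
print(has_test_columns(mine), has_test_columns(reference))
```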
I guess testing does run every disp_freq steps, because the training log reports a testing time:
DEEPMD INFO initialize model from scratch
DEEPMD INFO start training at lr 1.00e-03 (== 1.00e-03), decay_step 2500, decay_rate 0.950006, final lr will be 3.51e-08
DEEPMD INFO batch 100 training time 9.35 s, testing time 0.09 s
DEEPMD INFO batch 200 training time 9.14 s, testing time 0.07 s
DEEPMD INFO batch 300 training time 9.20 s, testing time 0.10 s
DEEPMD INFO batch 400 training time 8.95 s, testing time 0.06 s
DEEPMD INFO batch 500 training time 8.41 s, testing time 0.03 s
This is the training section of my param.json, where I set "numb_test" to 1:
"training": {
"stop_batch": 500000,
"disp_file": "lcurve.out",
"disp_freq": 100,
"numb_test": 1,
"save_freq": 10000,
"save_ckpt": "model.ckpt",
"disp_training": true,
"time_training": true,
"profiling": false,
"profiling_file": "timeline.json",
"_comment": "that's all",
best,
Thank you.