How to choose the ratio of stop_batch/decay_steps to have a better predictbility of potential. Using 5~10 seems to be better than 200 which was mentioned in DP video tutorial #1805

roger13231 · 2025-09-03T01:05:59Z

roger13231
Sep 3, 2025

Dear maintainer,

I found that using stop_batch=400k and decay_steps of 80k (so ratio will be 5) is "better" than using decay_steps of 0.2k (ratio is 200, which is recommended by a DeepModelling developer). I attached the figures of accurate ratio using different ratio. I will explain in detail about my test and this "better", also at the end, I have several questions about how to tune the parameter.

First of all, my system is alloy and oxides, to study the oxidation behaviors. Previous initial dataset is bulk alloys, now I introduce oxides. During training the oxide data, I found that among all my introduced oxide (Cr2O3, MoO3, etc), the others are good (using [lo, hi] = [0.10, 0.25]), which means the accurate ratio improve gradually. However, MoO3's accurate was under 30% for iter.0-5, and then was around 80% using [lo, hi]=[0.15,0.30] for iter.5-7 (also conducted 8 and 9, which is also around 80%. The previous these iters are covered by new 8 and 9). It seems like it is not easy for MoO3 to converge to a accuracy of 0.15, which I think is not a very high accuracy under temperature range of 50-400K.
In this way, I tried to use some other parameter to train the potential. I use a larger decay_steps of 80k instead of 0.2k, so maybe learning rate is not reduce too fast and model can learn more (I guess). Using new potential, in the iter.10, MoO3 reach a 99.62% accuracy, which I think the potential is "better" than other old potential (?). The new potential also reach accurate for other four oxides (not list all).

My question is:

Using ratio of stop_batch/decay_steps of 200, from my perspective, the learning rate decay very frequently and converge quickly. However, the learning rate reduce too quick, maybe the potential cannot learn very well (I am not sure that whether I understand these parameter right)? In this way, maybe we can use a lower ratio like 5-20? Also, I am wonder how this 200 origin from. I assume it is from pervious experience.
I am wondering whether the way I judge a potential is "better" is right. I assume a potential can show a higher accurate ratio for the dataset means the potential is good.
I am wondering whether users should test this ratio of stop_batch/decay_steps or other training parameters before the training or when the iteration accurate ratio has issues. And whether users should use the tuned/tested parameter (based on point2) for their own DPGEN iteration or system.

Thank you so much for your reply in advance. It gonna be very helpful.

wujing81 · 2025-09-03T06:57:49Z

wujing81
Sep 3, 2025

May I ask if you conducted parallel testing for the iter10 model? Could you compare the training and validation accuracy, as well as the model_devi, between the model with stop_batch/decay_steps=200 and your parameter settings? Also, does the 'Length' in your table refer to the MD simulation time, and is the unit fs or ps?

0 replies

roger13231 · 2025-09-04T05:39:56Z

roger13231
Sep 4, 2025
Author

Hi, thank you so much for your reply,
I compared the training RMSE. For ratio of stop_batch/decay_steps=200, the force RMSE for three structures of MoO3 are around 0.128 eV/A, and for energy RMSE is 4.48 meV/atoms. For ratio of 5, the force is 0.088, and energy is 12.23. I have to see the loss parameter for energy and force are the same. Seems like force is better for 200, although energy is worse.

For the model-devi. Using ratio of 5, For system6, 0.97LatticeConstant MoO3, is 98.81%; for system 7, 1.00, is 98.17%; for system 8, 1.03, is 98.02% (I retrain a new round with ratio of 5 with the same loss parameters, so a little bit difference with figure). I didn't test the ratio of 200, but according to the iter.8, and 9, also the force RMSE, I assume it will be around 80% accurate.

The length is fs. From iter 5-11, it should be 25 (25000 nsteps, dt0.001), 40, 50, 60, 70, 80, 100 fs. Sorry that I mistakely divided by 10.

3 replies

roger13231 Sep 4, 2025
Author

also, not only the ratio I use 5, but also I change the stop_lr to 1e-05

wujing81 Sep 5, 2025

Could you please share the training input files, including the versions with stop_batch/decay_steps ratios of 200 and 5?

roger13231 Sep 5, 2025
Author

input_ratio5.json
input_ratio200.json
Hi, i attached my input file

wujing81 · 2025-09-05T09:58:28Z

wujing81
Sep 5, 2025

Hi, I checked your input files. In input_ratio5.json, the stop_lr is set to 1e-05, whereas in input_ratio200.json it is not specified, so DeePMD-kit defaults to 1e-08. This difference in the stopping learning rate accounts for the accuracy gap between the two models. With stop_lr set to 1e-05, the model maintains a higher learning rate, which can accelerate convergence during the DP-GEN iterative data generation process.

1 reply

roger13231 Sep 8, 2025
Author

Thank you for your reply, I get this.

I did a new training with ratio5 now. I use the same input as ratio200, only decay_steps change to 80,000, and use defaults for stop_lr. Now the energy RMSE is 5.18 meV/atom, force RMSE is 0.122 eV/A. The model-devi is average of 98.33%, which is slightly but not much different with stop_lr=1e-8 which has average of 99.75% shown in the initial photo. I think this is still good compared with ratio200 which possibly have a model-deive of around 80-85%. I am wondering maybe I need to tune this decay_steps?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

How to choose the ratio of stop_batch/decay_steps to have a better predictbility of potential. Using 5~10 seems to be better than 200 which was mentioned in DP video tutorial #1805

Uh oh!

{{title}}

Uh oh!

Replies: 3 comments 4 replies

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

How to choose the ratio of stop_batch/decay_steps to have a better predictbility of potential. Using 5~10 seems to be better than 200 which was mentioned in DP video tutorial #1805

Uh oh!

roger13231 Sep 3, 2025

Replies: 3 comments · 4 replies

Uh oh!

wujing81 Sep 3, 2025

Uh oh!

roger13231 Sep 4, 2025 Author

Uh oh!

roger13231 Sep 4, 2025 Author

Uh oh!

wujing81 Sep 5, 2025

Uh oh!

roger13231 Sep 5, 2025 Author

Uh oh!

wujing81 Sep 5, 2025

Uh oh!

roger13231 Sep 8, 2025 Author

roger13231
Sep 3, 2025

Replies: 3 comments 4 replies

wujing81
Sep 3, 2025

roger13231
Sep 4, 2025
Author

roger13231 Sep 4, 2025
Author

roger13231 Sep 5, 2025
Author

wujing81
Sep 5, 2025

roger13231 Sep 8, 2025
Author