The RMSE of energy/atom and force increased with temperature of AIMD #2252
-
This behavior is similar to what I've seen with every MLIAP I've ever run. The configurational space sampled at 50 K is simply smaller than the configurational space sampled at 1600 K. As such, at 1600 K the potential is asked to predict the forces and energies of a more diverse set of structures, and will always have more difficulty doing so, whereas at 50 K the structures sampled in an MD run are going to be nearer to the ground state, and thus easier to predict. To an extent, you're just going to need a fairly large amount of data to allow for accurate interpolation at 1600 K.

If you want to remedy this, you'll need a more diverse training set. You could run additional high(er)-temperature ab initio molecular dynamics and add the structures generated that way to the training set, or you could run an active-learning-style algorithm, e.g. the DP-GEN [1] algorithm (listed here because it's designed to interface with DeepMD-kit), to generate a more diverse training set. If you want to go down the path of rerunning AIMD, in the literature I've generally seen a large number of short AIMD runs preferred over a few long ones, though I'd recommend an active-learning-type approach, as you can generally fill in the gaps in the training set using less DFT time that way.

Also, out of curiosity, what are the size of the AIMD simulations (in number of atoms) and the number of structures you're training on?

[1] Zhang, Yuzhi, et al. "DP-GEN: A concurrent learning platform for the generation of reliable deep learning based potential energy models." Computer Physics Communications 253 (2020): 107206.

I am not a dev of the project, and thus can't really comment deeply on the DeepMD-kit parameters used here. However, in every sample training script I've seen for DeepMD-kit, the maximum number of training steps is on the order of 10^6 rather than the 2*10^4 used here, so it could be that the potential isn't fully trained yet. In my own case, I still see improvement in the validation forces up to around 250k training steps. A useful check would be to plot the training loss (in lcurve.out) against the step number and see whether it is still decreasing when training finishes (a plotting sketch follows below).

Edit: the cutoff seems fine; generally VASP wants 1.3-1.5x the maximum cutoff for running AIMD/relaxation, and it appears you've satisfied that assuming you're using the standard pseudopotentials, though definitely double-check that. Depending on the size of the simulation, I might worry a tiny bit about the density of the KPOINTS mesh, but I don't think it's the root of the problem here.
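For the loss-curve check mentioned above, a minimal plotting sketch (not from the original thread) might look like the following. It assumes lcurve.out is in the working directory and that its first line is a '#'-prefixed header naming the columns, as DeepMD-kit writes it; the exact column names vary between versions and loss settings.

```python
# Minimal sketch: plot the DeepMD-kit learning curve (lcurve.out) to check
# whether the RMSE is still decreasing when training ends.
# Assumes the first line of lcurve.out is a '#'-prefixed header naming the
# columns (e.g. step, rmse_val, rmse_trn, ..., lr); names vary by version.
import numpy as np
import matplotlib.pyplot as plt

with open("lcurve.out") as f:
    header = f.readline().lstrip("#").split()

# loadtxt skips the '#' header line automatically; ndmin=2 keeps the array
# two-dimensional even if only one step has been logged so far.
data = np.loadtxt("lcurve.out", ndmin=2)
steps = data[:, 0]

plt.figure()
for i, name in enumerate(header[1:], start=1):
    if name.startswith("lr"):
        continue  # skip the learning-rate column, plot only the RMSE columns
    plt.semilogy(steps, data[:, i], label=name)
plt.xlabel("training step")
plt.ylabel("RMSE")
plt.legend()
plt.tight_layout()
plt.savefig("lcurve.png")
```

If the validation curves are still sloping downward at the last logged step, raising the maximum training step count in the training section of the input (numb_steps in recent DeepMD-kit inputs, if I recall the key correctly) would be the natural next step.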
-
Please read the FAQ: "Why does a model have low precision".