PET training tips #762
lucasdekam started this conversation in General
Hi @lucasdekam, I will let @abmazitov and @frostedoyster follow up, but the short story is that we have done quite an extensive study of all this, resulting in an improved PET architecture and training parameters that are slowly being prepared for merging. Another (independent) thing that helps a lot is to set up a "non-conservative pre-training" step, cf. https://atomistic-cookbook.org/examples/pet-finetuning/pet-ft-nc.html, which cuts the training time down dramatically.
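The cookbook recipe linked above sets this up within metatrain itself. Purely as an illustration of why the non-conservative stage is cheaper, here is a toy PyTorch sketch (the module, heads, and data below are invented placeholders, not PET or metatrain code): directly predicted forces need only a forward pass, while conservative forces require differentiating the energy with respect to the positions, so each training step costs roughly twice as much. In practice one pre-trains with the direct force head and then switches to a comparatively short conservative fine-tuning stage.

```python
# Toy sketch of non-conservative pre-training followed by conservative
# fine-tuning. Everything here is a placeholder for illustration only;
# it is NOT the PET architecture or the metatrain training loop.
import torch
import torch.nn as nn

class ToyPotential(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(3, hidden), nn.SiLU())
        self.energy_head = nn.Linear(hidden, 1)  # per-atom energy contributions
        self.force_head = nn.Linear(hidden, 3)   # direct (non-conservative) forces

    def forward(self, positions, conservative: bool):
        feats = self.backbone(positions)
        energy = self.energy_head(feats).sum()
        if conservative:
            # Forces as -dE/dR: needs an extra autograd pass, so each step is slower.
            forces = -torch.autograd.grad(energy, positions, create_graph=True)[0]
        else:
            # Direct prediction: a single forward pass, much cheaper per epoch.
            forces = self.force_head(feats)
        return energy, forces

model = ToyPotential()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
positions = torch.randn(400, 3, requires_grad=True)  # one 400-atom configuration
target_forces = torch.zeros(400, 3)                  # placeholder force labels

for stage, conservative in [("non-conservative pre-training", False),
                            ("conservative fine-tuning", True)]:
    for _ in range(10):  # placeholder number of epochs per stage
        opt.zero_grad()
        energy, forces = model(positions, conservative=conservative)
        loss = ((forces - target_forces) ** 2).mean()
        loss.backward()
        opt.step()
```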
Hello, I like the idea of the metatensor framework, so I wanted to try training a model. Since the published results suggest that PET is quite competitive with MACE/NequIP, I decided to start with PET. I'm training on a VASP RPBE-D3 dataset of 500 configurations with 400 atoms each (metal-water interfaces).
I'm using these options:
I find training to be quite slow; I'm now at about 1500 epochs and the error is only just starting to approach the MACE force RMSE (PET train: 21 meV/Å, PET validation: 23 meV/Å; MACE train: 16 meV/Å, MACE validation: 20 meV/Å). For MACE, a few hundred epochs were plenty. It also seems that the learning rate needs to be very low for the error to decrease at all. When training from scratch (rather than fine-tuning), it is even more difficult to get the error down.
Are these observations a consequence of PET's architecture (a large number of parameters and little a priori structure compared to, for example, the ACE basis used by MACE), or am I doing something wrong or suboptimal in the training?
I also quite like the feature available in the gracemaker package where the learning rate is only decreased once the validation error has stopped improving for a set number of epochs; I feel that would help for training here too, but maybe there is a particular reason why you opted for another (more effective?) strategy? If anyone has experience with a similar kind of dataset and/or has any ideas for things I can try, let me know.
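The schedule described here is essentially what PyTorch's built-in ReduceLROnPlateau scheduler provides. A minimal sketch, assuming a generic training loop (the model, learning rate, patience, and validation metric below are placeholders, not metatrain's actual internals):

```python
# Minimal sketch of plateau-based learning-rate decay with PyTorch.
# The model, hyperparameters, and validation metric are placeholders.
import torch

model = torch.nn.Linear(3, 3)  # stand-in for an interatomic potential
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer,
    mode="min",    # the monitored metric should decrease
    factor=0.5,    # halve the learning rate on a plateau
    patience=50,   # wait 50 epochs without improvement before decaying
)

for epoch in range(1000):
    # ... one training epoch over the dataset would go here ...
    validation_force_rmse = 0.025        # placeholder validation metric
    scheduler.step(validation_force_rmse)  # decay the LR only when the metric stalls
```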
Thanks and keep up the good work,
Lucas