Good AIMD RMSE but unstable MD; adding distortions breaks TRAIN error without fixing MD stability #1342

Abdelazim-Abdelgawwad · 2025-12-28T14:07:04Z

Abdelazim-Abdelgawwad
Dec 28, 2025

Summary

I am training MACE on a 10 ps DFT-AIMD trajectory.
Although the model shows reasonable RMSE on AIMD data, MD with the trained MACE potential is unstable (structure distortion and temperature blow-up). Adding distorted geometries to the training set does not resolve the MD instability and instead severely degrades training metrics.

Case 1: AIMD-only training (10 ps)

Training/validation/test split by continuous time blocks.

TRAIN: RMSE F = 25.8 meV/Å
VALID: RMSE F = 23.4 meV/Å
TEST : RMSE F = 22.7 meV/Å

Despite these errors, running MD with this model leads to:

- rapid structural distortion
- unphysical temperature increase
- instability even with smaller timesteps

Case 2: AIMD + distorted geometries (distortions added only to TRAIN)

TRAIN: RMSE F = 6814.1 meV/Å
VALID: RMSE F =   27.3 meV/Å
TEST : RMSE F =   25.9 meV/Å

Observations:

- Validation and test errors remain reasonable
- Training force RMSE becomes extremely large
- MD remains unstable (same failure mode as AIMD-only model)

Training settings

    --foundation_model="small" \
    --energy_key="REF_energy" \
    --multiheads_finetuning=False \
    --forces_key="REF_forces" \
    --model="MACE" \
    --E0s="average" \
    --num_channels=256 \
    --max_L=2 \
    --correlation=3 \
    --max_num_epochs=500 \
    --batch_size=10 \
    --patience=50 \
    --valid_batch_size=10 \
    --lr=0.001 \
    --energy_weight=1.0 \
    --forces_weight=100.0 \
    --weight_decay=1e-8 \
    --error_table='PerAtomMAE' \
    --ema \
    --ema_decay=0.99 \
    --amsgrad \
    --restart_latest \
    --default_dtype="float64" \
    --device=cuda \
    --seed=1 \
    --scaling='rms_forces_scaling' \
    --save_cpu

Questions

1. Is it expected that reasonable AIMD RMSE does not guarantee MD stability, even for short trajectories?
2. Is adding distorted geometries to TRAIN the correct strategy for improving MD stability, or should they be treated/weighted differently?
3. How should one interpret very large TRAIN force RMSE when VALID/TEST remain low?
4. Are there recommended practices (e.g. weighting, config types, active learning) for stabilizing MD in this situation?

Goal

My goal is not to replace AIMD, but to:

obtain a MACE potential that can safely reproduce AIMD dynamics

Any guidance on best practices for this workflow would be greatly appreciated, as I am a new MACE user.

Answered by gabor1

Dec 31, 2025

It's also possible that both your AIMD and single-point calculations contain the gradient rather than the force (the negative gradient).

gabor1 · 2025-12-28T14:48:25Z

gabor1
Dec 28, 2025
Maintainer

Hello! Thanks for trying our code. We'll help you diagnose the problem.

On the face of it, your strategy is not wrong. Generally we find that potentials trained only on short AIMD trajectories are stable when you try them for MD at the same composition and temperature (I say short because MD frames are highly correlated: you may have 10 ps and 10k frames, but perhaps only 20-30 of them are actually independent!). And yes, adding distorted structures (or frames from higher-temperature MD runs or different compositions) does generally help stability.
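One rough way to see how few independent frames a trajectory holds is to estimate the integrated autocorrelation time of some scalar observable (for instance the potential energy per frame) and divide the number of frames by it. The sketch below is a generic estimator in plain Python, not something MACE provides. The function names, the synthetic AR(1) series standing in for a real observable, and the truncation rule (stop summing at the first non-positive autocorrelation) are all assumptions made for illustration.

```python
import random


def autocorrelation(series, lag):
    """Normalised autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    if var == 0.0:
        return 0.0
    cov = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag)) / n
    return cov / var


def effective_sample_count(series, max_lag=None):
    """Estimate N_eff = N / (1 + 2 * sum_k rho_k).

    The sum is truncated at the first non-positive rho_k, a common
    heuristic to avoid accumulating noise from the tail.
    """
    n = len(series)
    max_lag = max_lag if max_lag is not None else n // 2
    tau = 1.0
    for lag in range(1, max_lag):
        rho = autocorrelation(series, lag)
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return n / tau


# Synthetic, strongly correlated series standing in for e.g. the
# potential energy along an MD trajectory (AR(1) process, made-up data).
rng = random.Random(0)
value, correlated = 0.0, []
for _ in range(2000):
    value = 0.95 * value + rng.gauss(0.0, 1.0)
    correlated.append(value)

n_eff = effective_sample_count(correlated)
print(f"{len(correlated)} frames, roughly {n_eff:.0f} independent samples")
```

On a real trajectory the series would be replaced by a per-frame quantity you already have; the estimate is noisy, so it is only meant to give an order of magnitude.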

However, the fact that adding new structures to your training set increases the training RMSE so dramatically suggests to me that there is an incompatibility between your two types of data. To test this, take one of the AIMD frames that you put into the training set and recompute its energy and forces with the exact settings you used for the distorted configs. Then compare the energy, forces and stress components between the original AIMD-derived data and your single-point recomputation.
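This consistency test boils down to a few numbers per frame. The following is a minimal sketch in plain Python, assuming the energies and forces of both calculations have already been parsed into floats and lists of (fx, fy, fz) tuples. The function name `compare_calculations` and all numerical values are hypothetical placeholders, not real output from any code. A cosine similarity near +1 with small differences means the two datasets agree, while a value near -1 points to a sign convention mismatch.

```python
import math


def compare_calculations(energy_a, forces_a, energy_b, forces_b):
    """Compare two calculations of the same structure.

    Returns the energy difference, the largest absolute difference of
    any force component, and the cosine similarity of the flattened
    force vectors.
    """
    flat_a = [c for atom in forces_a for c in atom]
    flat_b = [c for atom in forces_b for c in atom]
    if len(flat_a) != len(flat_b):
        raise ValueError("force arrays have different shapes")
    max_diff = max(abs(a - b) for a, b in zip(flat_a, flat_b))
    dot = sum(a * b for a, b in zip(flat_a, flat_b))
    norm_a = math.sqrt(sum(a * a for a in flat_a))
    norm_b = math.sqrt(sum(b * b for b in flat_b))
    cosine = dot / (norm_a * norm_b)
    return energy_a - energy_b, max_diff, cosine


# Placeholder numbers (eV and eV/Å) for one three-atom frame.
aimd_energy = -76.4321
aimd_forces = [(0.12, -0.03, 0.00), (-0.08, 0.05, 0.01), (-0.04, -0.02, -0.01)]
# In this made-up case the second calculation carries the opposite sign.
sp_energy = -76.4320
sp_forces = [tuple(-c for c in atom) for atom in aimd_forces]

d_e, max_df, cos = compare_calculations(aimd_energy, aimd_forces,
                                        sp_energy, sp_forces)
print(f"dE = {d_e:.4f} eV, max |dF| = {max_df:.3f} eV/Å, cosine = {cos:+.3f}")
```

The sketch assumes at least one non-zero force in each calculation; a fully relaxed geometry would make the cosine undefined.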

Also, it would help if you posted the kinds of plots the MACE training outputs, i.e. reference vs. MACE predictions for energies, force components and stress components (even better if, for your second training, you colour the points from the AIMD and the distorted structures separately).
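If it is easier to share numbers than plots, the same information can be summarised per data source. This is a minimal sketch, assuming each parity point has been collected as a (label, reference, prediction) triple. The labels "aimd" and "distorted", the function name `rmse_by_group`, and the values are made up for illustration, and nothing here uses the MACE API.

```python
import math
from collections import defaultdict


def rmse_by_group(points):
    """Return {label: RMSE} for (label, reference, prediction) triples."""
    squared = defaultdict(list)
    for label, ref, pred in points:
        squared[label].append((pred - ref) ** 2)
    return {label: math.sqrt(sum(v) / len(v)) for label, v in squared.items()}


# Made-up force components in eV/Å.
parity_points = [
    ("aimd", 0.10, 0.12),
    ("aimd", -0.30, -0.28),
    ("distorted", 2.00, -1.00),
    ("distorted", -4.00, 0.00),
]

for label, value in rmse_by_group(parity_points).items():
    print(f"{label:>10s}: force RMSE = {value:.3f} eV/Å")
```

A large gap between the groups suggests that one source of data is the problem rather than the model as a whole.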

Abdelazim-Abdelgawwad · 2025-12-28T16:47:33Z

Abdelazim-Abdelgawwad
Dec 28, 2025
Author

Hello,

Thank you very much for your reply and for sharing your excellent code.

I believe the issue with the very large training force error (TRAIN: RMSE F ≈ 6814 meV/Å) was caused by including highly distorted configurations in the training set. These structures contained broken bonds or atoms that had effectively “flown away.” I have since removed those configurations and instead added MD frames generated at higher but controlled temperatures (600 K and 900 K).
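One possible way to screen for such configurations automatically is a purely geometric check: flag any frame in which some atom has no neighbour within a chosen distance. The sketch below assumes a non-periodic system with at least two atoms and positions given as (x, y, z) tuples in Å. The function names, the 3.0 Å cutoff and the coordinates are illustrative assumptions, not recommended values.

```python
import math


def nearest_neighbour_distances(positions):
    """Distance from each atom to its closest neighbour (no PBC)."""
    result = []
    for i, p in enumerate(positions):
        result.append(min(math.dist(p, q)
                          for j, q in enumerate(positions) if j != i))
    return result


def has_detached_atom(positions, cutoff=3.0):
    """True if any atom has no neighbour within `cutoff` (in Å)."""
    return max(nearest_neighbour_distances(positions)) > cutoff


# Made-up three-atom geometries in Å.
intact = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
broken = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (6.0, 6.0, 0.0)]

frames = {"intact": intact, "broken": broken}
kept = [name for name, pos in frames.items() if not has_detached_atom(pos)]
print(kept)
```

A sensible cutoff depends on the chemistry of the system, so it would need to be chosen by inspecting a few known-good frames first.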

With this revised dataset, the training now converges to much more reasonable values. For example, at epoch 2009 I obtain: loss ≈ 0.0010, MAE energy per atom ≈ 28.35 meV, and MAE force ≈ 12.32 meV/Å.

As you suggested, I also selected a single configuration from the training set and performed a single-point calculation using the same settings; the result was almost the same.

That said, I am still struggling to understand whether I should train entirely from scratch or start from a small foundation model. I have tried both approaches, but I observe similar behavior in each case for the instability of MD. Additionally, I would appreciate your guidance on dataset size: does it matter significantly whether I use a smaller split (e.g., 800 training, 100 validation, 100 test frames) versus a larger one (e.g., 4000 training, 500 validation, 500 test frames)?

Finally, I did not fully understand what you meant by the “kinds of plots” produced by MACE during training—specifically, reference versus MACE predictions for energies, force components, and stress components. I am not considering stress, as my systems are non-periodic and I do not require PBCs. I apologize if this is a basic question; I am still a new user.

Thank you again for your time and support.

Kind regards,
Abdelazim

gabor1 · 2025-12-28T18:27:45Z

gabor1
Dec 28, 2025
Maintainer

Your energy error is very high. Decent potentials typically achieve a couple of meV/atom, and you can often get it down to ~1 meV/atom (but that requires converged basis sets, no CP2K, and elimination of all noise from your training data). What electronic structure code are you using?

We find that you will get better models when you fine-tune a foundation model, but it does require a little bit of experience (you are managing both the old data and the new data) and multiple heads on the model (one for parts of the old data, one for the new data). My suggestion is that you first get a model that is reasonable when trained from scratch, and then try to improve on it by fine-tuning.

More data will help get you lower errors, especially if the new data is uncorrelated with the previous data. The amount depends on what accuracy you are looking for.

You are using "average" E0s (E0s are the energies of isolated atoms) rather than DFT-computed isolated-atom energies. The latter are important if you want the potential to describe bond breaking and dissociation, and not using isolated-atom energies is indeed why your training was much worse when you included broken configurations where atoms "flew away".
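A toy calculation can illustrate the point. It assumes that the "average" option behaves like a least-squares fit of per-element baseline energies to the total energies in the training set, which is an assumption about the implementation, simplified here to a single element. All names and numbers are made up.

```python
def fit_average_e0(atom_counts, total_energies):
    """Least-squares per-atom baseline for a single-element dataset.

    Minimises sum_i (E_i - N_i * e0)^2, which gives
    e0 = sum(N_i * E_i) / sum(N_i^2).
    """
    num = sum(n * e for n, e in zip(atom_counts, total_energies))
    den = sum(n * n for n in atom_counts)
    return num / den


# Made-up numbers in eV: the isolated atom sits at -10.0, and bonded
# clusters gain 2.0 eV per atom from binding.
ISOLATED_ATOM_ENERGY = -10.0
atom_counts = [4, 6, 8]
total_energies = [n * (ISOLATED_ATOM_ENERGY - 2.0) for n in atom_counts]

e0_average = fit_average_e0(atom_counts, total_energies)
offset = ISOLATED_ATOM_ENERGY - e0_average
print(f"fitted e0 = {e0_average:.2f} eV, true isolated atom = "
      f"{ISOLATED_ATOM_ENERGY:.2f} eV, offset = {offset:.2f} eV/atom")
```

In this toy case the fitted baseline absorbs the binding energy, so a configuration with a detached atom would sit about 2 eV per detached atom away from the baseline even though that atom barely interacts with anything.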

When you train a MACE model, it produces parity plots (reference vs. MACE predictions) for energies and forces (and for stresses if you have them). You don't have to use these; you can make your own. Upload such plots here: it will help a lot to understand what you are trying to do and how to get the energy error down.

Abdelazim-Abdelgawwad · 2025-12-30T19:41:21Z

Abdelazim-Abdelgawwad
Dec 30, 2025
Author

Hi again,

I am using TeraChem.

Following your suggestion, I ran three different tests. My goal at this stage is to obtain a reasonable baseline model before moving on to fine-tuning. All models were trained from scratch. I also included the DFT isolated-atom energies (not an average) in all cases. The results are summarized below.

1) 4000 AIMD structures for training; 500 for validation and 500 for testing

Results:

| train_Default |  9.5  | 19.8 |  5.93 |
| valid_Default | 19.8  | 17.0 | 12.57 |

2) 4000 AIMD structures + 500 distorted structures for training

Results:

| train_Default | 45.9 | 25.7 |  5.92 |
| valid_Default | 45.4 | 17.6 | 13.02 |

3) 800 AIMD structures + 10 distorted structures for training; 100 for validation and 100 for testing

Results:

| train_Default | 11.5 | 49.0 | 13.89 |
| valid_Default | 19.5 | 32.5 | 23.85 |

Note: I have several additional AIMD trajectories of different structures, approximately 10 ps each. If incorporating these structures could improve the model quality or stability, I can include them in the training set.

I would appreciate your feedback on which direction seems most promising and how best to proceed.

gabor1 · 2025-12-31T12:57:03Z

gabor1
Dec 31, 2025
Maintainer

You have some very strong negative correlations. My theory is that when you fit to AIMD data you do indeed have the forces in the dataset, but when you do the single-point calculations on the distorted structures you have the negative of the force, i.e. the gradient.

To confirm, please do the test I suggested where you compare the two electronic structure calculations, one from the AIMD and one from your single-point calculations, on exactly the same structure.

gabor1 · 2025-12-31T12:58:06Z

gabor1
Dec 31, 2025
Maintainer

It's also possible that both your AIMD and single-point calculations contain the gradient rather than the force (the negative gradient).
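A quick way to tell forces from gradients without a second dataset is a finite-difference check: displace one atom slightly, recompute the energy, and compare the negative energy slope with the stored value. The sketch below uses a toy one-dimensional harmonic bond as a stand-in for the electronic structure code. The functions `energy`, `analytic_gradient` and `classify`, and the constants, are hypothetical and only serve to make the example runnable.

```python
K_SPRING = 5.0   # eV/Å^2, toy force constant
R_EQ = 1.0       # Å, toy equilibrium bond length


def energy(x1, x2):
    """Toy 1D harmonic bond energy between two atoms."""
    return 0.5 * K_SPRING * (abs(x2 - x1) - R_EQ) ** 2


def analytic_gradient(x1, x2):
    """dE/dx1 for the toy model (this is the gradient, not the force)."""
    sign = 1.0 if x2 > x1 else -1.0
    return -K_SPRING * (abs(x2 - x1) - R_EQ) * sign


def classify(stored_value, x1, x2, h=1e-4):
    """Decide whether `stored_value` for atom 1 is a force or a gradient."""
    d_energy = (energy(x1 + h, x2) - energy(x1 - h, x2)) / (2.0 * h)
    fd_force = -d_energy  # force = negative energy derivative
    if abs(stored_value - fd_force) < abs(stored_value + fd_force):
        return "force"
    return "gradient"


x1, x2 = 0.0, 1.3  # stretched bond, so the force on atom 1 is non-zero
gradient = analytic_gradient(x1, x2)
print(classify(-gradient, x1, x2))  # value stored with the force convention
print(classify(gradient, x1, x2))   # value stored with the gradient convention
```

With a real code, `energy` would be replaced by two extra single-point calculations on displaced geometries. The check is only meaningful for a component that is clearly non-zero.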

Abdelazim-Abdelgawwad · 2025-12-31T19:42:36Z

Abdelazim-Abdelgawwad
Dec 31, 2025
Author

Thank you very much. You were completely right, I was using gradients for both the AIMD and the single-point calculations.

I have now achieved an energy MAE of ~0.1 meV/atom and a force RMSE of ~16 meV/Å (I plan to increase the number of epochs to further reduce the force MAE). The MD simulations are now running stably.

I have one additional question: if I want to retrain the model using new data, is it acceptable to continue training from the existing checkpoint, or is it better practice to restart from the beginning and include all structures (old and new) in a single training set?
