This repository provides the scripts and workflow used to evaluate the performance of foundational machine-learned interatomic potentials (MLIPs) for bcc Fe, as presented in the research article Fine-Tuning Universal Machine-Learned Interatomic Potentials for Applications in the Science of Steels. The scripts are not tied to any specific MLIP; any potential that implements the ASE calculator interface can be evaluated.
- Clone the repo locally.
    git clone git@github.com:naveenmohandas/eval_mlip_on_fe.git
- Create a virtual environment to install the package (here I use uv; you can use any Python package manager of your choice, such as mamba or conda).
    cd eval_mlip_on_fe
    uv venv
    uv pip install .
- Activate the environment (not needed if running with uv).
    source .venv/bin/activate
Note
It may not be possible to install multiple MLIPs simultaneously in the same environment. Use separate environments in that case.
Make sure the packages ase, pymatgen, pyyaml, pytorch and seaborn are present in each environment.
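A quick way to check an environment is to ask Python whether each package can be found. The snippet below is a minimal sketch and not part of eval_mlip_on_fe. The helper name find_missing_packages is hypothetical, and the mapping assumes the usual import names (pyyaml is imported as yaml, pytorch as torch).

```python
import importlib.util

# Assumption: import names for the packages listed in the note above.
# pyyaml is imported as "yaml" and pytorch as "torch".
REQUIRED_MODULES = {
    "ase": "ase",
    "pymatgen": "pymatgen",
    "pyyaml": "yaml",
    "pytorch": "torch",
    "seaborn": "seaborn",
}


def find_missing_packages(required=REQUIRED_MODULES):
    """Return the package names whose import module cannot be found."""
    missing = []
    for package, module in required.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing


if __name__ == "__main__":
    missing = find_missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Run it inside each environment (for example with uv run) before starting an evaluation.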
To evaluate an MLIP model that implements an ASE calculator:
- Install the MLIP following its installation procedure in the same environment as eval_mlip_on_fe. For the example here, uv pip install tensorpotential.
- Load the calculator and call the eval_calculator function. For example, for GRACE the script below can be used to evaluate the MLIP:
from tensorpotential.calculator import grace_fm
from eval_mlip_on_fe import eval_calculator
model_name = "GRACE-2L-OMAT-medium-ft-AM"
calc = grace_fm(model_name)
eval_calculator(calc, model_name)
- Save the above script in a file eval_grace.py in the same directory or a sub-directory.
- Run
    uv run eval_grace.py
The results are saved in a directory called output, in a sub-directory named after model_name.
Note
Some properties like
If you would like to calculate only some of the properties, you can pass them as a keyword argument. For example, if you only want lattice_constant and elastic_tensors:
from eval_mlip_on_fe import eval_calculator

# calc and model_name are defined as in the GRACE script above
eval_properties = ["lattice_constant", "elastic_tensors"]
eval_calculator(calc, model_name, eval_properties=eval_properties)
To get the list of implemented properties:
from eval_mlip_on_fe import get_impletemented_properties
print("The following properties can be predicted for Fe:")
get_impletemented_properties()
The fine-tuned potentials and the dataset referenced in the article can be downloaded here. The section below shows an easier way to rerun the calculations when evaluating multiple models of the same type.
For CHGNet, SevenNet and MACE, a YAML config file can be used to make evaluations easier. In this file, calc_name is the name of the sub-directory of the output directory in which the output files are saved, and calc_path is the path to the fine-tuned model. Model names can also be used in calc_path to load foundational models. The models listed in the file are evaluated in series. By default, the YAML file is named exp_info_{model_type}.yaml, where model_type is chgnet, mace or sevennet.
A template for exp_info_chgnet.yaml is given below:
1:
  calc_name: CHG2
  calc_path: 0.2.0
2:
  calc_name: CHG3
  calc_path: 0.3.0
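When many fine-tuned models need to be listed, the config file can be generated instead of typed by hand. The following is a minimal sketch that only uses the standard library. The helper names build_exp_info and write_exp_info are hypothetical, and the sketch assumes that calc_name and calc_path are nested under the numeric keys and that their values contain no characters needing YAML quoting.

```python
from pathlib import Path


def build_exp_info(models):
    """Return YAML text for a list of (calc_name, calc_path) pairs.

    Assumption: entries are numbered from 1 and each holds the two
    keys shown in the template (calc_name and calc_path).
    """
    lines = []
    for index, (calc_name, calc_path) in enumerate(models, start=1):
        lines.append(f"{index}:")
        lines.append(f"  calc_name: {calc_name}")
        lines.append(f"  calc_path: {calc_path}")
    return "\n".join(lines) + "\n"


def write_exp_info(models, model_type, directory="."):
    """Write exp_info_{model_type}.yaml into directory and return its path."""
    path = Path(directory) / f"exp_info_{model_type}.yaml"
    path.write_text(build_exp_info(models))
    return path


if __name__ == "__main__":
    # Reproduces the CHGNet template from this README.
    print(build_exp_info([("CHG2", "0.2.0"), ("CHG3", "0.3.0")]), end="")
```

Calling write_exp_info(models, "chgnet") would then place exp_info_chgnet.yaml in the current directory, which is the default file name described above.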
To predict Fe properties such as the lattice parameter and the vacancy formation energy, run:
run_mlip_evaluation --model_type chgnet --eval_mlip_on_fe
The script loads the model-related information from the exp_info_{model_type}.yaml file.
Important
If a relaxation does not converge, an error is raised.
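When calling eval_calculator from your own script for several models, one unconverged relaxation would stop the whole loop. The sketch below shows one possible way to keep going and collect the failures. It is not part of the package: evaluate_in_series is a hypothetical helper, and because the exact exception type raised is not stated here, it catches Exception broadly.

```python
def evaluate_in_series(models, evaluate):
    """Call evaluate(calc, name) for each (name, calc) pair.

    Returns a dict mapping each name to None on success, or to a
    short error description if the evaluation raised.
    """
    results = {}
    for name, calc in models:
        try:
            evaluate(calc, name)
        except Exception as error:  # assumption: exact error type unknown
            results[name] = f"{type(error).__name__}: {error}"
        else:
            results[name] = None
    return results
```

In practice evaluate would be eval_calculator and each calc an ASE calculator. Models whose entry is not None can then be inspected or rerun individually.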
run_mlip_evaluation --eval_db --model_type chgnet --db_path {path_to_mptrj_dataset} --db_name mptrj --is_dft_energy_per_atom True --dft_energy_key energy_per_atom
- --model_type selects the parent model.
- --db_path gives the path to the pickle file containing the list of structures.
- --db_name is a name used as a prefix when saving the resulting pickle file.
- {path_to_mptrj_dataset} is the path to the MPtrj dataset in .xyz format.
By default, the script saves the files in output/{model_name}, with model_name taken from the input YAML file.
Note
The additional flag --dft_energy_key is needed for CHGNet because it uses the corrected energy per atom, which is saved under the key energy_per_atom in the dataset I use. In that dataset the DFT energy is also stored per atom, so --is_dft_energy_per_atom must be set to True. For MACE and other MLIPs, the key uncorrected_total_energy is used by default. If you use another dataset, specify its key with --dft_energy_key. After predicting all structures, the script saves the result dictionary as a pickle file with the keys dft_energy_per_atom and mlip_energy_per_atom.
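The saved dictionary can also be inspected by hand. The snippet below is a minimal sketch of computing the RMSE between the two entries using only the standard library. It assumes that both keys hold equal-length sequences of energies in eV/atom. The helper names are hypothetical and this is not the package's own RMSE routine.

```python
import math
import pickle


def energy_per_atom(total_energy, n_atoms):
    """Convert a total energy to an energy per atom."""
    return total_energy / n_atoms


def rmse(reference, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(reference) != len(predicted):
        raise ValueError("sequences must have the same length")
    if len(reference) == 0:
        raise ValueError("sequences must not be empty")
    squared = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_from_pickle(path,
                     dft_key="dft_energy_per_atom",
                     mlip_key="mlip_energy_per_atom"):
    """Load a result dictionary and return the RMSE of its two entries.

    Assumption: the pickle holds a dict whose two keys map to
    sequences of per-atom energies, as described in the note above.
    """
    with open(path, "rb") as handle:
        results = pickle.load(handle)
    return rmse(results[dft_key], results[mlip_key])
```

Only load pickle files that you created yourself or that come from a trusted source, since unpickling can execute arbitrary code.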
run_mlip_evaluation --model_type mace --db_path {path_to_mptrj_dataset} --db_name mptrj
run_mlip_evaluation --model_type mace --db_path {path_to_dataset} --db_name fe
{path_to_dataset} is the path to the dataset in .xyz format.
run_mlip_evaluation.py --eval_db --get_rmse --eval_dir output/
This evaluates all the directories under the directory given by --eval_dir. The file names can be given with the arguments --mptrj_file_name and --fe_file_name. By default it looks for mprtj_mlip_dft_energy.pkl and fe_mlip_dft_energy.pkl.
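Before computing the RMSE values, it can help to check which model directories actually contain the expected result files. The following is a minimal sketch, assuming one sub-directory per model inside the evaluation directory. find_result_files is a hypothetical helper, not a function of the package.

```python
from pathlib import Path

# Default file names as stated in this README.
DEFAULT_FILE_NAMES = ("mprtj_mlip_dft_energy.pkl", "fe_mlip_dft_energy.pkl")


def find_result_files(eval_dir, file_names=DEFAULT_FILE_NAMES):
    """Map each sub-directory name to the result files present in it."""
    found = {}
    for model_dir in sorted(Path(eval_dir).iterdir()):
        if not model_dir.is_dir():
            continue  # skip stray files in the evaluation directory
        present = [name for name in file_names
                   if (model_dir / name).is_file()]
        found[model_dir.name] = present
    return found
```

Calling find_result_files("output") returns an empty list for any model directory where the database evaluation has not been run yet.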
- For running the predictions on fe_properties:
  - --model_type
  - --model_yaml_file
  - --eval_mlip_on_fe
  - --eval_config
Example usage:
run_mlip_evaluation --model_type chgnet --model_yaml_file filepath --eval_mlip_on_fe --eval_fe_imp
- For predicting the energies of a list of structures saved as a pickle file (the DFT energies are expected to be saved in the file):
  - --model_type
  - --eval_db
  - --db_name
  - --db_path
  - --dft_energy_key
  - --is_dft_energy_per_atom (to be used if the energy corresponding to dft_energy_key is in eV/atom)
- For getting the RMSE values from the saved mlip_dft_energy.pkl:
  - --eval_rmse
  - --eval_dir
  - --mlip_rmse_key
  - --dft_rmse_key
  - --mprtj_file_name
  - --fe_file_name
The results.ipynb can be used to visualise the data after simulations are done.
Note
If you would just like to visualise the output from the publication, download the results from the data repository and save the output directory in the same directory as the Jupyter notebook results.ipynb and the plot_scripts directory to make the plots.