naveenmohandas/eval_mlip_on_fe

About

This repository provides the scripts and workflow used to evaluate the performance of foundational machine-learned interatomic potentials (MLIPs) for bcc Fe, as presented in the research article Fine-Tuning Universal Machine-Learned Interatomic Potentials for Applications in the Science of Steels. The scripts are not tied to any specific MLIP; any potential that implements the ASE calculator interface can be evaluated.

Usage

  • Clone the repository locally: git clone git@github.com:naveenmohandas/eval_mlip_on_fe.git
  • Create a virtual environment and install the package (uv is used here, but any Python package manager of your choice, such as mamba or conda, works):
    cd eval_mlip_on_fe
    uv venv
    uv pip install .
    
  • Activate the environment (not needed when running with uv):
    source .venv/bin/activate
    

Note

It is possible that multiple MLIPs cannot be installed simultaneously in the same environment. In that case, use a separate environment for each MLIP. Make sure the packages ase, pymatgen, pyyaml, pytorch, and seaborn are present in each environment.

Example snippets

To evaluate a calculator

To evaluate an MLIP model that implements the ASE calculator interface:

  • Install the MLIP following its installation procedure in the same environment as eval_mlip_on_fe. For the example here: uv pip install tensorpotential.
  • Load the calculator and call the eval_calculator function. For example, for GRACE the script below can be used to evaluate the MLIP:
from tensorpotential.calculator import grace_fm
from eval_mlip_on_fe import eval_calculator

model_name = "GRACE-2L-OMAT-medium-ft-AM"
calc = grace_fm(model_name)  
eval_calculator(calc, model_name)
  • Save the above script in a file eval_grace.py in the same directory or a sub-directory.
  • Run it with uv run eval_grace.py

The results are saved in a directory called output, in a sub-directory named after model_name.

Note

Some properties, such as $C_P$ and $\alpha_L$, require MD simulations. By default these run for 100000 steps with 432 atoms and can therefore be computationally expensive for large models.

If you would like to calculate only some of the properties, pass them as a keyword argument. For example, to compute only the lattice constant and elastic tensors:

from eval_mlip_on_fe import eval_calculator


eval_properties = ['lattice_constant', 'elastic_tensors']
eval_calculator(calc, model_name, eval_properties=eval_properties)

To get the list of implemented properties

from eval_mlip_on_fe import get_impletemented_properties

print("The following properties can be predicted for Fe:")
get_impletemented_properties()

For reproducing results from the article

The fine-tuned potentials and the dataset referenced in the article can be downloaded here. The section below shows an easier way to rerun the calculations when evaluating multiple models of the same type.

CHGNet, Sevennet and MACE

For CHGNet, Sevennet and MACE, a config yaml file can be used to make evaluations easier. An example template is given below. calc_name is the name of the sub-directory inside the output directory where the output files will be saved. calc_path is the path to the fine-tuned model; for CHGNet, MACE and Sevennet, a model name can also be used in calc_path to load a foundational model. The models are evaluated in series. By default the yaml file is named exp_info_{model_type}.yaml, where model_type is chgnet, mace or sevennet.

A template for exp_info_chgnet.yaml is given below:

1:
  calc_name: CHG2
  calc_path: 0.2.0 
2:
  calc_name: CHG3
  calc_path: 0.3.0 
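A config with this shape can be parsed with pyyaml (one of the packages listed above). The sketch below is an illustrative guess at how such a file is consumed, not the repository's actual loader; only the calc_name/calc_path keys come from the template above.

```python
import yaml  # pyyaml, listed among the required packages

# Parse a config with the same shape as exp_info_chgnet.yaml above.
# In practice: config = yaml.safe_load(open("exp_info_chgnet.yaml"))
config_text = """
1:
  calc_name: CHG2
  calc_path: 0.2.0
2:
  calc_name: CHG3
  calc_path: 0.3.0
"""
config = yaml.safe_load(config_text)

# Each numbered entry describes one model; they are evaluated in series.
for idx in sorted(config):
    entry = config[idx]
    print(idx, entry["calc_name"], entry["calc_path"])
```

Note that yaml parses the numbered keys as integers and a version-like value such as 0.2.0 as a string, so calc_path can hold either a model name or a file path.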

Running run_mlip_evaluation

Predict Fe properties

To predict the properties of Fe such as the lattice parameter, vacancy formation energy, and so on:

run_mlip_evaluation --model_type chgnet --eval_mlip_on_fe

The script loads the model-related info from exp_info_{model_type}.yaml.

Important

If a relaxation does not converge, an error is raised.

Performance on a dataset

CHGNET and MPTRJ dataset

run_mlip_evaluation --eval_db --model_type chgnet --db_path {path_to_mptrj_dataset} --db_name mptrj --is_dft_energy_per_atom True --dft_energy_key energy_per_atom

  • --model_type selects the parent model.
  • --db_path gives the path to the pickle file with the list of structures.
  • --db_name gives a name used as a prefix when saving the resulting pickle file.
  • {path_to_mptrj_dataset} gives the path to the mptrj dataset in .xyz format.

By default the files are saved in output/{model_name}, with model_name taken from the input yaml file.

Note

The additional flag --dft_energy_key energy_per_atom for chgnet is needed because CHGNet uses the corrected energy per atom, and in the dataset used here that energy is saved under the key energy_per_atom. Since the DFT energy in this dataset is stored per atom, --is_dft_energy_per_atom must also be set to True. For mace and other MLIPs, the key uncorrected_total_energy is used by default; if you use a different dataset, specify dft_energy_key accordingly. After predicting for all structures, the result dictionary is saved as a pickle file with the keys dft_energy_per_atom and mlip_energy_per_atom.
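Once such a pickle has been written, the RMSE can be recovered directly from the two keys. A minimal stdlib-only sketch (the dictionary layout follows the description above; the energy values and the commented file path are made-up examples):

```python
import math
import pickle

# Illustrative result dictionary with the keys described above; in practice
# it would be loaded from the saved pickle file, e.g.:
#   with open("output/CHG2/mprtj_mlip_dft_energy.pkl", "rb") as fh:
#       result = pickle.load(fh)
result = {
    "dft_energy_per_atom": [-8.31, -8.25, -8.40],   # fabricated example values
    "mlip_energy_per_atom": [-8.29, -8.27, -8.35],
}

def rmse(dft, mlip):
    """Root-mean-square error between two equal-length energy lists."""
    return math.sqrt(sum((d - m) ** 2 for d, m in zip(dft, mlip)) / len(dft))

print(f"RMSE: {rmse(result['dft_energy_per_atom'], result['mlip_energy_per_atom']):.4f} eV/atom")
```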

MACE and MPTRJ dataset

run_mlip_evaluation --model_type mace --db_path {path_to_mptrj_dataset} --db_name mptrj

For other datasets

run_mlip_evaluation --model_type mace --db_path {path_to_dataset} --db_name fe

  • {path_to_dataset} gives the path to the dataset in .xyz format.

Getting the RMSE after the prediction runs

run_mlip_evaluation --eval_db --get_rmse --eval_dir output/

This evaluates all the sub-directories. The file names can be given with the arguments --mptrj_file_name and --fe_file_name; by default it looks for mprtj_mlip_dft_energy.pkl and fe_mlip_dft_energy.pkl.
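The directory scan can be sketched as below, assuming each model sub-directory of output/ holds a pickle with the two energy keys described earlier. The helper name and traversal logic are illustrative, not the repository's actual implementation.

```python
import math
import os
import pickle

def collect_rmse(eval_dir, file_name="fe_mlip_dft_energy.pkl"):
    """Scan every model sub-directory of eval_dir for file_name and
    return {model_name: rmse}. Sub-directories without the file are skipped."""
    results = {}
    for model_name in sorted(os.listdir(eval_dir)):
        path = os.path.join(eval_dir, model_name, file_name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as fh:
            data = pickle.load(fh)
        dft = data["dft_energy_per_atom"]
        mlip = data["mlip_energy_per_atom"]
        results[model_name] = math.sqrt(
            sum((d - m) ** 2 for d, m in zip(dft, mlip)) / len(dft)
        )
    return results
```

For example, collect_rmse("output/") would return one RMSE per evaluated model directory.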

List of args:

  • For running the predictions on fe_properties
    • --model_type
    • --model_yaml_file
    • --eval_mlip_on_fe
    • --eval_config

Example usage: run_mlip_evaluation --model_type chgnet --model_yaml_file filepath --eval_mlip_on_fe --eval_fe_imp

  • For predicting the energies of a list of structures saved as a pickle file (the DFT energy is expected to be saved in it):

    • --model_type
    • --eval_db
    • --db_name
    • --db_path
    • --dft_energy_key
    • --is_dft_energy_per_atom (to be used if the energy corresponding to the dft_energy_key is eV/atom)
  • For getting the RMSE values from the saved mlip_dft_energy.pkl:

    • --eval_rmse
    • --eval_dir
    • --mlip_rmse_key
    • --dft_rmse_key
    • --mprtj_file_name
    • --fe_file_name

Making plots

The results.ipynb notebook can be used to visualise the data after the simulations are done.

Note

If you would just like to visualise the output from the publication, download the results from the data repository and save the output directory, together with the plot_scripts directory, in the same directory as the jupyter notebook results.ipynb to make the plots.
