GitHub - msehabibur/defect_GNN_gen_1: Accelerating Defect Prediction in Semiconductors Using Graph Neural Networks

Accelerating Defect Prediction in Semiconductors Using Graph Neural Networks

First-principles computations reliably predict the energetics of point defects in semiconductors, but they are constrained by the expense of large supercells and advanced levels of theory. Machine learning models trained on computational data, especially ones that sufficiently encode defect coordination environments, can be used to accelerate defect predictions.

Here, we develop a framework for the prediction and screening of native defects and functional impurities in a chemical space of Group IV, III-V, and II-VI zinc blende (ZB) semiconductors, powered by crystal Graph-based Neural Networks (GNNs) trained on high-throughput density functional theory (DFT) data. Using an innovative approach of sampling partially optimized defect configurations from DFT calculations, we generate one of the largest computational defect datasets to date, containing many types of vacancies, self-interstitials, anti-site substitutions, impurity interstitials and substitutions, as well as some defect complexes.

We apply three established GNN techniques, namely Crystal Graph Convolutional Neural Network (CGCNN), Materials Graph Network (MEGNet), and Atomistic Line Graph Neural Network (ALIGNN), to rigorously train models for predicting defect formation energy (DFE) in multiple charge states and chemical potential conditions. We find that ALIGNN yields the best DFE predictions, with root mean square errors around 0.3 eV, which represents a prediction accuracy of 98% given the range of values within the dataset, improving significantly on the state of the art.
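As a rough illustration of how an RMSE can be related to a percentage accuracy "given the range of values", the sketch below assumes the accuracy is defined as 1 - RMSE / (max - min) of the reference values. That definition is an assumption made for this example and is not stated in the text above, and the energies are synthetic numbers invented for illustration, not values from the dataset.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def range_normalized_accuracy(predicted, reference):
    """Assumed definition: 1 - RMSE / (max(reference) - min(reference))."""
    span = max(reference) - min(reference)
    return 1.0 - rmse(predicted, reference) / span


# Synthetic energies in eV, invented for illustration only.
dft_dfe = [0.0, 2.5, 5.0, 7.5, 10.0]
gnn_dfe = [0.2, 2.3, 5.2, 7.3, 10.2]

error = rmse(gnn_dfe, dft_dfe)
accuracy = range_normalized_accuracy(gnn_dfe, dft_dfe)
print(f"RMSE = {error:.2f} eV, range-normalized accuracy = {accuracy:.1%}")
```

Under this assumed definition, the same RMSE maps to a higher percentage when the dataset spans a wider range of formation energies, so the percentage is only meaningful alongside the RMSE itself.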

Models are tested for different defect types as well as for defect charge transition levels. We further show that GNN-based defective structure optimization can take us close to DFT-optimized geometries at a fraction of the cost of full DFT. DFT-GNN models enable prediction and screening across thousands of hypothetical defects based on both unoptimized and partially optimized defective structures, helping identify electronically active defects in technologically important semiconductors.
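For readers unfamiliar with charge transition levels, the following minimal sketch shows how one can be obtained from two predicted DFEs. It assumes the usual convention that the DFE of charge state q varies linearly with the Fermi level measured from the valence band maximum (VBM). The function names and the numerical values are hypothetical and are not taken from the repository.

```python
def formation_energy(e_form_at_vbm, charge, fermi_level):
    """DFE of charge state `charge` at `fermi_level` (eV above the VBM)."""
    return e_form_at_vbm + charge * fermi_level


def transition_level(e_form_q1, q1, e_form_q2, q2):
    """Fermi level at which charge states q1 and q2 have equal DFE.

    Both input energies are DFEs evaluated with the Fermi level at the VBM.
    """
    if q1 == q2:
        raise ValueError("charge states must differ")
    return (e_form_q1 - e_form_q2) / (q2 - q1)


# Hypothetical DFEs in eV for the +1 and neutral states of one defect.
e_plus, e_neutral = 1.0, 1.5
level = transition_level(e_plus, +1, e_neutral, 0)
print(f"(+1/0) transition level: {level:.2f} eV above the VBM")
```

At the returned Fermi level, `formation_energy` gives the same value for both charge states, which is the defining property of a transition level.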

To train the GNN models, the following resources were used:

Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties: https://github.com/txie-93/cgcnn
Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals: https://github.com/materialsvirtuallab/megnet
Atomistic Line Graph Neural Network for improved materials property predictions: https://github.com/usnistgov/alignn

The following ALIGNN hyperparameters were used in the paper:

train_ratio: 0.6 (60% of the data is used for training)
val_ratio: 0.2 (20% of the data is used for validation)
test_ratio: 0.2 (20% of the data is used for testing)
epochs: 100-120 (The model is trained for 100-120 epochs)
batch_size: 8-32 (Size of each batch of data during training)
weight_decay: 1e-05 (Regularization parameter to prevent overfitting)
learning_rate: 0.001 (Step size for the optimization algorithm)
optimizer: 'adamw' (Optimizer used for training, a variant of Adam optimizer)
cutoff: 8.0 (Cutoff distance for graph construction)
max_neighbors: 12 (Maximum number of neighbors for each atom)
alignn_layers: 4 (Number of ALIGNN layers)
gcn_layers: 4 (Number of GCN layers)
atom_input_features: 92 (Number of atom input features)
edge_input_features: 80 (Number of edge input features)
triplet_input_features: 40 (Number of triplet input features)
embedding_features: 64 (Number of embedding features)
hidden_features: 256 (Number of hidden features)
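The hyperparameters listed above can be collected into a single configuration object, sketched below. The nesting of the network settings under a `model` key follows the layout of ALIGNN's example config files as far as we are aware, but treat it as an assumption and check it against the ALIGNN version you use. Because `epochs` and `batch_size` are given as ranges, the single values chosen here are illustrative picks, not the exact values of any particular run.

```python
import json

# Values copied from the hyperparameter list above, except where noted.
config = {
    "train_ratio": 0.6,
    "val_ratio": 0.2,
    "test_ratio": 0.2,
    "epochs": 100,      # listed as 100-120; one value picked for illustration
    "batch_size": 16,   # listed as 8-32; one value picked for illustration
    "weight_decay": 1e-05,
    "learning_rate": 0.001,
    "optimizer": "adamw",
    "cutoff": 8.0,
    "max_neighbors": 12,
    # Assumption: network settings are nested under "model".
    "model": {
        "name": "alignn",
        "alignn_layers": 4,
        "gcn_layers": 4,
        "atom_input_features": 92,
        "edge_input_features": 80,
        "triplet_input_features": 40,
        "embedding_features": 64,
        "hidden_features": 256,
    },
}

# Sanity check: the three split ratios should cover the whole dataset.
split_total = config["train_ratio"] + config["val_ratio"] + config["test_ratio"]
if abs(split_total - 1.0) > 1e-9:
    raise ValueError(f"split ratios sum to {split_total}, expected 1.0")

config_text = json.dumps(config, indent=2)
print(config_text)
```

The printed JSON can be saved to a file and adapted as a starting point for a training configuration.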

All the data (crystals) used to train the models are included in this repository, along with the script that performs gradient-free energy minimization. A checkpoint of the optimized ALIGNN models trained on Datasets 1+2+3+4 is also included and can be used to predict unoptimized formation energies.
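To give a feel for what gradient-free energy minimization means, here is a minimal random-search sketch. It is not the script shipped in this repository: the acceptance rule, the step size, and the function names are assumptions chosen for illustration, and `toy_energy` is a stand-in for the energy that a trained GNN model would supply.

```python
import random


def toy_energy(positions, reference):
    """Stand-in energy: sum of squared displacements from reference sites."""
    return sum(
        (x - r) ** 2
        for atom, ref in zip(positions, reference)
        for x, r in zip(atom, ref)
    )


def gradient_free_minimize(positions, energy_fn, steps=2000, max_step=0.05, seed=0):
    """Random search: displace one atom at a time, keep moves that lower the energy."""
    rng = random.Random(seed)
    current = [list(atom) for atom in positions]  # copy, leave the input untouched
    best_energy = energy_fn(current)
    for _ in range(steps):
        i = rng.randrange(len(current))
        trial = [list(atom) for atom in current]
        trial[i] = [x + rng.uniform(-max_step, max_step) for x in trial[i]]
        trial_energy = energy_fn(trial)
        if trial_energy < best_energy:  # no gradients or forces are needed
            current, best_energy = trial, trial_energy
    return current, best_energy


# Hypothetical two-atom example with coordinates in arbitrary units.
reference = [[0.0, 0.0, 0.0], [1.4, 1.4, 1.4]]
start = [[0.2, -0.1, 0.1], [1.3, 1.5, 1.2]]


def energy_fn(pos):
    return toy_energy(pos, reference)


initial_energy = energy_fn(start)
relaxed, final_energy = gradient_free_minimize(start, energy_fn)
print(f"energy: {initial_energy:.4f} -> {final_energy:.6f}")
```

Because each step only needs an energy evaluation, a scheme of this kind can be driven by any model that maps a structure to an energy, at the cost of many more evaluations than a force-based optimizer would need.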

Repository contents:

Codes
Dataset_1
Dataset_2
Dataset_3
Dataset_4
HT-DFT_Screening
defect_migration
README.md
