
Commit 01dda1f: Add files via upload (parent 92cbe8c)
4 files changed: +302 lines, -0 lines

source/_posts/abacus_ai.md

Lines changed: 87 additions & 0 deletions
---
title: "What Else Can ABACUS Do? | Using AI to Describe the Kinetic Energy of Electrons in Semiconductors"
date: 2024-11-18
categories:
- ABACUS
---

Recently, doctoral student Sun Liang and researcher Chen Mohan from the Center for Applied Physics and Technology at Peking University implemented a machine learning-based kinetic energy density functional (multi-Channel ML-based Physically-constrained Non-local KEDF, or CPN KEDF) in the domestically developed open-source density functional theory software ABACUS (Atomic-orbital Based Ab-initio Computation at UStc). This functional employs a multi-channel architecture, extending the previously developed MPN KEDF (ML-based Physically-constrained Non-local KEDF) [1], which was designed for simple metallic systems, to semiconductors. The method achieved promising results in tests on ground-state energy and ground-state charge density, laying the groundwork for machine learning-based kinetic energy density functionals with broader applicability. The article, titled "Multi-channel machine learning-based nonlocal kinetic energy density functional for semiconductors," has been published in the journal Electronic Structure (DOI: 10.1088/2516-1075/ad8b8c) [2].
<!-- more -->
## Research Background
### **Orbital Free Density Functional Theory and Kinetic Energy Density Functional**
**Orbital Free Density Functional Theory (OFDFT)** is a highly efficient density functional theory method, benefiting from a computational complexity of \( O(N) \) or \( O(N \ln N) \), where \( N \) is the number of atoms in the system. This enables it to handle systems with millions of atoms. In contrast, the commonly used Kohn–Sham DFT (KSDFT) has a computational complexity of roughly \( O(N^3) \), limiting it to systems containing at most thousands of atoms. The OFDFT algorithm has been implemented in ABACUS. For more details, refer to the ABACUS online Chinese documentation: [https://mcresearch.github.io/abacus-user-guide](https://mcresearch.github.io/abacus-user-guide).
The lower computational complexity of OFDFT stems from its use of a kinetic energy density functional (KEDF) that depends only on the electron density to approximate the non-interacting kinetic energy. As a result, it avoids the most time-consuming step of KSDFT, the diagonalization of the Kohn–Sham Hamiltonian. However, the non-interacting kinetic energy is of the same order of magnitude as the total energy in condensed matter and molecular systems, so the accuracy of the approximate kinetic energy functional largely determines the precision of OFDFT calculations.
Over the past few decades, researchers have proposed various kinetic energy functionals suitable for different systems. For instance:
- **Wang-Teter (WT) functional** [3]: applicable to simple metals;
- **Wang-Govind-Carter (WGC) functional** [4]: applicable to simple metals;
- **Huang-Carter (HC) functional** [5]: applicable to semiconductors.
Despite these developments, finding a universal kinetic energy functional capable of describing both simple metals and semiconductors remains a challenge.
### **Machine Learning-Based Kinetic Energy Density Functionals**
In recent years, researchers have begun exploring machine learning to construct kinetic energy density functionals. These methods leverage machine learning models to capture the relationship between the electron density (or related quantities) and kinetic energy. However, there is currently no unified framework for constructing such functionals. To address this, we propose three fundamental requirements for machine learning-based kinetic energy density functionals:
1. **Incorporation of nonlocal information**: Since the non-interacting kinetic energy is a highly nonlocal quantity, machine learning-based functionals should include nonlocal electron density information.
2. **Adherence to physical constraints**: Although the exact form of the non-interacting kinetic energy functional is unknown, several exact physical properties (e.g., scaling behavior, the Pauli contribution, the free electron gas [FEG] limit) must be satisfied. Building these properties into machine learning models can improve their accuracy and transferability (a minimal numerical check of the scaling behavior appears after this list).
3. **Stability in computation**: Numerical stability is essential as the functional serves as the foundation for practical calculations.
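To make the scaling constraint concrete: the exact non-interacting kinetic energy obeys \( T_s[\rho_\lambda] = \lambda^2 T_s[\rho] \) for \( \rho_\lambda(\mathbf{r}) = \lambda^3 \rho(\lambda \mathbf{r}) \). Below is a minimal numpy check of this condition for the Thomas–Fermi functional; the grid, the Gaussian test density, and the function names are our own illustration, not code from the paper.

```python
import numpy as np

C_TF = 0.3 * (3.0 * np.pi**2) ** (2.0 / 3.0)  # Thomas-Fermi constant (atomic units)

def t_tf(rho, dv):
    # Thomas-Fermi kinetic energy: C_TF * integral of rho^(5/3) over the grid
    return C_TF * np.sum(rho ** (5.0 / 3.0)) * dv

# Gaussian test density on a cubic real-space grid
x = np.linspace(-8.0, 8.0, 81)
dv = (x[1] - x[0]) ** 3
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
r2 = X**2 + Y**2 + Z**2

lam = 1.5
rho = np.exp(-r2)                          # rho(r)
rho_lam = lam**3 * np.exp(-(lam**2) * r2)  # rho_lambda(r) = lam^3 * rho(lam * r)

# Exact scaling condition: T[rho_lambda] = lam^2 * T[rho]
print(t_tf(rho_lam, dv) / t_tf(rho, dv))   # ~2.25, i.e. lam**2
```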
Based on these requirements, we constructed the **MPN functional**, expressed as:
\[
\int w(|\mathbf{r} - \mathbf{r}'|)\, f(\rho(\mathbf{r}'))\, \mathrm{d}\mathbf{r}',
\]
where \( w(|\mathbf{r} - \mathbf{r}'|) \) is the kernel describing nonlocal interactions, and \( f(\rho(\mathbf{r}')) \) is a function of the local electron density. In addition, the MPN functional incorporates penalties and constraints to enforce the three requirements above, thereby achieving numerical stability in calculations. Tests show that the MPN functional achieves high accuracy in describing simple metals and alloys.
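On a uniform periodic grid, integrals of this form are convolutions, so they can be evaluated with fast Fourier transforms; this is what keeps the cost at \( O(N \ln N) \). A minimal numpy sketch, assuming the kernel is supplied in reciprocal space (the function and argument names are our own, not code from ABACUS):

```python
import numpy as np

def nonlocal_descriptor(rho, w_recip, f=lambda x: x):
    """Evaluate  int w(|r - r'|) f(rho(r')) dr'  on a periodic grid.

    rho     : electron density on a real-space grid, shape (nx, ny, nz)
    w_recip : kernel w sampled in reciprocal space on the same grid
    f       : local function of the density
    By the convolution theorem this is one forward and one inverse FFT
    (cell-volume normalization factors omitted for brevity).
    """
    return np.real(np.fft.ifftn(w_recip * np.fft.fftn(f(rho))))
```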
However, we found that the MPN functional struggled to capture the bonding characteristics of semiconductors. This is because the charge density distribution in semiconductors differs significantly from that of simple metals: while simple metals exhibit a nearly uniform electron density, semiconductors possess distinct covalent bonds, with charge accumulating in the bonding regions and depleted elsewhere. To address this, we introduced a "multi-channel" architecture to better describe the electrons in semiconductors.
## The Design of the Neural Network
### **Multi-Channel Architecture**
As shown in the figure, we scaled the kernel function in the MPN functional as:
\[
w_\lambda(|\mathbf{r} - \mathbf{r}'|) = \lambda^3 w(\lambda |\mathbf{r} - \mathbf{r}'|),
\]
and used it to replace the kernel function in the MPN functional, transforming the input electron density into a new set of descriptors. Each scaled kernel function thus defines a "channel." When \( \lambda > 1 \), the kernel function is compressed, enabling it to capture local electron information; conversely, when \( \lambda < 1 \), the kernel function is stretched, making it capable of capturing long-range electron information. Finally, the descriptors from the different channels are combined into a new vector, which is fed into a neural network with a structure similar to that of the MPN functional to obtain the final kinetic energy density functional.
<center><img src="https://dp-public.oss-cn-beijing.aliyuncs.com/community/abacus_ai/aba1.png" width="100%" /></center>
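In reciprocal space the scaling \( w_\lambda(r) = \lambda^3 w(\lambda r) \) becomes simply \( \tilde{w}(k/\lambda) \), so each channel costs one additional FFT convolution. A minimal sketch of how such channel descriptors could be assembled (the callable kernel and the function names are our own assumptions, not code from ABACUS):

```python
import numpy as np

def channel_descriptors(rho, w_tilde, lambdas, k_norm):
    """One descriptor per channel, each from a scaled kernel.

    w_tilde : callable giving the reciprocal-space kernel w~(|k|)
    lambdas : scaling factors; lambda > 1 compresses the real-space kernel
              (local information), lambda < 1 stretches it (long-range)
    k_norm  : |k| on the reciprocal-space grid, same shape as rho
    Since w_lambda(r) = lambda^3 * w(lambda * r) transforms to
    w~(|k| / lambda), the scaling is essentially free in reciprocal space.
    """
    rho_k = np.fft.fftn(rho)
    return np.stack([np.real(np.fft.ifftn(w_tilde(k_norm / lam) * rho_k))
                     for lam in lambdas])
```

The stacked descriptors are then concatenated and passed to the neural network, as described above.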
We named the new functionals **CPN\(_n\)**, where the subscript \(n\) represents the number of channels. To study the impact of different numbers of channels on the accuracy of the functional, we constructed three versions: **CPN\(_1\)**, **CPN\(_3\)**, and **CPN\(_5\)**. The details of the descriptors and parameters are shown in the table below.
<center><img src="https://dp-public.oss-cn-beijing.aliyuncs.com/community/abacus_ai/aba2.png" width="100%" /></center>
### **Loss Function and Training Set**
The loss function for the CPN functional is the same as that used for the MPN functional:
\[
L = \frac{1}{N} \sum_{\mathbf{r}} \left[ \left(\frac{F^\text{NN}_\theta - F^\text{KS}_\theta}{\overline{F^\text{KS}_\theta}}\right)^2 + \left(\frac{V^\text{NN}_\theta - V^\text{KS}_\theta}{\overline{V^\text{KS}_\theta}}\right)^2 + \left(F^\text{NN}_\text{FEG} - \ln(e - 1)\right)^2 \right],
\]
where \(N\) is the number of spatial grid points, and \(\overline{F^\text{KS}_\theta}\) and \(\overline{V^\text{KS}_\theta}\) are the averages of \(F^\text{KS}_\theta\) and \(V^\text{KS}_\theta\), respectively.
- The first term reinforces the functional with energy-related information, enhancing the numerical stability of the model.
- The second term fits the corresponding potential, the quantity that directly drives the density optimization.
- The third term ensures that corrections to the free electron gas (FEG) limit are not excessive, maintaining stability in calculations (a PyTorch sketch of the loss follows).
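The sketch below is our own PyTorch rendering of the formula above (tensor names and shapes are assumptions). One detail worth noting: the FEG target \( \ln(e - 1) \) is consistent with a softplus output layer, since \( \mathrm{softplus}(\ln(e - 1)) = \ln(1 + (e - 1)) = 1 \), pinning the enhancement factor to exactly one in the FEG limit.

```python
import math

import torch

def cpn_loss(F_nn, F_ks, V_nn, V_ks, F_feg_raw):
    """Loss over N grid points; F_* and V_* are tensors of shape (N,).
    F_feg_raw is the network's pre-activation output for the FEG density
    (assumed to pass through a softplus, hence the ln(e - 1) target)."""
    term_F = torch.mean(((F_nn - F_ks) / F_ks.mean()) ** 2)  # energy information
    term_V = torch.mean(((V_nn - V_ks) / V_ks.mean()) ** 2)  # potential information
    term_FEG = (F_feg_raw - math.log(math.e - 1.0)) ** 2     # free-electron-gas limit
    return term_F + term_V + term_FEG
```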
The training set for the CPN functional includes 10 semiconductor systems: silicon (Si) in the diamond structure and the zincblende-structured compounds AlP, AlAs, AlSb, GaP, GaAs, GaSb, InP, InAs, and InSb. Each system is sampled on \(27 \times 27 \times 27\) spatial grid points, giving 196,830 grid points in the training set.
## Results
<center><img src="https://dp-public.oss-cn-beijing.aliyuncs.com/community/abacus_ai/aba3.png" width="100%" /></center>
The figure above shows the energy-volume curve for Si in the diamond structure. The **CPN\(_1\)** KEDF, which includes only one channel, fails to produce a smooth curve. As the number of channels increases, however, the accuracy improves: the **CPN\(_3\)** KEDF surpasses the WGC KEDF, while the **CPN\(_5\)** KEDF achieves accuracy comparable to that of KSDFT and the HC KEDF.
<center><img src="https://dp-public.oss-cn-beijing.aliyuncs.com/community/abacus_ai/aba4.png" width="100%" /></center>
The figure above shows the ground-state electron densities of Si in the diamond structure, GaAs in the zincblende structure, and systems from the training set. First, all three CPN functionals yield smooth electron densities. Second, as more channels are included, the accuracy of the CPN functionals progressively improves, indicating that the multi-channel architecture effectively enhances the ability of machine learning-based functionals to describe semiconductors. Finally, describing covalent bonding remains a significant challenge for kinetic energy density functionals: as shown in the figure, even the HC functional, which is specifically designed for semiconductors, underestimates the electron density in covalent bond regions. In both the training and test sets, however, the five-channel **CPN\(_5\)** KEDF accurately captures and reconstructs the covalent bonding structure, demonstrating that the descriptors and multi-channel architecture effectively capture the characteristics of electron densities in semiconductors.

source/_posts/dftio.md

Lines changed: 86 additions & 0 deletions
---
title: "dftio | Collaborating with the DeepModeling Community to Develop Efficient Electronic Structure Data Processing Tools"
date: 2024-11-11
categories:
- dftio
---

**DFTIO**, initiated by the DeePTB team at the AI for Science Institute, Beijing (AISI), is an efficient electronic structure data processing tool designed to convert the electronic structure output of various first-principles/quantum computation software packages into data formats that are easy for machine learning models to read.
In recent years, machine learning-based first-principles electronic structure models have developed rapidly, including but not limited to machine learning tight-binding models, Hamiltonian models, electronic density models, and functional models. With the advancement of these models, we have increasingly realized that the post-processing of output data from different electronic structure calculation software has become a common challenge for both developers and users.
<!-- more -->
On the one hand, method developers often spend considerable time consulting the documentation of various electronic structure codes and writing data processing scripts so that their machine learning frameworks can read data in batches. On the other hand, for users, differences in data format definitions between methods can hinder learning new methods and comparing results across approaches.
As machine learning-accelerated solutions for electronic structure modeling become more widespread, a toolkit that separates data operations from method development could address these issues simultaneously. Therefore, a general post-processing tool capable of interfacing with electronic structure data output from a variety of mainstream first-principles software has become an indispensable bridge between machine learning-based electronic structure method developers and software users.
DFTIO fulfills this need by providing a streamlined and efficient solution to bridge the gap between diverse electronic structure outputs and machine learning frameworks, enhancing usability and accelerating development in this field.
## Functions
The main features of **DFTIO** include: data reading, data manipulation, preprocessing, and postprocessing of outputs from electronic structure software. It is designed to support developers, assist users in data conversion for machine learning datasets and model predictions, and facilitate visualization and analysis. The general framework of DFTIO is shown in the figure below, with specific functions as follows:
**1. Data Reading and Writing**
Based on the target of reading and writing, this can be divided into three categories:
- **DFT data → DFTIO datasets**
- **DFTIO datasets → PyTorch Dataset objects**
- **Machine learning model predictions → Readable DFT software data** (e.g., density grid points, etc.)
**2. Data Manipulation**
Based on the target of manipulation, this can also be divided into two categories:
- **Symbolic matrix data → Tensor-like data**
- **Symbolic matrix or tensor-like data → Grid data, visualizations, analyses, etc.**
<center><img src="https://dp-public.oss-cn-beijing.aliyuncs.com/community/dftio/dftio1.png" width="100%" /></center>
### Data I/O
**DFT Data → DFTIO Dataset**
DFTIO supports converting output data from DFT software into machine learning datasets with a single command. It currently supports two output data formats (an illustrative reading sketch follows the list):
1. **.dat Text Format**:
   - Atomic structure information is stored in text format.
   - Operator matrices are stored in HDF5 format (via h5py).
   - Easy to read and parse, with high I/O efficiency during training.
   - Suitable for training on small to medium-sized datasets.

2. **.lmdb Database Format**:
   - Data is stored in compressed binary format.
   - Suitable for large-scale datasets (10,000+ configurations).
   - Fast read/write speeds with minimal storage overhead.
   - Ideal for large-scale data training.
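As an illustration only, the two storage styles would be read roughly as follows; the paths, keys, and layout below are hypothetical, not dftio's actual schema.

```python
import pickle

import h5py
import lmdb

# .dat-style dataset: operator matrices live in an HDF5 file
with h5py.File("dataset/hamiltonians.h5", "r") as f:  # hypothetical path/layout
    frame_keys = list(f.keys())
    first_matrix = f[frame_keys[0]][...]

# .lmdb-style dataset: one compressed binary record per configuration
env = lmdb.open("dataset.lmdb", readonly=True, lock=False)
with env.begin() as txn:
    record = pickle.loads(txn.get(b"0"))              # hypothetical key scheme
env.close()
```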
DFTIO's data conversion supports parallel processing, making it capable of handling large datasets efficiently. To facilitate support for new software, DFTIO separates general functionality such as I/O operations, parallel settings, and data processing from the core data conversion logic: a universal output format and corresponding data interfaces have been defined, so adding support for new software only requires developers to implement a few simple data conversion methods (an illustrative sketch follows the figure below).
<center><img src="https://dp-public.oss-cn-beijing.aliyuncs.com/community/dftio/dftio2.png" width="100%" /></center>
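The plug-in pattern described above might look roughly like the sketch below; the class and method names are hypothetical, written only to show the separation of concerns, and are not dftio's actual interfaces.

```python
from abc import ABC, abstractmethod

class DFTParser(ABC):
    """Hypothetical base class: shared I/O, parallelism, and dataset writing
    live in the framework; a new DFT code implements only a few hooks."""

    @abstractmethod
    def read_structure(self, path: str) -> dict:
        """Return species, cell, and positions for one output directory."""

    @abstractmethod
    def read_operators(self, path: str) -> dict:
        """Return operator matrices (e.g., Hamiltonian, overlap) as arrays."""

class MyCodeParser(DFTParser):
    """Supporting a new code then means implementing just these two hooks."""

    def read_structure(self, path: str) -> dict:
        ...  # parse the code's structure output

    def read_operators(self, path: str) -> dict:
        ...  # parse the code's operator matrices
```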
**DFTIO Dataset → PyTorch Dataset Class**
DFTIO includes a ready-to-use PyTorch Dataset class. This class directly reads datasets converted by DFTIO and provides standard interfaces, so developers can focus on designing and optimizing machine learning methods and models while relying on DFTIO's pre-defined PyTorch Dataset types as a dependency. Data processing itself can be handled with DFTIO's command-line tools, saving a significant amount of redundant effort.
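As an illustration of what this looks like from the developer's side, here is a minimal sketch; the file layout and field names are hypothetical, not dftio's actual classes or schema.

```python
import h5py
import torch
from torch.utils.data import Dataset

class OperatorDataset(Dataset):
    """Illustrative only: wraps an HDF5 file with one group per configuration."""

    def __init__(self, h5_path):
        self.file = h5py.File(h5_path, "r")
        self.keys = list(self.file.keys())

    def __len__(self):
        return len(self.keys)

    def __getitem__(self, idx):
        grp = self.file[self.keys[idx]]
        return {
            "positions": torch.as_tensor(grp["positions"][...]),
            "hamiltonian": torch.as_tensor(grp["hamiltonian"][...]),
        }

# A standard torch.utils.data.DataLoader then works on it out of the box.
```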
### Data Manipulation
<center><img src="https://dp-public.oss-cn-beijing.aliyuncs.com/community/dftio/dftio3.png" width="100%" /></center>
The fundamental data of electronic structure calculations typically come in formats such as equivariant vectors (e.g., wavefunction coefficients), equivariant matrix operators (e.g., Hamiltonians, density matrices), and grid-based fields (e.g., wavefunctions, charge densities). For many physical quantities, obtaining one format allows the computation of its equivalent representation in the others (a minimal numpy sketch follows the list). For instance:
- Once wavefunction coefficients (equivariant vectors) are obtained, they can be combined with the basis set to compute real-space wavefunctions (grid-based fields).
- Once the density matrix (equivariant matrix operator) is available, it can be converted to real-space charge density (grid-based fields).
- Conversely, real-space charge density can be used to derive coefficients expanded on a given basis set (equivariant vectors).
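These conversions reduce to linear algebra once the basis functions are tabulated on a grid. A minimal numpy sketch of the three cases above (array names and shapes are our assumptions):

```python
import numpy as np

def field_from_coeffs(coeffs, basis):
    """Equivariant vector -> grid field: psi(r) = sum_i c_i * phi_i(r).
    coeffs: (nbasis,), basis: (nbasis, npoints) tabulated on the grid."""
    return coeffs @ basis

def density_from_dm(dm, basis):
    """Equivariant matrix -> grid field:
    rho(r) = sum_{mu,nu} D_{mu,nu} * phi_mu(r) * phi_nu(r)."""
    return np.einsum("mn,mx,nx->x", dm, basis, basis)

def coeffs_from_field(field, basis, dv):
    """Grid field -> expansion coefficients via the projection <phi_i|f>,
    assuming an orthonormal basis (otherwise solve with the overlap matrix)."""
    return basis @ field * dv
```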
Current machine learning methods are often tied to one specific data format, while the formats produced by electronic structure software vary between methods; plane-wave-based DFT codes, for example, struggle to output equivariant vectors or matrix operators in LCAO basis sets. Enabling conversions between formats therefore makes machine learning methods compatible with a broader range of electronic structure software.
Additionally, data manipulation can help users visualize output data when using machine learning models, making it easier to analyze and interpret physical phenomena.
<center><img src="https://dp-public.oss-cn-beijing.aliyuncs.com/community/dftio/dftio4.png" width="100%" /></center>
## Projects
DFTIO is now open-source in the DeepModeling community. We welcome you to use it or contribute to its development.
DFTIO project repository: https://github.com/deepmodeling/dftio
