Commit 533ca12

dyzheng, claude, and mohanchen authored
Feature: use new format of CSR file for out_mat_hs2 and out_mat_ds, out_mat_t, out_mat_r, out_mat_xc2 (Useful Information for output format update of H(R), S(R) and other matrices that based on NAO basis set) (#6991)
* Feature: use new format of CSR file for out_mat_hs2
* feature: modify out_mat_ds, out_mat_t, out_mat_r, out_mat_xc2 for new CSR format
* fix: UT of input
* Fix: input parameter yaml has been updated
* Refactor: add Per-spin HContainer wrappers for nspin=2
* fix: enable HR comparison in tests and update CSR reference files

  Enable the previously commented-out H(R) matrix comparison in catch_properties.sh and regenerate all CSR reference files for scf_out_hsr, scf_out_hsr_spin4, and nscf_out_hsr_tr_rr to match the new CSR output format. Add missing hrs1_nao.csr.ref for nscf test.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use C++11 compatible unique_ptr::reset instead of std::make_unique

  The CI build uses C++11 where std::make_unique is not available.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: update md_out_syns reference files for CSR format changes

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* update the reference of md_out_syns with gnu compiler
* try to decrease the size of output files
* fix: replace C++17 structured bindings with C++11-compatible code in write_HS_R.cpp

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* change init_chg from file to hr in test case
* add unit tests for Output_HContainer consistency and init_chg=hr error handling

  - Add write-read round-trip consistency test for Output_HContainer/Read_HContainer
  - Add sparse threshold filtering, precision parameter, and nspin=2 tests
  - Add clear error message when HR files missing for init_chg=hr
  - Update CMakeLists.txt with new test targets

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: write empty R-blocks in CSR output to keep nR consistent

  When all matrix elements of an R-vector are below sparse_threshold, Output_HContainer skipped writing that R-block entirely, but the file header still declared the full nR count from size_R_loop(). This caused csrFileReader to hit EOF when reading HR files with sparse R-blocks (e.g. init_chg=hr), while DM files were unaffected because all R-vectors had nonzero elements. Also made csr_reader more robust by skipping comment/empty lines instead of hardcoding 9 readLine() calls for the CSR format block.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Test: set two Si test cases to H2O cases
* Fix: multi-process error of read_hcontainer
* Fix: md_out_syns test case
* Test: change CI test files
* Test: add threshold for case md_out_syns
* Fix: memory leak of libxc and cal_sync precision
* Fix: allow gamma_only with cal_sync
* Fix: error of test case

---------

Co-authored-by: dyzheng <zhengdy@bjaisi.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Mohan Chen <mohanchen@pku.edu.cn>
1 parent 6d1d4fa commit 533ca12

62 files changed

Lines changed: 3438 additions & 6863 deletions

Some content is hidden

docs/advanced/elec_properties/hs_matrix.md

Lines changed: 32 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -26,24 +26,47 @@ The rest of the file contains the upper triangular part of the specified matrice
2626

2727
The output of $H(R)$ and $S(R)$ matrices is controlled by the keyword [out_mat_hs2](../input_files/input-main.md#out_mat_hs2). This functionality is not available for gamma_only calculations. To generate such matrices for gamma only calculations, users should turn off [gamma_only](../input_files/input-main.md#gamma_only), and explicitly specify that gamma point is the only k point in the KPT file.
2828

29-
For single-point SCF calculations, if nspin = 1 or nspin = 4, two files `hrs1_nao.csr` and `sr_nao.csr` are generated, which contain the Hamiltonian matrix $H(R)$ and overlap matrix $S(R)$ respectively. For nspin = 2, three files `hrs1_nao.csr` and `hrs2_nao.csr` and `sr_nao.csr` are created, where the first two files correspodn to $H(R)$ for spin up and spin down, respectively.
29+
### Output Format
3030

31-
Each file or each section of the appended file starts with three lines, the first gives the current ion/md step, the second gives the dimension of the matrix, and the last indicates how many different `R` are in the file.
31+
The H(R) and S(R) matrices are output in standard Compressed Sparse Row (CSR) format, matching the format used by `out_dmr`.
3232

33-
The rest of the files are arranged in blocks. Each block starts with a line giving the lattice vector `R` and the number of nonzero matrix elements, such as:
33+
For single-point SCF calculations:
34+
- **nspin = 1 or nspin = 4**: Two files `hrs1_nao.csr` and `srs1_nao.csr` are generated, containing the Hamiltonian matrix $H(R)$ and overlap matrix $S(R)$ respectively.
35+
- **nspin = 2**: Three files `hrs1_nao.csr`, `hrs2_nao.csr`, and `srs1_nao.csr` are created, where the first two files correspond to $H(R)$ for spin up and spin down, respectively.
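As a minimal sketch of the naming rule above, the files to expect for a given `nspin` can be listed in Python. The helper name `expected_hs_files` is hypothetical and not part of ABACUS; it only encodes the rule stated in this section and does not inspect any output directory.

```python
def expected_hs_files(nspin):
    """Return the H(R)/S(R) file names described above for a given nspin.

    Hypothetical helper for illustration only: it encodes the naming rule
    from the documentation and does not read any ABACUS output.
    """
    if nspin in (1, 4):
        h_files = ["hrs1_nao.csr"]
    elif nspin == 2:
        # One H(R) file per spin channel: spin up, then spin down.
        h_files = ["hrs1_nao.csr", "hrs2_nao.csr"]
    else:
        raise ValueError("nspin must be 1, 2, or 4")
    # A single overlap file is written in every case.
    return h_files + ["srs1_nao.csr"]


print(expected_hs_files(2))
```

A post-processing script could use such a list to check that all expected files exist in `OUT.${suffix}` before trying to parse them.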
3436

37+
### File Structure
38+
39+
Each file starts with a header:
3540
```
36-
-3 1 1 1020
41+
--- Ionic Step 1 ---
42+
# print H matrix in real space H(R)
43+
1 # number of spin directions
44+
1 # spin index
45+
100 # number of localized basis
46+
50 # number of Bravais lattice vector R
47+
48+
[UnitCell information]
49+
50+
#----------------------------------------------------------------------#
51+
# CSR Format #
52+
...
53+
0 0 0 5
54+
# CSR values
55+
1.234e-01 2.345e-02 ...
56+
# CSR column indices
57+
0 5 10 ...
58+
# CSR row pointers
59+
0 3 7 ...
3760
```
3861

39-
which means there are 1020 nonzero elements in the (-3,1,1) cell.
62+
The CSR format stores a sparse m × n matrix M in row form using three arrays (values, column indices, row pointers). According to Wikipedia:
4063

41-
If there is no nonzero matrix element, then the next block starts immediately on the next line. Otherwise, there will be 3 extra lines in the block, which gives the matrix in CSR format. According to Wikipedia:
64+
- The arrays **values** and **column indices** are of length NNZ (number of nonzero entries), and contain the non-zero values and the column indices of those values respectively.
65+
- The array **row pointers** is of length m + 1 and encodes the index where each row starts. The last element is NNZ.
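The relationship between the three arrays can be illustrated with a short Python sketch that expands one CSR block into a dense matrix. This is not the ABACUS reader: the function name `csr_to_dense` and the 3 × 4 example data are made up for illustration, and zero-based indices are assumed.

```python
def csr_to_dense(values, col_indices, row_pointers, n_cols):
    """Expand CSR arrays (zero-based indices assumed) into a dense list of rows."""
    n_rows = len(row_pointers) - 1  # row_pointers has length m + 1
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # Entries of row i occupy positions row_pointers[i] .. row_pointers[i + 1] - 1.
        for k in range(row_pointers[i], row_pointers[i + 1]):
            dense[i][col_indices[k]] = values[k]
    return dense


# Made-up 3 x 4 example with NNZ = 4 (not taken from any ABACUS output).
values = [5.0, 8.0, 3.0, 6.0]
col_indices = [0, 1, 2, 1]
row_pointers = [0, 1, 2, 4]  # last element equals NNZ

dense = csr_to_dense(values, col_indices, row_pointers, n_cols=4)
for row in dense:
    print(row)
```

For real files, a sparse-matrix library is usually a better choice than a dense expansion; the loop above is only meant to show how the row pointers slice the values and column indices.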
4266

43-
The CSR format stores a sparse m × n matrix M in row form using three (one-dimensional) arrays (V, COL_INDEX, ROW_INDEX). Let NNZ denote the number of nonzero entries in M. (Note that zero-based indices shall be used here.)
67+
### Precision Control
4468

45-
- The arrays V and COL_INDEX are of length NNZ, and contain the non-zero values and the column indices of those values respectively.
46-
- The array ROW_INDEX is of length m + 1 and encodes the index in V and COL_INDEX where the given row starts. This is equivalent to ROW_INDEX[j] encoding the total number of nonzeros above row j. The last element is NNZ , i.e., the fictitious index in V immediately after the last valid index NNZ - 1.
69+
Use `out_mat_hs2 1 12` to output with 12-digit precision (default is 8).
4770

4871
For calculations involving ionic movements, the output frequency of the matrix is controlled by [out_freq_ion](../input_files/input-main.md#out_freq_ion) and [out_app_flag](../input_files/input-main.md#out_app_flag).
4972

docs/advanced/input_files/input-main.md

Lines changed: 17 additions & 16 deletions
@@ -994,8 +994,9 @@
994994
### pw_diag_nmax
995995

996996
- **Type**: Integer
997+
- **Availability**: *basis_type==pw, ks_solver==cg/dav/dav_subspace/bpcg*
997998
- **Description**: Only useful when you use ks_solver = cg/dav/dav_subspace/bpcg. It indicates the maximal iteration number for cg/david/dav_subspace/bpcg method.
998-
- **Default**: 40
999+
- **Default**: 50
9991000

10001001
### pw_diag_ndim
10011002

@@ -1934,12 +1935,12 @@
19341935

19351936
### out_mat_hs2
19361937

1937-
- **Type**: Boolean
1938+
- **Type**: Boolean \[Integer\](optional)
19381939
- **Availability**: *Numerical atomic orbital basis (not gamma-only algorithm)*
19391940
- **Description**: Whether to print files containing the Hamiltonian matrix and overlap matrix into files in the directory OUT.${suffix}. For more information, please refer to hs_matrix.md.
19401941

19411942
> Note: In the 3.10-LTS version, the file names are data-HR-sparse_SPIN0.csr and data-SR-sparse_SPIN0.csr, etc.
1942-
- **Default**: False
1943+
- **Default**: False [8]
19431944
- **Unit**: Ry
19441945

19451946
### out_mat_tk
@@ -1954,42 +1955,42 @@
19541955

19551956
### out_mat_r
19561957

1957-
- **Type**: Boolean
1958+
- **Type**: Boolean \[Integer\](optional)
19581959
- **Availability**: *Numerical atomic orbital basis (not gamma-only algorithm)*
1959-
- **Description**: Whether to print the matrix representation of the position matrix into a file named rr.csr in the directory OUT.${suffix}. If calculation is set to get_s, the position matrix can be obtained without scf iterations. For more information, please refer to position_matrix.md.
1960+
- **Description**: Whether to print the matrix representation of the position matrix into files named rxrs1_nao.csr, ryrs1_nao.csr, rzrs1_nao.csr in the directory OUT.${suffix}. If calculation is set to get_s, the position matrix can be obtained without scf iterations. For more information, please refer to position_matrix.md.
19601961

19611962
> Note: In the 3.10-LTS version, the file name is data-rR-sparse.csr.
1962-
- **Default**: False
1963+
- **Default**: False 8
19631964
- **Unit**: Bohr
19641965

19651966
### out_mat_t
19661967

1967-
- **Type**: Boolean
1968+
- **Type**: Boolean \[Integer\](optional)
19681969
- **Availability**: *Numerical atomic orbital basis (not gamma-only algorithm)*
19691970
- **Description**: Generate files containing the kinetic energy matrix. The format will be the same as the Hamiltonian matrix and overlap matrix as mentioned in out_mat_hs2. The name of the files will be trs1_nao.csr and so on. Also controled by out_freq_ion and out_app_flag.
19701971

19711972
> Note: In the 3.10-LTS version, the file name is data-TR-sparse_SPIN0.csr.
1972-
- **Default**: False
1973+
- **Default**: False 8
19731974
- **Unit**: Ry
19741975

19751976
### out_mat_dh
19761977

1977-
- **Type**: Boolean
1978+
- **Type**: Integer
19781979
- **Availability**: *Numerical atomic orbital basis (not gamma-only algorithm)*
19791980
- **Description**: Whether to print files containing the derivatives of the Hamiltonian matrix. The format will be the same as the Hamiltonian matrix and overlap matrix as mentioned in out_mat_hs2. The name of the files will be dhrxs1_nao.csr, dhrys1_nao.csr, dhrzs1_nao.csr and so on. Also controled by out_freq_ion and out_app_flag.
19801981

19811982
> Note: In the 3.10-LTS version, the file name is data-dHRx-sparse_SPIN0.csr and so on.
1982-
- **Default**: False
1983+
- **Default**: 0 8
19831984
- **Unit**: Ry/Bohr
19841985

19851986
### out_mat_ds
19861987

1987-
- **Type**: Boolean
1988+
- **Type**: Boolean \[Integer\](optional)
19881989
- **Availability**: *Numerical atomic orbital basis (not gamma-only algorithm)*
1989-
- **Description**: Whether to print files containing the derivatives of the overlap matrix. The format will be the same as the overlap matrix as mentioned in out_mat_dh. The name of the files will be dsrxs1.csr and so on. Also controled by out_freq_ion and out_app_flag. This feature can be used with calculation get_s.
1990+
- **Description**: Whether to print files containing the derivatives of the overlap matrix. The format will be the same as the overlap matrix as mentioned in out_mat_dh. The name of the files will be dsxrs1_nao.csr and so on. Also controled by out_freq_ion and out_app_flag. This feature can be used with calculation get_s.
19901991

19911992
> Note: In the 3.10-LTS version, the file name is data-dSRx-sparse_SPIN0.csr and so on.
1992-
- **Default**: False
1993+
- **Default**: False 8
19931994
- **Unit**: Ry/Bohr
19941995

19951996
### out_mat_xc
@@ -2004,12 +2005,12 @@
20042005

20052006
### out_mat_xc2
20062007

2007-
- **Type**: Boolean
2008+
- **Type**: Boolean \[Integer\](optional)
20082009
- **Availability**: *Numerical atomic orbital (NAO) basis*
2009-
- **Description**: Whether to print the exchange-correlation matrices in numerical orbital representation: in CSR format in the directory OUT.s.
2010+
- **Description**: Whether to print the exchange-correlation matrices in numerical orbital representation: in CSR format in the directory OUT.${suffix}. The name of the files will be vxcrs1_nao.csr and so on.
20102011

20112012
> Note: In the 3.10-LTS version, the file name is Vxc_R_spin$s and so on.
2012-
- **Default**: False
2013+
- **Default**: False 8
20132014
- **Unit**: Ry
20142015

20152016
### out_mat_l

docs/advanced/interface/TB2J.md

Lines changed: 4 additions & 2 deletions
@@ -122,15 +122,17 @@ After the key parameter `out_mat_hs2` is turned on, the Hamiltonian matrix $H(R)
122122
suffix Fe
123123
```
124124

125-
specifies the suffix of the output, in this calculation, we set the path to the directory of the DFT calculation, which is the current directory (".") and the suffix to Fe.
125+
specifies the suffix of the output, in this calculation, we set the path to the directory of the DFT calculation, which is the current directory (".") and the suffix to Fe.
126+
127+
> **Note (ABACUS v3.9.0.25+):** Starting from ABACUS v3.9.0.25, the output format has changed to standard CSR format with filenames `hrs1_nao.csr`, `hrs2_nao.csr` (for nspin=2), and `srs1_nao.csr`. The parameter `out_mat_hs2` now supports optional precision control: `out_mat_hs2 1 8` (default 8 digits). TB2J v0.9.0+ is required to read the new format. For older TB2J versions, please use ABACUS v3.8.x or earlier.
126128
127129
#### 2. Perform TB2J calculation:
128130

129131
```bash
130132
abacus2J.py --path . --suffix Fe --elements Fe --kmesh 7 7 7
131133
```
132134

133-
This first read the atomic structures from th `STRU` file, then read the Hamiltonian and the overlap matrices stored in the files named starting from `data-HR-*` and `data-SR-*` files. It also read the fermi energy from the `OUT.Fe/running_scf.log` file.
135+
This first reads the atomic structures from the `STRU` file, then reads the Hamiltonian and overlap matrices. For ABACUS v3.9.0.25+, the matrices are stored in `hrs1_nao.csr`, `hrs2_nao.csr` (nspin=2), and `srs1_nao.csr` files. For older versions, they are in `data-HR-*` and `data-SR-*` files. It also reads the fermi energy from the `OUT.Fe/running_scf.log` file.
134136

135137
With the command above, we can calculate the $J$ with a $7 \times 7 \times 7$ k-point grid. This allows for the calculation of exchange between spin pairs between $7 \times 7 \times 7$ supercell. Note: the kmesh is not dense enough for a practical calculation. For a very dense k-mesh, the `--rcut` option can be used to set the maximum distance of the magnetic interactions and thus reduce the computation cost. But be sure that the cutoff is not too small.
136138

docs/advanced/interface/deeph.md

Lines changed: 7 additions & 1 deletion
@@ -16,7 +16,13 @@ The first stage is during the data preparation phase, where we need to run a ser
1616
out_mat_hs2 1
1717
```
1818

19-
Files named data-HR-sparse_SPIN`${x}`.csr and data-SR-sparse_SPIN`${x}`.csr will be generated, which contain the Hamiltonian and overlap matrices respectively in csr format. `${x}` takes value of 0 or 1, based on the spin component. More details on this keyword can be found in the [list of input keywords](../input_files/input-main.md#out_mat_hs2).
19+
**For ABACUS v3.9.0.25+:** Files named `hrs1_nao.csr`, `hrs2_nao.csr` (for nspin=2), and `srs1_nao.csr` will be generated in `OUT.${suffix}/` directory, containing the Hamiltonian and overlap matrices in standard CSR format. You can optionally specify precision: `out_mat_hs2 1 8` (default 8 digits).
20+
21+
**For ABACUS v3.8.x and earlier:** Files named `data-HR-sparse_SPIN${x}.csr` and `data-SR-sparse_SPIN${x}.csr` will be generated, where `${x}` takes value of 0 or 1 based on the spin component.
22+
23+
> **Note:** DeepH v1.0.0+ is required to read the new CSR format from ABACUS v3.9.0.25+. For older DeepH versions, please use ABACUS v3.8.x or earlier.
24+
25+
More details on this keyword can be found in the [list of input keywords](../input_files/input-main.md#out_mat_hs2).
2026

2127
The second stage is during the inference phase. After DeepH training completes, we can apply the model to predict the Hamiltonian on other systems. For that purpose, we also need the overlap matrices from the new systems, but no SCF calculation is required.
2228
