@Critsium-xy
Collaborator

Initializing the DSP hardware requires GlobalV, but I forgot to include its header in the previous PR. This PR fixes that omission.

@mohanchen added the Refactor (Refactor ABACUS codes) and GPU & DCU & HPC (GPU and DCU and HPC related any issues) labels Sep 10, 2025
@mohanchen merged commit a7255f2 into deepmodeling:develop Sep 10, 2025
14 checks passed
Wuming-HUST pushed a commit to Wuming-HUST/abacus-develop that referenced this pull request Sep 12, 2025
@Critsium-xy deleted the fix_dsp branch September 15, 2025 06:33
mohanchen added a commit that referenced this pull request Sep 20, 2025
…w. (#6490)

* Feature: add DFT-1/2 and shell DFT-1/2, currently only support PW esolver_ks_pw.

* Added Sep, Sep_Cell, and VSep to organize the self-energy potential of
DFT-1/2

* Added a new effective potential pot_sep for calculating the
self-energy potential

* Added initialization of the self-energy potential in the esolver_ks_pw
control flow

* Added the keyword SEP_FILES in the STRU file for reading self-energy
potential files

* Added the dfthalf_type keyword in INPUT to enable DFT-1/2 and shell
DFT-1/2

* Fix: Compilation error in DeepKS unit tests after adding DFT-1/2

* Fix: Add the additional files to Makefile.Objects

* Build(deps): Bump actions/setup-python from 5 to 6 (#6492)

Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5 to 6.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](actions/setup-python@v5...v6)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [Refactor] Move hardware initializer out from esolver code (#6494)

* Move hardware initializer out from esolver

* Remove useless codes

* Remove finalize code out

* Feature: support NVTX profiling via timer_enable_nvtx flag (#6495)

* Feature: support NVTX profiling via timer_enable_nvtx flag
Signed-off-by:Tianxiang Wang<[email protected]>, Contributed under MetaX Integrated Circuits (Shanghai) Co., Ltd.

* Add timer_enable_nvtx section in markdown
Signed-off-by:Tianxiang Wang<[email protected]>, Contributed under MetaX Integrated Circuits (Shanghai) Co., Ltd.

* Fix: Use __USE_NVTX macro to avoid NVTX linking errors in tests.
Clarify in docs that timer_enable_nvtx parameter only takes effect on CUDA platforms.
Signed-off-by:Tianxiang Wang<[email protected]>, Contributed under MetaX Integrated Circuits (Shanghai) Co., Ltd.

* Perf: Optimize Davidson by fusing operators, offloading CPU computation to GPU, and reducing memory transfers (#6493)

* Perf: Optimize Diago_DavSubspace with GPU operators by adding and fusing custom kernels.
Signed-off-by:Tianxiang Wang<[email protected]>, Contributed under MetaX Integrated Circuits (Shanghai) Co., Ltd.

* Perf: reduce memory allocation and copy in Diago_DavSubspace::diag_zhegvx
Signed-off-by:Tianxiang Wang<[email protected]>, Contributed under MetaX Integrated Circuits (Shanghai) Co., Ltd.

* Perf: Replace loop-based 2D copy and memset with memcpy_2d_op, memset_2d_op
Signed-off-by:Tianxiang Wang<[email protected]>, Contributed under MetaX Integrated Circuits (Shanghai) Co., Ltd.

* Perf: use warp reduce instead of shared memory for better efficiency
Signed-off-by:Tianxiang Wang<[email protected]>, Contributed under MetaX Integrated Circuits (Shanghai) Co., Ltd.

* Fix compilation error
Signed-off-by:Tianxiang Wang<[email protected]>, Contributed under MetaX Integrated Circuits (Shanghai) Co., Ltd.
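
The "replace loop-based 2D copy and memset" item above can be pictured with a small host-side sketch. This is not the ABACUS `memcpy_2d_op` or `memset_2d_op` interface, whose signatures are not shown in this conversation; `copy_2d` and `zero_2d` are hypothetical names, and the row-major layout with element strides is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper (not the ABACUS memcpy_2d_op signature).
// Copies `rows` rows of `cols` elements each. Consecutive rows start
// `dst_stride` / `src_stride` elements apart in memory.
template <typename T>
void copy_2d(T* dst, std::size_t dst_stride,
             const T* src, std::size_t src_stride,
             std::size_t rows, std::size_t cols)
{
    static_assert(std::is_trivially_copyable<T>::value, "memcpy needs trivially copyable T");
    assert(cols <= dst_stride && cols <= src_stride);
    for (std::size_t r = 0; r < rows; ++r)
    {
        // One contiguous copy per row replaces an element-wise inner loop.
        std::memcpy(dst + r * dst_stride, src + r * src_stride, cols * sizeof(T));
    }
}

// Hypothetical helper: zero-fills the same kind of strided sub-block.
template <typename T>
void zero_2d(T* dst, std::size_t dst_stride, std::size_t rows, std::size_t cols)
{
    static_assert(std::is_trivially_copyable<T>::value, "memset needs trivially copyable T");
    assert(cols <= dst_stride);
    for (std::size_t r = 0; r < rows; ++r)
    {
        std::memset(dst + r * dst_stride, 0, cols * sizeof(T));
    }
}
```

A device version would presumably hand the whole strided block to a single runtime call or kernel rather than loop on the host; the sketch only shows the interface shape that makes such a replacement possible.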

* Fix: resolve compile error with USE_ELPA=OFF + BUILD_TESTING=ON and switch to nvtx3 headers when CUDA_VERSION >= 12090 (#6497)

* Fix: switch to nvtx3 headers when CUDA_VERSION >= 12090
Signed-off-by:Tianxiang Wang<[email protected]>, Contributed under MetaX Integrated Circuits (Shanghai) Co., Ltd.

* Fix: resolve compile error with USE_ELPA=OFF + BUILD_TESTING=ON
Signed-off-by:Tianxiang Wang<[email protected]>, Contributed under MetaX Integrated Circuits (Shanghai) Co., Ltd.

* Fix dsp compilation problem (#6499)

* Fix: Fix crash in Debug build with multi-GPU due to forced cudaSetDevice(0) (#6498)

Signed-off-by:Tianxiang Wang<[email protected]>, Contributed under MetaX Integrated Circuits (Shanghai) Co., Ltd.
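
The commit title above names the cause (a forced `cudaSetDevice(0)`) but the conversation does not show the fix. One common way to avoid pinning every process to device 0 is to derive the device index from the node-local rank; the sketch below assumes that approach, and `pick_device` is a hypothetical helper, not ABACUS code.

```cpp
#include <cassert>

// Hypothetical helper: maps a node-local process rank to a device index.
// Assumes ranks on one node are numbered 0, 1, 2, ... and that
// device_count is the number of visible devices on that node.
inline int pick_device(int local_rank, int device_count)
{
    assert(local_rank >= 0 && device_count > 0);
    // Round-robin assignment: ranks beyond device_count share devices.
    return local_rank % device_count;
}
```

In a CUDA build the result would be passed to `cudaSetDevice` once per process, so that debug-only code paths do not override the device chosen at startup.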

* Removed the temporary variable DMRGint_full when transitioning from 2D block parallelism to serial in Hcontainer(develop) (#6489)

* delete tem Hcontainer to reduce memory usage

* simplify the compute code

* change DM2D_tmp to dm2d_tmp, use vector instead of new
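
The "use vector instead of new" change above follows a familiar pattern, sketched here under assumptions: only the name `dm2d_tmp` comes from the commit message, while `sum_scaled` and its arithmetic are hypothetical stand-ins for whatever the real code does with the temporary buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical function; only the buffer name dm2d_tmp is taken from
// the commit message. It scales a copy of the input and sums the result.
inline double sum_scaled(const double* dm, std::size_t n, double factor)
{
    assert(dm != nullptr || n == 0);
    // Before (manual ownership):
    //   double* DM2D_tmp = new double[n]; ... delete[] DM2D_tmp;
    // After: the vector releases its storage on every return path.
    std::vector<double> dm2d_tmp(dm, dm + n);
    double total = 0.0;
    for (double& x : dm2d_tmp)
    {
        x *= factor;
        total += x;
    }
    return total;
}
```

The vector version removes the need for a matching `delete[]` and stays leak-free if an exception or early return is added later.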

* Update version to 3.9.0.14 (#6504)

* Refactor: Remove the GlobalC from sep_cell and vsep_cell

* Removed GlobalC::sep_cell and GlobalC::vsep_cell from GlobalC

* Integrated sep_cell into UnitCell

* Integrated vsep_cell into esolver_ks_pw

* Added empty constructors and destructors for Sep_Pot and Sep_Cell to
facilitate unit testing compilation

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Critsium <[email protected]>
Co-authored-by: Tianxiang Wang <[email protected]>
Co-authored-by: zgn-26714 <[email protected]>
Co-authored-by: Erjie Wu <[email protected]>
Co-authored-by: Mohan Chen <[email protected]>
kluonj pushed a commit to kluonj/abacus-develop that referenced this pull request Sep 28, 2025
kluonj pushed a commit to kluonj/abacus-develop that referenced this pull request Sep 28, 2025
…w. (deepmodeling#6490)