Fix bug in dsp compute #6433

A-006 · 2025-08-04T09:04:26Z

Linked Issue

Fix #6429

What's changed?

Fixed a bug that occurred when the code was compiled for DSP but certain functions still used the CPU FFT. In the examples 01_PW/057_PW_SO_IW, 01_PW, 01_PW/105_PW_W90, and 109_PW_PBE0, CPU FFT functions were called even though the device was set to DSP, which caused segmentation faults in the recip2real and real2recip functions. These functions now use the DSP device instead of the CPU device.

Fixed a bug in zgemm that could cause a segmentation fault.
Fixed a bug in the sdft computation.

Refactor

This update impacts seven modules: cal_ldos, cal_mkedf, wf_lcao, wf2rho, wannier90, overlap_pw, and stress_func_exx.
DSP Acceleration Support (6 modules): Refactored cal_ldos, cal_mkedf, wf_lcao, wf2rho, wannier90, and overlap_pw to enable DSP-accelerated computations, and implemented RAII-managed accelerator kernel allocation to reduce frequent malloc operations.
GPU Device Support (1 module): Added native GPU device support to the stress_func_exx module.
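
To make the RAII point concrete, here is a minimal sketch of a buffer that owns accelerator memory and reallocates only when a larger size is requested. It is not the code from this PR: `device_malloc`, `device_free`, and `DeviceBuffer` are hypothetical names, and the allocator simply wraps `std::malloc` so the sketch can run on any host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-ins for an accelerator allocator (assumption: the real
// DSP runtime has its own malloc/free pair). Counters let us observe calls.
static int g_alloc_calls = 0;
static int g_free_calls = 0;

void* device_malloc(std::size_t bytes) { ++g_alloc_calls; return std::malloc(bytes); }
void device_free(void* p) { ++g_free_calls; std::free(p); }

// RAII owner of a device buffer: memory is released in the destructor and
// reused across calls as long as the requested size fits the capacity.
template <typename T>
class DeviceBuffer
{
  public:
    DeviceBuffer() = default;
    ~DeviceBuffer() { release(); }

    // Non-copyable: exactly one owner per allocation.
    DeviceBuffer(const DeviceBuffer&) = delete;
    DeviceBuffer& operator=(const DeviceBuffer&) = delete;

    // Returns a pointer to at least n elements, growing only when needed.
    T* reserve(std::size_t n)
    {
        if (n > capacity_)
        {
            release();
            data_ = static_cast<T*>(device_malloc(n * sizeof(T)));
            capacity_ = n;
        }
        return data_;
    }

    std::size_t capacity() const { return capacity_; }

  private:
    void release()
    {
        if (data_ != nullptr)
        {
            device_free(data_);
            data_ = nullptr;
            capacity_ = 0;
        }
    }

    T* data_ = nullptr;
    std::size_t capacity_ = 0;
};

// Simulates a loop that needs scratch space of a different size each step.
int count_allocations_with_reuse()
{
    g_alloc_calls = 0;
    g_free_calls = 0;
    {
        DeviceBuffer<double> scratch;
        const std::size_t sizes[] = {64, 32, 64, 128, 16};
        for (std::size_t n : sizes)
        {
            double* p = scratch.reserve(n);
            for (std::size_t i = 0; i < n; ++i) { p[i] = 0.0; }
        }
    } // scratch goes out of scope here and frees its memory
    return g_alloc_calls;
}
```

In this sketch, five requests result in only two allocations (for 64 and 128 elements), and no explicit free call is needed at the call site.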

Attention

The recip_to_real template function is designed to be compatible with floating-point types and device types, enabling a heterogeneous framework. Moving forward, if device-specific or parameter-type requirements arise, we can directly utilize this templated function for seamless execution.
For pw_basis and pw_basis_k, template parameters are used to determine whether functions operate in real or reciprocal space. This approach clearly indicates whether calculations are performed in real or reciprocal space, improving code readability.
The main drawback is increased complexity in invocation, which demands higher programming expertise. When writing templated code, if the FFT function’s template parameter is T but operates on complex at runtime, programmers must fully understand the context. Type mismatches will be caught at compile time, preventing runtime errors.
The DSP FFT appears not to support grids whose number of points is a power of two (2^n grid points).
The DSP FFT can only be parallelized over k-points (kpar); MPI cannot be used to parallelize the gamma-only case.
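
To illustrate the templated design described in the notes above, here is a minimal sketch of a function parameterized on both floating-point type and device. It is an assumption-laden illustration, not the actual ABACUS signature: `DeviceCPU`, `DeviceDSP`, `FftBackend`, and `recip_to_real_sketch` are hypothetical names, and the "transform" is a plain copy standing in for a real FFT call.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical device tags; the real code base defines its own device types.
struct DeviceCPU {};
struct DeviceDSP {};

// Hypothetical backend selector. Each specialization would wrap the FFT
// library for that device; here it only reports which backend was chosen.
template <typename Device>
struct FftBackend;

template <>
struct FftBackend<DeviceCPU>
{
    static const char* name() { return "cpu"; }
};

template <>
struct FftBackend<DeviceDSP>
{
    static const char* name() { return "dsp"; }
};

// Sketch of a reciprocal-to-real transform templated on precision and device.
// The template parameter is the real type FPTYPE, while the data are
// std::complex<FPTYPE>, which is the subtlety the notes warn about.
template <typename FPTYPE, typename Device>
const char* recip_to_real_sketch(const std::complex<FPTYPE>* in,
                                 std::complex<FPTYPE>* out,
                                 std::size_t n)
{
    static_assert(std::is_floating_point<FPTYPE>::value,
                  "FPTYPE must be float or double");
    // Placeholder for the backend FFT call: copy input to output.
    for (std::size_t i = 0; i < n; ++i) { out[i] = in[i]; }
    return FftBackend<Device>::name();
}

// The device is fixed at the call site, so a DSP build cannot silently
// fall back to the CPU path inside this function.
bool demo_dispatch()
{
    std::vector<std::complex<float>> in(4, std::complex<float>(1.0f, 0.0f));
    std::vector<std::complex<float>> out(4);
    const char* used
        = recip_to_real_sketch<float, DeviceDSP>(in.data(), out.data(), in.size());
    // Passing std::complex<double> buffers with FPTYPE = float would be
    // rejected by the compiler rather than failing at runtime.
    return std::string(used) == "dsp" && out[0] == in[0];
}
```

The design choice shown here is that both the precision and the device are part of the function's type, so a mismatch between the requested device and the buffers passed in surfaces at compile time.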

Flying-dragon-boxing · 2025-08-15T07:19:49Z

stress_func_exx currently doesn't support heterogeneous computing. I'll fix this later.

source/source_basis/module_pw/module_fft/fft_dsp.cpp

source/source_esolver/esolver_ks_pw.cpp

A-006 · 2025-08-28T09:14:30Z

stress_func_exx currently doesn't support heterogeneous computing. I'll fix this later.

I have changed how FFT is used in the EXX implementation. Please remember to update the GPU code for EXX accordingly.

mohanchen · 2025-08-29T01:03:43Z

After discussion with @A-006, I cannot accept this modification, which causes too many new defects. We have decided to close this PR.

A-006 and others added 6 commits August 1, 2025 18:39

fix bug in dsp compute

8d6cbdf

Merge branch 'develop' into dsp1

1bee98c

add convulution for veff

dedc0e1

add change on pw_basis_k

40e73fd

revert convulution

848a52a

Merge branch 'deepmodeling:develop' into dsp1

684d868

mohanchen added the Bugs label (Bugs that only solvable with sufficient knowledge of DFT) Aug 6, 2025

A-006 and others added 6 commits August 6, 2025 10:07

free float alp and bet

aa7cab5

Merge branch 'develop' into dsp1

5a0c3e8

fix bug in dsp sdft

c6d3987

fix bug in dsp compute

ea9a258

fix bug in compute

d7589cd

add change for the gemm

c752b6e

Merge branch 'develop' into dsp1

26cd3f4

mohanchen reviewed Aug 20, 2025

source/source_basis/module_pw/module_fft/fft_dsp.cpp (resolved)

source/source_esolver/esolver_ks_pw.cpp (outdated, resolved)

A-006 and others added 3 commits August 20, 2025 20:18

Merge branch 'develop' into dsp1

6d15e1e

Merge branch 'develop' into dsp1

a879761

fix use way in exx_func

df73fb2

A-006 added 2 commits August 28, 2025 17:39

change func name

6da1f14

change func name

e0fcc85

mohanchen closed this Aug 29, 2025


3 participants
