wfc output on GPU failed #5969

@Cstandardlib

Describe the bug

When trying to output wavefunctions (wfc) on GPU devices, the run crashes with a segmentation fault instead of writing the results:

[dell-Precision-7920-Tower:4070746] *** Process received signal ***
[dell-Precision-7920-Tower:4070746] Signal: Segmentation fault (11)
[dell-Precision-7920-Tower:4070746] Signal code: Address not mapped (1)
[dell-Precision-7920-Tower:4070746] Failing at address: (nil)
[dell-Precision-7920-Tower:4070747] *** Process received signal ***
[dell-Precision-7920-Tower:4070747] Signal: Segmentation fault (11)
[dell-Precision-7920-Tower:4070747] Signal code: Address not mapped (1)
[dell-Precision-7920-Tower:4070747] Failing at address: (nil)
[dell-Precision-7920-Tower:4070747] [ 0] [dell-Precision-7920-Tower:4070746] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7066c4842520]
[dell-Precision-7920-Tower:4070746] *** End of error message ***
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x75c5f3242520]
[dell-Precision-7920-Tower:4070747] *** End of error message ***
[dell-Precision-7920-Tower:4070748] *** Process received signal ***
[dell-Precision-7920-Tower:4070748] Signal: Segmentation fault (11)
[dell-Precision-7920-Tower:4070748] Signal code: Address not mapped (1)
[dell-Precision-7920-Tower:4070748] Failing at address: (nil)
[dell-Precision-7920-Tower:4070748] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7373a6242520]
[dell-Precision-7920-Tower:4070748] *** End of error message ***
[dell-Precision-7920-Tower:4070745] *** Process received signal ***
[dell-Precision-7920-Tower:4070745] Signal: Segmentation fault (11)
[dell-Precision-7920-Tower:4070745] Signal code: Address not mapped (1)
[dell-Precision-7920-Tower:4070745] Failing at address: (nil)
[dell-Precision-7920-Tower:4070745] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7a163d842520]
[dell-Precision-7920-Tower:4070745] *** End of error message ***

Expected behavior

Setting out_wfc_pw = 1 works well, but setting out_wfc_r = 1 raises the error above.

In source/module_esolver/esolver_ks_pw.cpp:

    //------------------------------------------------------------------
    // 4) transfer data from GPU to CPU in pw basis
    // a question: the wavefunctions have been output, then the data transfer occurs? mohan 20250302
    //------------------------------------------------------------------
    if (this->device == base_device::GpuDevice)
    {
        castmem_2d_d2h_op()(this->psi[0].get_pointer() - this->psi[0].get_psi_bias(),
                            this->kspw_psi[0].get_pointer() - this->kspw_psi[0].get_psi_bias(),
                            this->psi[0].size());
    }

    //------------------------------------------------------------------
    // 3) output wavefunctions in pw basis
    //------------------------------------------------------------------
    if (PARAM.inp.out_wfc_pw == 1 || PARAM.inp.out_wfc_pw == 2)
    {
        std::stringstream ssw;
        ssw << PARAM.globalv.global_out_dir << "WAVEFUNC";
        ModuleIO::write_wfc_pw(ssw.str(), this->psi[0], this->kv, this->pw_wfc);
    }

Even when we swap these two parts (copy and output), out_wfc_r still fails.
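The ordering question raised in the code comment (copy to the host first, then write) can be illustrated with a minimal sketch in plain C++. Everything here is hypothetical: `HostBuffer`, `copy_device_to_host`, `write_wfc`, and `demo` are stand-ins invented for illustration, not ABACUS types or functions, and no real GPU is involved. The sketch only shows a writer that refuses to read a host buffer that has not been filled yet, so that a missing copy surfaces as an exception rather than as a crash like the `Failing at address: (nil)` in the log. It does not claim to identify why out_wfc_r fails.

```cpp
#include <complex>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <vector>

// Hypothetical host-side wavefunction buffer (not ABACUS's psi class).
struct HostBuffer
{
    std::vector<std::complex<double>> data;
    bool synced = false; // true once device data has been copied to the host
};

// Stand-in for a device-to-host copy; here it is just a vector copy.
void copy_device_to_host(const std::vector<std::complex<double>>& device, HostBuffer& host)
{
    host.data = device;
    host.synced = true;
}

// Writes the coefficients as "real imag" lines and returns how many were written.
// Throws instead of reading a buffer that was never synchronized.
std::size_t write_wfc(const HostBuffer& host, std::ostream& os)
{
    if (!host.synced || host.data.empty())
    {
        throw std::runtime_error("host wavefunction buffer is not ready");
    }
    for (const auto& c : host.data)
    {
        os << c.real() << " " << c.imag() << "\n";
    }
    return host.data.size();
}

// Usage: copy first, then write.
std::size_t demo()
{
    const std::vector<std::complex<double>> device = {{1.0, 0.0}, {0.0, -1.0}};
    HostBuffer host;
    copy_device_to_host(device, host);
    std::ostringstream out;
    return write_wfc(host, out);
}
```

Calling `write_wfc` before `copy_device_to_host` throws `std::runtime_error`. A similar guard at the entry of a real-space output routine might help narrow down which pointer is null, assuming the crash comes from reading an unpopulated host buffer.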

To Reproduce

No response

Environment

No response

Additional Context

No response

Task list for Issue attackers (only for developers)

  • Verify the issue is not a duplicate.
  • Describe the bug.
  • Steps to reproduce.
  • Expected behavior.
  • Error message.
  • Environment details.
  • Additional context.
  • Assign a priority level (low, medium, high, urgent).
  • Assign the issue to a team member.
  • Label the issue with relevant tags.
  • Identify possible related issues.
  • Create a unit test or automated test to reproduce the bug (if applicable).
  • Fix the bug.
  • Test the fix.
  • Update documentation (if necessary).
  • Close the issue and inform the reporter (if applicable).

Metadata

Assignees: No one assigned

Labels:

  • Bugs (bugs that are only solvable with sufficient knowledge of DFT)
  • Input&Output (suitable for coders without knowing too many DFT details)
