@dzzz2001
Collaborator

Background

The original cal_gint_vlocal function had each thread keep its own full copy of the HR matrix to avoid data races, which could lead to high memory usage. This PR replaces the per-thread copies with locks to prevent data races, which reduces memory consumption at the cost of some computational efficiency. Testing shows that cal_gint_vlocal may run approximately 10% slower as a result.
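The trade-off described above can be illustrated with a minimal sketch. This is not the ABACUS implementation: `Contribution`, `accumulate_private_copies`, and `accumulate_locked` are hypothetical names, the HR matrix is reduced to a flat `std::vector<double>`, and a single OpenMP `critical` section stands in for whatever locking scheme the PR actually uses. Without `-fopenmp` the pragmas are ignored and both functions simply run serially.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one unit of work: add `value` to hr[index].
struct Contribution {
    std::size_t index;
    double value;
};

// Strategy 1 (per-thread copies): each thread accumulates into its own
// full-size buffer, and the buffers are merged at the end.
// Memory scales with (number of threads) * n.
std::vector<double> accumulate_private_copies(std::size_t n,
                                              const std::vector<Contribution>& work)
{
    std::vector<double> hr(n, 0.0);
#pragma omp parallel
    {
        std::vector<double> local(n, 0.0); // one full copy per thread
#pragma omp for
        for (long i = 0; i < static_cast<long>(work.size()); ++i) {
            const Contribution& c = work[static_cast<std::size_t>(i)];
            local[c.index] += c.value; // no race: buffer is thread-private
        }
#pragma omp critical(hr_merge)
        {
            for (std::size_t k = 0; k < n; ++k) {
                hr[k] += local[k];
            }
        }
    }
    return hr;
}

// Strategy 2 (shared matrix plus lock): all threads write into one buffer,
// and each update is serialized. Memory stays at n, but threads may wait
// on each other.
std::vector<double> accumulate_locked(std::size_t n,
                                      const std::vector<Contribution>& work)
{
    std::vector<double> hr(n, 0.0);
#pragma omp parallel for
    for (long i = 0; i < static_cast<long>(work.size()); ++i) {
        const Contribution& c = work[static_cast<std::size_t>(i)];
#pragma omp critical(hr_update)
        {
            hr[c.index] += c.value;
        }
    }
    return hr;
}
```

Both functions produce the same sums. The second one keeps only a single buffer regardless of thread count, which is the memory saving this PR is after, while the serialized update is where the extra run time would come from.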

Perf comparison

Below are the comparison results of testing the si128 example under tests/performance/ on my computer.

| Configuration | Old version (s) | New version (s) |
|---|---|---|
| OMP=1 mpi=4 | 59.93 | 63.95 |
| OMP=2 mpi=4 | 32.07 | 36.39 |
| OMP=4 mpi=4 | 26.34 | 25.91 |

@dzzz2001 dzzz2001 requested review from dyzheng and mohanchen March 27, 2025 14:12
@mohanchen mohanchen added the Refactor Refactor ABACUS codes label Mar 28, 2025
@mohanchen mohanchen merged commit 2fead84 into deepmodeling:develop Mar 28, 2025
14 checks passed
Fisherd99 pushed a commit to Fisherd99/abacus-BSE that referenced this pull request Mar 31, 2025
…ing#6069)

* reduce the memory consumption of cal_gint_vlocal

* fix a bug